Running Head: Co-occurrence and Ambiguity Resolution

Lexical Co-occurrence and Ambiguity Resolution

Jeffrey Witzel1,2 and Kenneth Forster2

1 University of Texas at Arlington
2 University of Arizona

Corresponding Author: Jeffrey Witzel Department of Linguistics and TESOL 132 Hammond Hall Box 19559 Arlington, TX 76019-0559 USA Fax #: 817-272-2731 Phone #: 817-272-3133 [email protected]

Published in Language, Cognition and Neuroscience 29:2 (2014), 158-185. doi 10.1080/01690965.2012.748925

Abstract
This study investigates the possible influence of lexical co-occurrence on lexical ambiguity resolution in sentence contexts. Lexical co-occurrence refers to similarity between the co-occurrence vectors of words, such that if two words have similar profiles of occurrence with other words, they are said to have a strong co-occurrence relationship. The present study examines whether lexical ambiguity resolution can be biased by the prior presentation of a word that shares a strong co-occurrence relationship with an ambiguous word under one of its meanings, despite the absence of plausibility support. Two “maze” word-by-word reading experiments examined highly implausible/anomalous sentences with balanced homographs. In sentences in which the ambiguous word (e.g., bat) was preceded by a biasing word with which it shares a strong co-occurrence relationship (e.g., umpire), (i) response times (RTs) to the ambiguous word were facilitated, and (ii) garden-path effects were found when subsequent (disambiguating) information was incongruent with the biased meaning (e.g., The umpire tried to swallow the bat but its wings got stuck in his throat). Additional experiments showed that these biasing effects resist explanation in terms of a passive process of spreading activation. Furthermore, an eye-tracking experiment revealed a pattern of results comparable to that of the maze task experiments, indicating that these effects are not artifacts of the maze procedure. These results are taken to support a heuristic for lexical ambiguity resolution that is driven by relatively low-level intralexical connections based on lexical co-occurrence.

Acknowledgements
We would like to thank Stacey Claspill, Leslie Darnell, Katherine Plattner, and Devin St. John for assisting with data collection on the eye-tracking experiment. Preliminary reports on this research were presented at the International Conference on the Mental Lexicon (Banff, Canada, October 2008) and the CUNY Conference on Human Sentence Processing (Davis, CA, March 2009).

A major question in psycholinguistics is how the appropriate meaning is selected for an ambiguous word in a sentence (for comprehensive reviews, see Gorfein, 1989, 2001; Simpson, 1984, 1994; Small, Cottrell, & Tanenhaus, 1988). Indeed, it is remarkable how efficiently this meaning selection is accomplished. The apparent ease of this process is underscored by the fact that we are often unaware that more than one meaning is available for a word -- which likely accounts for the surprise we experience when confronted with these multiple meanings in puns and other wordplay. Of course, all models of lexical ambiguity resolution hold that the context leading up to the ambiguous word influences the meaning selection process. To date, much of the debate regarding these context effects has focused on how ambiguous words are accessed. Specifically, the emphasis has been on the time course of these effects, with the idea that this information can be used to test between the following models of meaning access: (i) the autonomous access model, under which the meanings of ambiguous words are accessed exhaustively by a modular lexical processor (Onifer & Swinney, 1981; Seidenberg, Tanenhaus, Leiman, & Bienkowski, 1982; Swinney, 1979; Tanenhaus, Leiman, & Seidenberg, 1979); (ii) the selective access model, which holds that top-down information can be used to access only the appropriate meaning of an ambiguous word (Simpson, 1981; Tabossi, Colombo, & Job, 1987; Tabossi, 1988; Van Petten & Kutas, 1987); and (iii) the reordered access model, which posits exhaustive access to the meanings of ambiguous words, with context influencing their relative levels of activation/availability (Binder & Rayner, 1998; Dopkins, Morris, & Rayner, 1992; Duffy, Kambe, & Rayner, 2001; Duffy, Morris, & Rayner, 1988; Rayner & Duffy, 1986; Sheridan, Reingold, & Daneman, 2009). Less attention has been focused on the question of precisely how the preceding context exerts its influence.
A complete account of lexical ambiguity resolution should of course specify the mechanism(s) whereby prior context influences meaning selection. Consider the models sketched above. The autonomous access model holds that preceding context plays a role in meaning selection only after semantic retrieval has been completed. Under this model, it would appear necessary to compute all possible interpretations of the sentence and then select the meaning that fits with the most plausible of these interpretations. However, since most words are ambiguous to some degree, this could involve a
rather large number of possible interpretations that must be compared for plausibility -- so many that one begins to doubt whether this process could be rapid enough for on-line comprehension, and one is strongly tempted to find another mechanism that might simplify the process. The issue of how prior context exerts its influence is also important for the selective and reordered access models. Again, under both of these accounts, prior context automatically pre-activates the appropriate meaning of an ambiguous word such that it can become unambiguous (in the selective access model) or functionally unambiguous (in the reordered access model) under certain conditions. While these approaches correspond well with subjective experience, they are incomplete without clear specification of the actual mechanism that is responsible for meaning pre-activation. The purpose of this study is to present evidence for a relatively simple computational mechanism that could achieve ambiguity resolution online. Specifically, we propose that all meanings of an ambiguous word are initially activated (consistent with both the autonomous access model and reordered access model), with subsequent selection being triggered by a fast-acting, low-level heuristic based on intralexical connections. Furthermore, based on evidence from semantic categorization experiments (Forster, 2006; Forster & Hector, 2002), we argue that these connections are based on lexical co-occurrence statistics, i.e., information about distributional similarities among words. In support of this proposal, we show that meaning selection can be biased by the prior presentation of a word that shares distributional similarities with an ambiguous word under one of its meanings, even in the absence of overall plausibility support for this meaning in the sentence context. 
The language processing system must take into consideration how well a given meaning of an ambiguous word fits into the mental model that is generated for a sentence (or larger discourse) at some stage of comprehension. However, this may be a relatively late process, a final check, so to speak. Prior to this process, we propose that there is a purely heuristic device that attempts to select the most likely meaning based on information from a level well below that of the interpretation of the sentence as a whole. The lower-level cues to meaning of particular interest in the present study are those provided by connections between ambiguous words and words in the preceding context. The influence of such cues was first noted in a set of cross-modal priming experiments by Seidenberg and colleagues (Seidenberg et
al., 1982). In this study, immediately after listening to clauses like Although the farmer bought the straw, participants were facilitated in their naming of contextually appropriate targets (hay), but not in their naming of contextually inappropriate targets (sip) (Seidenberg et al., 1982, Experiment 2). This selective facilitation was attributed to the semantic/associative connection between the ambiguous word under one of its meanings (straw under its ‘hay’ meaning) and a word in the immediately preceding context (farmer). In contrast, ambiguous words presented in contexts that were constraining by virtue of syntactic restrictions (They bought a rose vs. They all rose) or global plausibility/pragmatic considerations (You should have played the spade vs. Go to the store and buy a spade) facilitated the naming of targets related to both their contextually appropriate and inappropriate meanings. Seidenberg and colleagues interpreted these findings to indicate that whereas other contextual constraints allow for exhaustive access to the meanings of ambiguous words, contexts containing words that are semantically or associatively related to an ambiguous word under one of its meanings can lead to selective meaning access -- particularly if the related words occur close together. They further maintained that selective meaning access under these conditions is consistent with modular approaches to lexical ambiguity processing because the meaning of the ambiguous word can be derived from intralexical associations, or from relations among words that are stored in the lexical module itself (Forster, 1979; for comparable interpretations of these findings, see Prather & Swinney, 1988; Tanenhaus & Lucas, 1987). It has since been noted, however, that it is unclear whether these intralexical connections actually prevent access to the unbiased meaning or simply facilitate the meaning selection process (Tanenhaus & Lucas, 1987). 
Another indication of the possible influence of lexical-level connections on ambiguity resolution comes from a cross-modal priming study by Tabossi (1988; see also Tabossi, 1987). In this study, when an ambiguous word was preceded by a pragmatically constraining context (The man had to be at five o’clock at the port for a very important meeting), priming was found for lexical decision targets related to both its contextually appropriate (sea) and inappropriate (liqueur) meanings; however, when it was preceded by a semantically constraining context (The violent hurricane did not damage the ships which were in the port, one of the best equipped along the coast), there was priming only for targets related to
the contextually appropriate meaning. These results were interpreted to indicate that selective access can be obtained when context highlights characteristic semantic features of an ambiguous word’s dominant meaning. However, it is important to note that in semantically constraining sentences, the ambiguous word was often preceded by related words (e.g., ships). It is therefore difficult to determine the extent to which this selective facilitation effect was driven by highlighted semantic features or by lexical connections, as in the Seidenberg et al. (1982) study discussed above. These early findings have since been extended such that connections among lexical items figure prominently in a range of models of lexical ambiguity resolution (see e.g., Hirst, 1988; Prather & Swinney, 1988). Furthermore, these connections also provide the (implicit) basis for context manipulations in many experiments related to the processing of ambiguous words (see e.g., Dopkins et al., 1992; Forster, Guerrera, & Eliot, 2009; Reder, 1983; Sheridan et al., 2009). The present study sought to build on these indications of the importance of intralexical connections and to offer an explanation of how lexical ambiguity resolution -- or at least a first pass at meaning selection -- might occur in a modular language processing system (see e.g., Fodor, 1983; Forster, 1979). One way that these connections might be represented is in terms of distributional similarities between words -- information that a lexical processing module could have access to (and could keep track of) as it presumably does for other relevant distributional information, such as word frequency. The basis for this idea is provided by the "Links" model (Forster, 2006; Forster & Hector, 2002), a recent attempt to explain how the lexical processor can have access to information about likely semantic properties very early in the recognition process. 
An interesting test of the influence of these distributionally-based connections is whether they can bias meaning selection in the absence of plausibility support. It is important to note that no study to date has examined whether intralexical connections can influence lexical ambiguity resolution independently of overall plausibility. The sentences tested in the Seidenberg et al. (1982) study described above are illustrative of the potential confound between these sources of contextual information. With reference again to the example item Although the farmer bought the straw, while there is a clear semantic/associative link between farmer and straw under its ‘hay’ meaning, it is
also the case that a farmer is much more likely to buy some hay than a drinking straw. It therefore remains unclear whether the apparent influence of intralexical connections on meaning selection is actually due to some interaction of this factor with plausibility. In order to address this issue, the present study investigated whether intralexical connections can bias ambiguity resolution even when these connections do not lead to more plausible interpretations. Evidence for the influence of such connections could be taken to support a heuristic for lexical ambiguity resolution that relies on the kind of low-level, intralexical links that might be provided by distributional similarities between words.

Lexical co-occurrence and models of lexical semantics
In order to properly describe this model of lexical ambiguity resolution (as well as the experiments designed to test this model), it is first necessary to define what is meant by "distributional similarities" between words. Several recent models of the lexical semantic system, including Latent Semantic Analysis (LSA; Landauer & Dumais, 1997), the Hyperspace Analogue to Language (HAL; Lund & Burgess, 1996), and the Correlated Occurrence Analogue to Lexical Semantics (COALS; Rohde, Gonnerman, & Plaut, 2005), have attempted to derive word meanings and specify the semantic relationships among words through the use of lexical co-occurrence information. Under these models, lexical co-occurrence roughly refers to the distributional similarity among words -- or to how often words occur with similar sets of other words. More specifically, these models represent the meanings of words with high-dimensional vectors based on the co-occurrence of words in unlabeled corpora. Each vector essentially denotes how often a word occurs in a specified window with every other word in the corpora. The semantic relationship between words is then established by calculating some measure of distance or similarity between their respective vectors (where less distance/more similarity indicates a closer semantic relationship). Although the LSA, HAL, and COALS models rely on the same underlying logic sketched above, these models differ in terms of how word vectors are established and how distance measures are calculated (see Rohde et al., 2005, for more on the similarities and differences among these models). More importantly, these models have been shown to differ in terms of (i) how well their
respective distance/similarity measures correspond to actual semantic similarity ratings and (ii) how well these measures relate to decisions on multiple choice vocabulary tests (in which correct answer choices are semantically similar to targets). While the LSA and HAL models have been shown to perform variably on these empirical tests, the COALS model has been shown to perform consistently well (Rohde et al., 2005). For this reason, the COALS model is of particular interest in the present study. In this model, word vectors are sets of weighted co-occurrence values established within a ramped 4-word window on a 1.2-billion-word corpus. The ramped 4-word window weights words that occur closer to the word of interest more highly in its co-occurrence values. That is, words that occur adjacent to the word of interest (either to the left or right) contribute most to this value, whereas words that are four words displaced contribute the least. Words that fall outside of this window do not contribute at all to the co-occurrence calculation. The corpus used for this model consists of largely unfiltered text from Usenet newsgroups -- the same data source used in the HAL model. An important feature of the COALS system -- and, in fact, one of the key differences between this model and HAL -- is that only open-class words are included in its co-occurrence counts. This property of the COALS system has two effects: (i) It removes the undue influence of highly frequent words on its co-occurrence calculations, and (ii) it provides vectors that more accurately reflect lexical semantics, as opposed to some combination of lexical semantics and syntactic category. Finally, under the COALS model, semantic similarity is computed in terms of correlations between vectors. If there is a strong correlation between the vectors of two words, these words are considered semantically similar.
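The ramped-window counting and correlation-based similarity described above can be illustrated with a minimal sketch. It is not the COALS implementation: the linear weights (4 for an adjacent word down to 1 for a word four positions away), the dropping of closed-class words before windowing, and the function names are all assumptions made for illustration, and any normalization that the actual model applies to its counts is omitted.

```python
from collections import defaultdict
from math import sqrt


def ramped_cooccurrence(tokens, window=4, closed_class=frozenset()):
    """Weighted co-occurrence counts within a ramped window.

    Assumption: weights fall linearly from `window` (adjacent word) to 1
    (word `window` positions away); words further away contribute nothing.
    Assumption: closed-class words are dropped before counting.
    """
    tokens = [t for t in tokens if t not in closed_class]
    counts = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for offset in range(1, window + 1):
            weight = window - offset + 1
            for j in (i - offset, i + offset):  # left and right neighbours
                if 0 <= j < len(tokens):
                    counts[word][tokens[j]] += weight
    return counts


def vector_correlation(u, v, vocab):
    """Pearson correlation between two sparse count vectors over `vocab`."""
    n = len(vocab)
    if n == 0:
        return 0.0
    xs = [u.get(w, 0.0) for w in vocab]
    ys = [v.get(w, 0.0) for w in vocab]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined for a constant vector; 0.0 is a placeholder
    return cov / sqrt(var_x * var_y)
```

On this sketch, two words count as distributionally similar when `vector_correlation` over a shared vocabulary is high, which is the sense in which a biasing word such as umpire is taken to be related to bat.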

Lexical co-occurrence, links, and ambiguity resolution
Lexical co-occurrence, as defined above, has contributed to a number of psycholinguistic investigations (see e.g., Burgess & Lund, 1997; Foltz, Kintsch, & Landauer, 1998; Kintsch, 2000; see Rohde et al., 2005, for more on the applications of these models). For its part, the COALS model has recently been invoked to explain several curious findings from semantic categorization experiments, the collection of which is referred to as the “turple effect” (Forster, 2006). In these experiments, inflated
decision times are obtained for nonwords that are close neighbors of category exemplars -- for example, turple in an ANIMAL categorization task -- but not for nonwords that are neighbors of non-exemplars -- for example, tabric in the same task. (For similar results with word neighbors of exemplars, e.g., leotard in an ANIMAL categorization task, see Rodd, 2004.) Interestingly, this effect does not appear to be influenced by the number of neighbors of the nonword, as would be the case in lexical decision. Furthermore, this effect obtains only if the category of interest is a relatively small, “natural” category such as ANIMAL. If a very broad category such as PHYSICAL OBJECT is used, then there is interference for nonword neighbors of both exemplars (e.g., himmer) and non-exemplars (e.g., travity) (Forster, 2006; see also Pecher, Zeelenberg, & Wagenmakers, 2005). These findings have been interpreted in terms of the “Links” model (Forster, 2006; Forster & Hector, 2002). The increase in decision time for turple in ANIMAL categorization presumably occurs because this nonword activates the lexical entry for turtle as a candidate. This activation then forces the lexical system to execute a post-access spelling check to determine whether the input was in fact turtle, which would require a “Yes” response. As for nonwords like tabric in the same task (or for turple with a different category), the system apparently “knows” that this spelling check is not necessary. That is, it recognizes that even if the input were really fabric, a “No” response would still be required. Therefore, even at a very early stage of processing, some information about meaning seems to be available. This information also appears to come for free. If the difference between turple and tabric was the result of activating the semantic properties of turtle and fabric, then tabric should take longer to respond to than a nonword that does not closely resemble a real word, such as firtan.
But this is not the case (Forster, 2006). The Links model explains this set of findings by positing that each lexical entry has a pointer, or “link”, to each semantic field that it belongs to. It is these links, rather than the full (or even partial) semantic properties of lexical items, that are available during the initial stage of word recognition, when inputs are orthographically matched to candidate lexical entries. If a candidate lexical entry has a link to a semantic field that corresponds to the category of interest, then a more careful check of the input letter string will be initiated. On the other hand, if a candidate entry lacks a link to an appropriate semantic field, then it
can simply be skipped. Given this model, it is not surprising that this selective verification can occur only when the category of interest is a relatively circumscribed, “natural” semantic category (like ANIMAL). This is because semantic fields that are too wide-ranging (e.g., PHYSICAL OBJECT) or hopelessly ad hoc (e.g., BIGGER THAN A BRICK) would be unlikely to exist and, thus, links to these fields would not be available. The crucial assumption here is that the semantic fields to which these links point are established on the basis of lexical co-occurrence patterns. These fields can be conceived of as clusters containing meaning entries for words with relatively highly correlated co-occurrence vectors. But with very broad categories, the co-occurrence profiles are too disparate to form any coherent pattern. One role that these links might play in language processing is to assist in lexical ambiguity resolution. That is, upon encountering an ambiguous word in a sentence, the language processor could initiate a first pass at meaning selection by searching for a “matching” link from a word in the prior context. This procedure might work as follows: When an ambiguous word is first accessed, its lexical entry is found to contain links to distinct clusters of meaning entries for words with tightly correlated co-occurrence vectors -- i.e., to distinct semantic fields. The links associated with the previous words in the sentence are then searched in an attempt to find another link to one of these clusters. When a matching link is found, the meaning of the ambiguous word in that cluster is activated and made available to higher-order comprehension systems, while its other meanings remain relatively dormant. It should be noted that this is a non-selective model of lexical ambiguity resolution in the sense that context (i.e., one or more matching links) exerts its effects only after the ambiguous word’s multiple meanings have been indexed.
It should also be emphasized that this is a heuristic method only, and would not be guaranteed to be correct every time. However, in combination with other heuristics like picking the most frequently used meaning (Binder & Rayner, 1998; Dopkins et al., 1992; Duffy et al., 1988; Forster & Bednall, 1976; Forster et al., 2009; Hogaboam & Perfetti, 1975; Rayner & Duffy, 1986; Rayner & Frazier, 1989; Simpson, 1981; Simpson & Burgess, 1985; Sheridan et al., 2009; Tabossi et al., 1987), this method might achieve a fairly high success rate.
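The link-matching procedure described above can be sketched as a simple lookup. This is an illustration only, not a specification of the Links model: the semantic-field labels, the data structures, the most-recent-first search order, and the `fallback` argument (standing in for a heuristic such as picking the most frequent meaning) are hypothetical choices made for the example.

```python
def select_meaning(meaning_links, context_words, word_links, fallback=None):
    """First-pass meaning selection by matching links.

    meaning_links: maps each meaning of the ambiguous word to the semantic
        field (cluster) that its link points to.
    word_links: maps each word to a list of the semantic fields it links to.
    Assumption: previous words are searched most recent first.
    """
    field_to_meaning = {field: m for m, field in meaning_links.items()}
    for word in reversed(context_words):
        for field in word_links.get(word, []):
            if field in field_to_meaning:
                # Matching link found: this meaning is made available;
                # the other meanings stay dormant.
                return field_to_meaning[field]
    # No matching link: defer to another heuristic, e.g. meaning frequency.
    return fallback


# Hypothetical links for the example item used in this study.
word_links = {"umpire": ["BASEBALL"], "nurse": ["MEDICINE"]}
bat = {"baseball bat": "BASEBALL", "animal bat": "ANIMAL"}

biased = select_meaning(bat, ["The", "umpire", "tried", "to", "swallow", "the"], word_links)
control = select_meaning(bat, ["The", "nurse", "tried", "to", "swallow", "the"], word_links)
```

Here `biased` is the baseball meaning, because umpire and bat share a link, whereas `control` is `None`, because nurse offers no matching link. Note that plausibility plays no part in the lookup, which is the property that the experiments below are designed to test.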

EXPERIMENT 1
A seemingly straightforward prediction of this Links-based heuristic is that online meaning selection for an ambiguous word should be biased by the prior presentation of a word that is distributionally similar to the ambiguous word under one of its meanings, even in the absence of plausibility support for that meaning. That is, this heuristic should operate independently of higher-order considerations of overall plausibility. Experiment 1 tested this prediction. The ambiguous words of interest were balanced (often also called equibiased or unbiased) homographs. These words were tested so that differences in relative meaning frequency (RMF) would not cancel out any meaning selection bias based on common lexical links. The sentences that contained these words were designed such that they would not provide plausibility support for a given meaning of the ambiguous word when this word first appeared. This was accomplished by making the sentence context up to and including the ambiguous word (under any of its meanings) semantically incongruous (i.e., wholly implausible or anomalous). This was done in order to allow for a more accurate indication of the independent influence of intralexical connections on ambiguity resolution. The experiment thus examined sentences like the following:
(1a) biased-congruent: The umpire tried to swallow the bat but its handle got stuck in his throat.
(1b) biased-incongruent: The umpire tried to swallow the bat but its wings got stuck in his throat.
(1c) control-congruent: The nurse tried to swallow the bat but its handle got stuck in her throat.
(1d) control-incongruent: The nurse tried to swallow the bat but its wings got stuck in her throat.
Since umpire and bat have similar co-occurrence profiles (presumably derived from texts devoted to baseball), the hypothesis is that bat in (1a) and (1b) will be interpreted as referring to a baseball bat, despite the implausibility of an umpire trying to swallow one.
In sentences (1c) and (1d), however, the initial noun is nurse, which does not have a similar profile to bat and hence should not bias the
interpretation of this word. If meaning selection is biased in this way, bat should be integrated into the (albeit ridiculous) sentence context in (1a) and (1b) more quickly than in (1c) and (1d). Note also that handle in (1a) is consistent with the biased interpretation of bat, whereas wings in (1b) is inconsistent with this interpretation. Thus, wings in (1b) should force a reevaluation of bat, producing a garden-path effect -- that is, longer response times relative to handle in (1a), all other things being equal. As a control, the same comparison can be made between (1c) and (1d), with the prediction being that a comparable difference should not obtain between the disambiguating words handle and wings in these sentences. In order to measure word-by-word reading times, a variant of self-paced reading was used: the maze task (Forster, 2010; Forster et al., 2009; Nicol, Forster, & Veres, 1997; Witzel, Witzel, & Forster, 2012). In this task, each sentence is presented as a sequence of choices between two alternatives, one of which is a possible continuation of the sentence. In one version of this task, the choice is between a word and a legal nonword, and the participant must choose which of the letter strings is a word as quickly and accurately as possible. This version of the maze task is referred to as the lexicality (L) version, or the L-maze. In another version of this task, both alternatives are words, but only one is a grammatical continuation of the sentence. In this case, the participant's task is to choose which alternative best continues the sentence as quickly and accurately as possible. This version of the maze task is referred to as the grammaticality (G) version, or the G-maze. In both versions of this task, if the subject makes the correct choices throughout the sequence, the selected words form a sentence.
For example, successfully navigating through the sequences of choices in the L-maze and G-maze items shown in Figure 1 yields the sentence The dog quietly chewed on the bone.
(FIGURE 1 ABOUT HERE)
Forster et al. (2009; see also Forster, 2010; Witzel et al., 2012) have argued that the G-maze forces readers into a strictly incremental parsing mode, in which each word must be fully integrated into the developing sentence structure before moving onto the next frame. On the face of it, the L-maze would not seem to require the same incremental integration. In fact, this version of the task could be accomplished on purely lexical grounds, without any reference to sentence structure or meaning. Despite
this apparent difference, similar sentence processing effects have been revealed in both the G-maze and L-maze (see e.g., Forster et al., 2009; Witzel et al., 2012). Therefore, even though the different versions of this task might place different demands on the participants, they nevertheless seem to tap into comparable processing mechanisms. The initial experiment in this study used the L-maze. This version was used in order to get a preliminary indication of how lexical ambiguities are processed in these contexts with a relatively easy task.
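The structure of a maze item, as described above, can be pictured with a small sketch. It is only an illustration of the frame-by-frame logic: the nonword foils below are invented for the example (they are not the items shown in Figure 1), every word is given a foil for simplicity, and the function names are hypothetical.

```python
import random


def build_maze_frames(words, foils, rng):
    """Pair each word with a foil and place the pair on random sides."""
    frames = []
    for word, foil in zip(words, foils):
        pair = [word, foil]
        rng.shuffle(pair)  # correct alternative lands on the left or right
        frames.append({
            "left": pair[0],
            "right": pair[1],
            "correct": "left" if pair[0] == word else "right",
        })
    return frames


def follow_correct_path(frames):
    """Return the string obtained by choosing the correct side every time."""
    return " ".join(frame[frame["correct"]] for frame in frames)


words = "The dog quietly chewed on the bone".split()
foils = ["Tha", "dop", "quiekly", "chewid", "om", "thu", "bome"]  # invented nonwords
frames = build_maze_frames(words, foils, random.Random(0))
sentence = follow_correct_path(frames)
```

In an L-maze the foils are nonwords, as here; in a G-maze they would instead be real words that do not grammatically continue the sentence. In both cases the correct path through the frames recovers the sentence, and the response time at each frame serves as the word-by-word measure.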

Method
Participants. Sixty-four (64) undergraduate psychology students at the University of Arizona participated in the experiment for course credit. In this and all subsequent experiments, participants were native speakers of English.
Materials and design. The experimental items consisted of 48 sets of sentences similar to the example set (1a)-(1d). See Appendix A for the complete list of experimental items. Each set was constructed around (i) an ambiguous word (e.g., bat), (ii) a biasing word (e.g., umpire), and (iii) a control word (e.g., nurse). The biasing word had a similar co-occurrence profile with the ambiguous word, while the control word did not. Sentences in which a biasing word preceded the ambiguous word (as in (1a) and (1b)) will be referred to as biased sentences; sentences in which the control word preceded it (as in (1c) and (1d)) will be referred to as control sentences. The biasing/control word was located toward the beginning of the sentence, and was usually part of a sentence-initial subject noun phrase. The ambiguous word was located mid-sentence (with on average 2.83 words intervening between it and the biasing/control word), and was usually the final word of the first clause. As mentioned above, the sentence context up to and including the ambiguous word (under any of its meanings) was semantically incongruous. This incongruity was created by using inappropriate verb-argument combinations (for comparable manipulations, see e.g., Joseph, Liversedge, Blythe, White, Gathercole, & Rayner, 2008; Rayner, Warren, Juhasz, & Liversedge, 2004). In the example set (1a)-(1d), for instance, this incongruity arose due to a mismatch between the verb and its object -- or more specifically, due to the virtual impossibility of a person swallowing either
an animal bat or a baseball bat. In other items, this incongruity was due to mismatches between the subject and the verb (The plaster/metal jumped....) or among the verb and both of its arguments (The envelope/backpack lectured to the seal....). The material following the ambiguous word was congruent with only one of its meanings. In sentences like (1a), this disambiguating information was congruent with the ambiguous word under the meaning connected to the biasing word. In sentences like (1b), however, the disambiguating information was incongruent with this meaning. These sentences, therefore, correspond to a 2x2 factorial design with the factors bias (biased, control) and congruence (congruent, incongruent). The ambiguous words in these items were roughly equibiased based on the Alberta Homograph Norms (Twilley, Dixon, Taylor, & Clark, 1994). In these norms, the RMF for the different meanings of an ambiguous word is expressed as the proportion of associations connected with each of its meanings. Based on these norms, the meanings of the ambiguous words in congruent sentences [sentence types (1a) and (1c)] had a mean RMF of .38, while the meanings in incongruent sentences [sentence types (1b) and (1d)] had a mean RMF of .44. Although this difference in RMF values was significant, F(1, 47) = 4.95, p < .05, it was deemed acceptable because (if anything) it would work against garden-pathing in sentences like (1b) -- the sentence type on which reanalysis costs were predicted. See Appendix B for a complete list of the ambiguous words used in experimental sentences, along with the RMF values for their relevant meanings. Co-occurrence information was obtained from the COALS web interface (http://dlt4.mit.edu/~dr/COALS/). This site was used to generate the co-occurrence correlations for the biasing words and the control words with the ambiguous words.
In biased sentences like (1a) and (1b), the mean co-occurrence correlation between the biasing and ambiguous words was 0.22 (a highly significant and relatively large correlation). In control sentences like (1c) and (1d), the mean co-occurrence correlation between the control and ambiguous words was -.02. The difference between these correlations was highly significant, F (1, 47) = 174.60, p < .001. Appendix C provides the co-occurrence correlations


for the biasing word and the control word with the ambiguous word for each of the experimental sentence sets. In order to ensure that the experimental task was sensitive to sentence-level processing, 48 filler items were also tested. These items were made up of both grammatical sentence sequences and scrambled sequences (e.g., The happiness climbed the meeting very quickly and usefully. vs. The quickly the usefully meeting very climbed happiness and.). The same manipulation was included in an L-maze experiment carried out by Forster et al. (2009). In that experiment, mean response times were found to be longer for scrambled sequences than for grammatical sequences, indicating that the L-maze was sensitive to sentence-level processing effects. These filler items were also semantically incongruous and were roughly matched with the experimental items in terms of length and complexity.1 The experimental and filler items were distributed into four counterbalanced lists. Each word in the experimental and filler items (except for the first word; see below) was paired with an orthotactically licit nonword of the same length. The pairing of words with nonwords was the same in each of the conditions for the experimental and filler items. Procedure. The experiment was run using DMDX software developed by J.C. Forster and K.I. Forster at the University of Arizona (Forster & Forster, 2003). Items were presented in black letters on a white background. Each item consisted of a series of frames, the first of which was [The …]. Each subsequent frame contained a word and a nonword presented side by side. Participants were instructed to choose the word in each frame as quickly and as accurately as possible by pushing the corresponding left or right button. Correct and incorrect alternatives appeared randomly on the left or the right. If the word was correctly selected, the next frame was displayed immediately. 
If the incorrect alternative was selected, an error message was displayed and a new item was initiated. If the participant made the correct choices throughout the frames for an item, a “CORRECT” message was displayed after the final frame, followed by the beginning of the next item. Participants were told that sometimes the words in a set of frames would make a sentence, but other times they would not. Typical item examples are shown in Figure 2, where each row represents a separate frame. Items were presented in a different random order for each

participant. There were eight blocks of 12 items, with a break between each block. Before the first block, subjects were given eight practice items. (FIGURE 2 ABOUT HERE)
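The distribution of the 48 sentence sets over four counterbalanced lists can be sketched as a Latin-square rotation. The rotation rule below is an assumption (the text states only that four counterbalanced lists were used), and the condition labels simply name sentence types (1a)-(1d).

```python
from collections import Counter

# Sentence types (1a)-(1d); the labels are illustrative names.
CONDITIONS = [
    "biased-congruent",     # (1a)
    "biased-incongruent",   # (1b)
    "control-congruent",    # (1c)
    "control-incongruent",  # (1d)
]

def build_lists(n_items=48, conditions=CONDITIONS):
    """Assign each item set to one condition per list by rotating conditions.

    Assumed scheme: item i appears in condition (i + list index) mod 4, so
    every item is seen once per list and in all four conditions across lists.
    """
    k = len(conditions)
    return [
        {item: conditions[(item + lst) % k] for item in range(n_items)}
        for lst in range(k)
    ]

lists = build_lists()
per_list_counts = [Counter(lst.values()) for lst in lists]
```

Under this scheme each list contains 12 items per condition, and no participant sees the same sentence set in more than one condition.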

Results The dependent variable of interest was RT on frames in both filler and experimental items. RTs to frames on which the subject made an error were discarded, as were RTs shorter than 300 ms and longer than 3000 ms. This resulted in the removal of 3.06% of the data for the measures reported below. Linear mixed-effects models were applied to the RT data for each of the relevant comparisons using the lmer function from the lme4 package in R (Baayen, 2008a, 2008b; Baayen, Davidson, & Bates, 2008; Pinheiro & Bates, 2000; R Development Core Team, 2009). The random-effects structure of these models was systematically examined by fitting models with the full range of random-effects parameter specifications to the data. Likelihood ratio tests were then used to compare among these models. Only the results from the simplest, best-fitting model for each analysis are reported below. In order that the raw RT data would more closely approximate a normal distribution, these data were transformed using a reciprocal transformation prior to analysis. Coefficients with two-tailed p-values less than .05 (based on the z-distribution) were considered statistically significant. For the grammatical and scrambled filler items, the measure of interest was the mean RT over the component frames in each sequence (disregarding the first word, on which the correct response was provided). These mean RTs are presented in Table 1. In this table and in all subsequent tables presenting mean response/reading times, the values are the millisecond equivalents of the means of the transformed data. Linear mixed-effects models were fitted to these RT data, with sequence type (grammatical, scrambled) as a fixed-effect, repeated-measures factor, and with subject and item as random factors. As in Forster et al. 
(2009), there was a strong effect of grammaticality (t = 6.19, p < .001), such that grammatical sequences were responded to much more quickly than scrambled sequences (grammatical: 753 ms, scrambled: 787 ms).
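The trimming and transformation steps described above can be illustrated with a minimal sketch. The 300 ms and 3000 ms cutoffs come from the text; the exact form of the reciprocal transformation is not specified there, so a plain 1/RT is assumed, and the sample RTs are invented.

```python
from statistics import mean

def trim_rts(rts_ms, lo=300, hi=3000):
    """Drop RTs shorter than `lo` ms or longer than `hi` ms."""
    return [rt for rt in rts_ms if lo <= rt <= hi]

def back_transformed_mean(rts_ms):
    """Mean of reciprocal RTs, converted back to its millisecond equivalent.

    Assumes the transformation is 1/RT; the result is the harmonic mean.
    """
    reciprocals = [1.0 / rt for rt in rts_ms]
    return 1.0 / mean(reciprocals)

# Invented RTs: 250 ms and 3500 ms fall outside the cutoffs and are removed.
raw = [250, 500, 1000, 3500]
kept = trim_rts(raw)
ms_equivalent = back_transformed_mean(kept)
```

For the two retained values the millisecond equivalent is about 667 ms, below the arithmetic mean of 750 ms, because the reciprocal scale reduces the weight of slow responses.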

(TABLE 1 ABOUT HERE) Of particular interest in the experimental sentences were the RTs to the ambiguous word (e.g., bat) and the first disambiguating word (e.g., congruent: handle / incongruent: wings) as well as the mean RT over the words in a disambiguating region consisting of the first disambiguating word and the two words immediately following it (e.g., congruent: handle got stuck / incongruent: wings got stuck). It was predicted that the ambiguous word would be integrated more easily into biased sentences, and thus that it would be responded to more quickly in these sentences. For the disambiguating word/region, garden-path effects were predicted when this word/region was incongruent with the biased meaning of the ambiguous word (i.e., in biased-incongruent sentences). In order to evaluate these predictions, linear mixed-effects models were fitted to the RT data for each of the critical words/regions, with bias (biased, control) and congruence (congruent, incongruent) as fixed-effect, repeated-measures factors, and with subject and item as random factors. For this analysis and for all subsequent analyses involving two fixed factors, we first applied a model that included an interaction term between the factors. If the interaction was significant (two-tailed p < .05, based on the z-distribution) or if it approached significance (two-tailed p < .1), the interaction term was retained. Otherwise, it was dropped from the model, and an analysis that included only the main effects of the factors was carried out. Table 2 provides the mean RTs for the critical words/regions in each sentence type. (TABLE 2 ABOUT HERE) At the ambiguous word, the interaction of bias and congruence was not significant (t = 1.21). 
However, the main effect of bias approached significance (t = 1.94, p = .052; main effect of congruence: t = .79), suggesting that ambiguous words in biased sentences were responded to faster than in control sentences [biased: 795 ms, control: 817 ms; i.e., (umpire...) bat < (nurse...) bat]. At the first disambiguating word, despite a trend suggesting inflated RTs in biased-incongruent sentences, the predicted interaction of bias and congruence was not significant (t = 1.17). The main effects of congruence (t = .80) and bias (t = .84) were also not statistically reliable. In the disambiguating region (the first disambiguating word and the two immediately following words), however, the interaction of

bias and congruence was significant (t = 2.06, p = .039), reflecting a distributed, 21 ms garden-path effect for biased-incongruent sentences. In this region, the main effect of bias also approached significance (t = 1.87, p = .061; main effect of congruence: t = 1.34).
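The p-values reported above follow from treating each t statistic as a standard normal deviate, as stated in the analysis procedures. A minimal sketch of that computation, together with the rule for retaining the interaction term, is given below; the function names are illustrative and are not taken from the original analysis scripts.

```python
from math import erfc, sqrt

def two_tailed_p(t):
    """Two-tailed p-value for a statistic evaluated against the z-distribution."""
    return erfc(abs(t) / sqrt(2))

def retain_interaction(t_interaction, threshold=0.1):
    """Keep the interaction term if it is significant or approaches significance."""
    return two_tailed_p(t_interaction) < threshold

# t values reported for Experiment 1
p_bias_ambiguous = two_tailed_p(1.94)    # main effect of bias, ambiguous word
p_interaction_region = two_tailed_p(2.06)  # interaction, disambiguating region
p_bias_region = two_tailed_p(1.87)       # main effect of bias, disambiguating region

keep_at_ambiguous_word = retain_interaction(1.21)  # interaction dropped
keep_in_region = retain_interaction(2.06)          # interaction retained
```

These values agree with the reported p = .052, p = .039, and p = .061 to within rounding.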

Discussion The results of Experiment 1 show that lexical ambiguity resolution can be biased by a prior context containing a word that has a strong co-occurrence relationship with an ambiguous word under one of its meanings. In this way, these findings match well with previous research indicating facilitated meaning access/selection for ambiguous words when the (local) preceding context includes semantically-related or associated words (see above). First, there was a trend suggesting that RTs to the ambiguous word were faster when it was preceded by a biasing word. This result suggests that the ambiguous word was integrated more easily into biased sentences and is consistent with several studies indicating facilitated processing for balanced homographs when they are preceded by constraining contexts (Binder & Morris, 1995; Duffy et al., 1988; Rayner & Frazier, 1989). The influence of the biasing word on meaning selection was also evidenced by the garden-path effect in the disambiguating region of biased-incongruent sentences. Specifically, in biased sentences, the words at and immediately after the first point of disambiguation took longer to select in incongruent sentences, while the opposite pattern was obtained in the relevant control conditions. This effect can be taken to indicate that the biasing word led the comprehension system to select a specific meaning for the ambiguous word, which then needed to be reanalyzed/revised when subsequent disambiguating information was incongruent with this meaning. What is novel and somewhat surprising about these findings is that the biasing word in these sentences appeared to influence the meaning assigned to the ambiguous word even though there was no plausibility support for this meaning. 
This result is of course consistent with a model whereby even in the absence of plausibility support, in the initial stages of recognition for an ambiguous word, low-level intralexical connections can facilitate the meaning selection process. Under the Links-based model proposed above,


this facilitation occurs because a meaning selection heuristic registers links to a common (distributionally based) semantic field for the ambiguous word and (one or more) words in the preceding context.

EXPERIMENT 2 A possible criticism of Experiment 1 relates to the task demands of the L-maze. As mentioned above, this task does not strictly require sentence-level processing. Therefore, although the results of Experiment 1 clearly indicated that sentential properties were taken into account (e.g., the faster responses to grammatical sequences), it might nevertheless be the case that a somewhat passive processing mode was engendered by the task -- a mode in which lexical factors and in particular, relationships between words, might have influenced processing more than in normal sentence comprehension. To address this issue, the same materials were tested again in a G-maze task, where the alternatives are always words, but only one continues the sentence. This task requires a far more detailed analysis of the syntactic and semantic properties of the sentence.

Method Participants. Forty-eight (48) undergraduate psychology students at the University of Arizona participated in the experiment for course credit. Materials and design. The experimental items consisted of the 48 sentence sets used in Experiment 1. See Appendix A for the complete list of experimental items. Forty-eight (48) filler sentences were also included. These were the grammatical filler sentences used in Experiment 1. Minor changes were made to several of the experimental and filler items in order to make the length of these items more consistent. (This was done primarily in anticipation of the eye-tracking experiment, Experiment 5, discussed below.) These changes are indicated in Appendix A. Procedure. The maze task procedures largely followed those in Experiment 1. Each item consisted of a series of frames, the first of which was [The …]. Each subsequent frame contained two words side by side, only one of which was a grammatical continuation of the sentence. The same ungrammatical

alternatives were used across conditions in each experimental sentence set, and the left-right position of the correct word and its ungrammatical alternative was randomly determined. Participants were instructed to choose the word that best continued the sentence as quickly and as accurately as possible by pushing the corresponding left or right button. If the grammatical continuation was correctly selected, the next frame was displayed immediately. If the incorrect alternative was selected, an error message was displayed and a new item was initiated. If the participant made the correct choices throughout the frames for an item, a “CORRECT” message was displayed after the final frame, followed by the beginning of the next item. Typical item examples are shown in Figure 3, where each row represents a separate frame. Items were presented in a different random order for each participant. There were eight blocks of 12 items, with a break between each block. Before the first block, subjects were given eight practice items. (FIGURE 3 ABOUT HERE)
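The abort-on-error logic of the maze procedure described above can be sketched as a small simulation over pre-recorded responses. This is not the DMDX script used in the experiments; the function name, the foil words, and the RTs are hypothetical, and the initial [The …] frame, which involves no choice, is omitted.

```python
def run_maze_item(frames, choices, rts_ms):
    """Simulate one maze item.

    frames:  list of (correct_word, incorrect_alternative) pairs
    choices: the word the participant selected on each frame
    rts_ms:  the response time for each selection

    Returns (recorded, completed): RTs are recorded only for frames answered
    correctly before the first error, and the item ends at the first error.
    """
    recorded = []
    for (correct, _foil), choice, rt in zip(frames, choices, rts_ms):
        if choice != correct:
            return recorded, False  # error message shown, next item begins
        recorded.append((correct, rt))
    return recorded, True  # "CORRECT" message shown after the final frame

# Hypothetical frames following [The ...]; the foils are invented.
frames = [("umpire", "and"), ("tried", "onto"), ("to", "sky")]

aborted, aborted_done = run_maze_item(
    frames, ["umpire", "onto", "to"], [700, 800, 900]
)
finished, finished_done = run_maze_item(
    frames, ["umpire", "tried", "to"], [700, 800, 900]
)
```

Because an item terminates at the first error, later frames in that item contribute no data, which is one reason the proportion of excluded data is reported for each measure.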

Results The analysis procedures followed those used in Experiment 1. Again, RTs to frames on which the subject made an error were discarded, as were RTs shorter than 300 ms and longer than 3000 ms. This resulted in the removal of 2.85% of the data for the measures reported below. As in Experiment 1, the RTs to the ambiguous word (e.g., bat), the first disambiguating word (e.g., congruent: handle / incongruent: wings), and the disambiguating region (the mean RT over the first disambiguating word and the two words immediately following it) were examined. Table 3 provides the mean RTs for the critical words/regions in each sentence type. (TABLE 3 ABOUT HERE) At the ambiguous word, the interaction of bias and congruence (t = .195) and the main effect of congruence (t = .625) were not significant. However, there was a strong effect of bias (t = 4.40, p < .001), with ambiguous words in biased sentences being responded to faster than in control sentences [biased: 894 ms, control: 960 ms; i.e., (umpire...) bat < (nurse...) bat]. At the first disambiguating word, there was also a robust interaction of bias and congruence (t = 5.86, p < .001) that reflects a highly localized, 144

ms garden-path effect for biased-incongruent sentences. At this word, the main effect of bias was also significant (t = 4.75, p < .001), with disambiguating words in biased sentences being responded to faster than in control sentences [biased: 953 ms, control: 986 ms]. The main effect of congruence was not significant (t = 1.51). The same pattern of results held across the disambiguating region. In this region, while the main effect of congruence was not significant (t = .347), both the main effect of bias (t = 2.18, p = .029) and the interaction of bias and congruence (t = 3.36, p < .001) were statistically reliable. These effects are driven by longer RTs to control sentences generally (biased: 1003 ms, control: 1007 ms) and in particular by a 60 ms garden-path effect for biased-incongruent sentences.

Discussion The results of the G-maze experiment (Experiment 2) are comparable to those of the L-maze experiment (Experiment 1). There was clear evidence that the interpretation of the ambiguous word was influenced by the biasing word. Again, RTs to the ambiguous word were faster in biased sentences -- an effect that presumably reflects the facilitated interpretation and integration of this word. In biased-incongruent sentences, there was also a strong garden-path effect at the disambiguating region and, more specifically, at the first disambiguating word. Much of this effect was driven by inflated RTs at the disambiguating word for biased-incongruent sentences relative to their biased-congruent counterparts (the simple effect of congruence for biased sentences: t = 3.11, p = .002). But there also appeared to be an effect in the opposite direction for control sentences. Although this difference between the control sentences was not statistically reliable (the simple effect of congruence for control sentences: t = 1.32), it is nevertheless necessary to consider possible reasons for this trend. Indeed, it is important to note that a comparable pattern of results was also found at the disambiguating region in Experiment 1. One possibility is that, in the absence of biasing information, the more frequent meaning of the ambiguous word tends to be selected. As noted above, the RMF for the meanings of the ambiguous words in incongruent sentences was slightly, but nevertheless statistically reliably higher than that of the meanings in congruent sentences (incongruent RMF: .44 vs. congruent RMF: .38). This difference can likely

account for the apparent default bias toward the meaning of the ambiguous word in incongruent sentences in the absence of other information (e.g., related words and/or global plausibility support) that might otherwise serve to constrain/guide meaning selection (see e.g., Dopkins et al., 1992; Duffy et al., 1988; Forster et al., 2009; Rayner & Duffy, 1986). Although the pattern of results was comparable in Experiments 1 and 2, it is important to note that the predicted biasing effects -- both at the ambiguous word and at disambiguation -- were more robust and localized in the G-maze task than in the L-maze task. As suggested above, this difference can likely be attributed to the nature of these tasks. Although the L-maze might encourage incremental integration into developing sentence representations to some degree, the G-maze task requires a much more active mode of processing. In this way, the G-maze task (arguably) offers a particularly sensitive measure of integration costs, especially for individual words (see e.g., Forster, 2010; Forster et al., 2009; Witzel et al., 2012). In sum, the findings from the G-maze task again indicate that lexical ambiguity resolution can be biased when the (local) preceding context contains a word that has a strong co-occurrence relationship with an ambiguous word under one of its meanings. This is the case even in the absence of plausibility support for the biased meaning. Under the Links-based model proposed in this paper, these biasing effects are attributed to a heuristic that exploits distributional similarities between words to facilitate a post-access meaning selection process for ambiguous words. There is, however, another possible explanation for these results, namely that a passive form of lexical priming facilitates the recognition of the ambiguous word and thus the ambiguity resolution process. 
That is, it might be argued that the context effects apparent in these experiments are produced by an automatic spread of activation (Collins & Loftus, 1975) from the representation of the biasing word to that of the related ambiguous word, giving that word (under its appropriate meaning) a "head start" in the word recognition process. It is important to note that this spreading-activation-based explanation is consistent with previous accounts of how intralexical connections might influence meaning access/selection in a modular lexical processor (Prather


& Swinney, 1988; Seidenberg et al., 1982; for further discussion of such accounts, see Dopkins et al., 1992; Duffy et al., 1988). Perhaps the most straightforward way to investigate this alternative account is to run a semantic priming experiment using the biasing and control words as primes, and the ambiguous words as targets. If the ambiguous word (bat) is responded to faster when it is preceded by the biasing word (umpire) compared with the control word (nurse), one might be tempted to conclude that a spreading-activation-based priming interpretation of the biasing effects in Experiments 1 and 2 is a viable alternative (for a comparable line of reasoning, see Dopkins et al., 1992). But it would be surprising if this result were not obtained. Indeed, Lund and colleagues (Lund, Burgess, & Atchley, 1995) found that a semantic priming effect is produced if the prime and target have similar co-occurrence profiles, exactly as is the case for the biasing and ambiguous words in the present study. An important point to note, however, is that semantic priming depends to a great extent on the prime and target being adjacent. Any intervening material reduces the effect considerably. For example, with just a single word intervening, Joordens and Besner (1992) reported a priming effect of less than 5 ms, and in an ERP study, Deacon and colleagues (Deacon, Grose-Fifer, Hewitt, Nagata, Shelley-Tremblay, & Yang, 2004) found no effect at all. Since several words intervened between the biasing word and the ambiguous word in Experiments 1 and 2, a more appropriate test of the spreading activation argument would be to include at least one intervening word between the prime and target.

EXPERIMENT 3 This experiment was designed to evaluate a spreading activation account of the biasing effects in Experiments 1 and 2, and in particular of the facilitated processing at the ambiguous word. As discussed above, this alternative account would hold that the lexically-driven meaning disambiguation apparent in these experiments occurs through a relatively passive, automatic process of spreading activation from one word (or word meaning) to another. For such a model to work in these sentences, the activation produced by the biasing word would have to persist across intervening words until the ambiguous word was

encountered. This activation would then have to augment the activation in one of the meanings of the ambiguous word, leading to faster recognition and meaning access. One way to test this hypothesis is to determine whether the biasing words prime the recognition of the ambiguous words, i.e., whether umpire primes bat, in a lexical decision experiment with an unrelated word intervening between them. If no effect is found, a spreading activation interpretation of the biasing effect in Experiments 1 and 2 would be undermined. That is, if priming fails to obtain when just one word intervenes between the prime and target, it is difficult to imagine that such priming would occur over the several words (again, 2.83 words on average) intervening between the biasing word and the ambiguous word in the experiments reported above.

Method Participants. Twenty-four (24) undergraduate psychology students at the University of Arizona participated in the experiment for course credit. Materials and design. The 48 ambiguous words in Experiments 1 and 2 were used as targets in a lexical decision experiment. For each target, two primes were selected. The related prime was the biasing word from Experiments 1 and 2, which again had a similar co-occurrence profile to the ambiguous word; the unrelated prime was the control word. Two counterbalanced lists were prepared, so that each target was preceded by the related prime in one list and by the unrelated prime in the other. Each item consisted of a prime, an unrelated intervening word, and a target (e.g., umpire -- service -- BAT). Also included were 48 orthographically legal nonwords (e.g., RIGER, LAVE, SHEROTE, SILE), which served as distractors. Nonword targets were preceded by an unrelated prime and an unrelated intervening word (e.g., awful -- physics -- RIGER). Procedure. Each trial consisted of four stimuli. The first was a warning signal (######) presented in the center of the screen. This was followed by the prime in lower case, then the unrelated intervening word also in lower case, and then the target in upper case. All four stimuli were presented for 500 ms, and each stimulus erased the previous one. Participants pressed a “Yes” button if the upper case letter string was a

word, otherwise they pressed a “No” button. The instructions emphasized that the first two stimuli would always be words, and that the task was to classify the last letter string (in upper case) as a word or a nonword as quickly and accurately as possible. Following their response, a feedback message was displayed, and then the next item was automatically triggered. The experiment began with 10 practice items.

Results and Discussion The dependent variables of interest were RT and error rate (ER). Items on which the subject made an error were not included in the RT analysis; items with RTs shorter than 300 ms and longer than 1500 ms were discarded from both the RT and ER analyses. This resulted in the removal of 8.94% of the data for the RT analysis and 1.13% of the data for the ER analysis. Linear mixed-effects models were applied to these data with prime type (related, unrelated) as a fixed effect and with subject and item as random factors. In the ER analysis, a binomial family was used. As in Experiments 1 and 2, the RT data were transformed using a reciprocal transformation prior to analysis. The mean RT for related items was 523 ms compared with 518 ms for unrelated items, for a negative priming effect of -5 ms, which was not significant (t = 1.09). There was also no significant ER difference between the related and unrelated conditions (related: 7.81%; unrelated: 10.01%; z = 1.37). This result could either be taken as a demonstration that semantic priming does not occur when the prime and target are non-adjacent (for discussion, see Joordens & Besner, 1992; Masson, 1995), or that the related primes (i.e., the biasing words in Experiments 1 and 2) were not sufficiently connected to the (lexically-ambiguous) targets to produce a priming effect. The latter of these explanations was investigated in a follow-up lexical decision experiment (N = 40) using a standard semantic priming paradigm, in which the prime (e.g., umpire/nurse) was presented for 500 ms, followed immediately by the target (e.g., BAT). Under these conditions, a strong 23 ms priming effect [related: 527 ms, unrelated: 550 ms] was observed (t = 3.36, p < .001) (again, for comparable results, see Lund et al., 1995). Regardless of


the reason for the lack of priming in Experiment 3, this finding suggests that a passive, spreading-activation-based account of the context effects in Experiments 1 and 2 is not a viable option.

EXPERIMENT 4 It could be argued that the previous experiment is not a convincing test of a spreading activation account for the biasing effects in Experiments 1 and 2. Recall that in these experiments, the biasing word facilitated the response to the ambiguous word, which was interpreted to mean that it simplified the process of selecting the appropriate meaning. This interpretation is supported by the garden-path effects in the biased-incongruent sentences in these experiments, which likely arose because a mismatch was detected between the meaning selected for the ambiguous word and subsequent disambiguating information. It is important to note, however, that although ambiguity resolution is required in sentence comprehension, this is not necessarily the case in a simple lexical decision task, where a response can be made without selecting a meaning for the target. That is, spreading-activation-based priming in the processing of ambiguous words may behave quite differently in a sentence compared to a list. Thus, failing to get a semantic priming effect in a simple lexical decision task is not conclusive. It would be more appropriate to test whether an automatic, spreading-activation-based priming effect can be observed between the biasing and ambiguous words in a sentence processing environment. This can easily be arranged in the G-maze task. We simply replace the biasing word with the neutral control word, and relocate it as close to the ambiguous word as possible, but now as an incorrect alternative. Thus, instead of the original example (1a), we present the neutral control sentence (1c) beginning The nurse tried to swallow the bat, but in the frame just prior to the frame containing bat, the word umpire is included as an incorrect alternative. 
It seems reasonable to suppose that both alternatives need to be processed in a G-maze (see below), so now we have a situation in which the reader can be exposed to bat immediately after umpire, even though this biasing word has not been integrated into the prior sentential context. If spreading activation alone is capable of biasing the interpretation of bat, then this should produce results comparable to those in Experiments 1 and 2 -- i.e., facilitation at the

ambiguous word (bat) and a garden-path effect at disambiguating information that is incongruent with the biased interpretation. This approach assumes of course that both alternatives on each frame are processed. In order to check whether this is the case, we ran a preliminary experiment with 39 sentences and 30 participants, comparing frames that contained a grammatical and an ungrammatical alternative with frames in which both alternatives were possible continuations. For example, after selecting the following sequence of words The professor that criticized the -- , participants were presented with either the frame student all (where only student is a possible continuation) or the frame student scientist (where both alternatives are acceptable continuations). If the other alternative is ignored once a possible continuation is identified, then there should be no difference between these conditions. However, the mean selection time for frames with two acceptable continuations (1234 ms) was 268 ms longer than the selection time for frames with a single continuation (966 ms). A mixed-effects regression analysis indicated that this effect was highly significant (t = 9.24, p < .001). This result can only be explained if both alternatives are processed on each frame in the G-maze task. Again, under a spreading-activation-based priming account, the biasing word should influence the interpretation of the ambiguous word, even though it has not been integrated into the sentence. That is, the mere act of reading the word should cause activation to spread from the biasing word to the ambiguous word, thereby raising its activation level under the appropriate meaning. As in Experiments 1 and 2, the presence of this biasing word should thus facilitate the response to the ambiguous word and should lead to a garden-path effect in biased-incongruent sentences.

Method Participants. Forty (40) undergraduate psychology students at the University of Arizona participated in the experiment for course credit. Materials and design. The 48 test sentences from Experiment 2 were modified for this experiment. In the biased sentences, the biasing word (e.g., umpire) was replaced with the control word (e.g., nurse). Each

biasing word was then moved to a position as close to the ambiguous word (e.g., bat) as possible (usually immediately preceding it or with only one frame intervening, for an average of .58 intervening words), as an incorrect alternative. It was necessary to adjust 13 of the original sentence sets (specifically, by adding an adverb to the region preceding the ambiguous word) in order to allow the biasing word to appear next to (or very close to) the ambiguous word. Except for this minor change, the control sentences were the same as in Experiment 2. An example item set for this experiment is shown in Figure 4. The same filler items as in Experiment 2 were used. Procedure. As in Experiment 2. (FIGURE 4 ABOUT HERE)

Results The analysis procedures followed those used in Experiments 1 and 2. Again, RTs to frames on which the subject made an error were discarded, as were RTs shorter than 300 ms and longer than 3000 ms. This resulted in the removal of 3.48% of the data for the measures reported below. As in Experiments 1 and 2, the RTs to the ambiguous word (e.g., bat), the first disambiguating word (e.g., congruent: handle / incongruent: wings), and the disambiguating region (the mean RT over the first disambiguating word and the two words immediately following it) were examined. Table 4 provides the mean RTs for the critical words/regions in each sentence type. (TABLE 4 ABOUT HERE) At the ambiguous word, the interaction of bias and congruence (t = 0.350) and the main effect of congruence (t = 0.741) were not significant. The main effect of bias, however, approached significance (t = 1.74, p = .081). In contrast to the results of Experiments 1 and 2, this trend suggests that the ambiguous word in biased sentences was responded to more slowly than in control sentences [biased: 976 ms, control: 949 ms]. Also in contrast to the findings for Experiments 1 and 2, no statistically reliable effects were found at the first disambiguating word (main effect of congruence: t = 1.17; main effect of bias: t = .562;
interaction: t = .395) or in the disambiguating region (main effect of congruence: t = .120; main effect of bias: t = 1.08; interaction: t = .540).
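The data-trimming rule used in these analyses (discarding RTs from error frames and RTs shorter than 300 ms or longer than 3000 ms) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Frame` record and the `trim_rts` function are hypothetical names rather than the authors' analysis code, the example data are invented, and the cutoffs are treated as inclusive bounds.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Frame:
    """One maze frame: response time in ms and whether the choice was correct."""
    rt_ms: float
    correct: bool


def trim_rts(frames: List[Frame], lo: float = 300.0, hi: float = 3000.0) -> Tuple[List[float], float]:
    """Return the retained RTs and the proportion of frames removed."""
    # Keep only correct responses whose RT lies within [lo, hi] (assumed inclusive).
    kept = [f.rt_ms for f in frames if f.correct and lo <= f.rt_ms <= hi]
    removed = 1 - len(kept) / len(frames) if frames else 0.0
    return kept, removed


# Invented example data: one RT too short, one error frame, one RT too long.
frames = [Frame(850, True), Frame(250, True), Frame(1200, False),
          Frame(3400, True), Frame(990, True)]
kept, removed = trim_rts(frames)
print(kept, round(removed, 2))
```

The proportion returned by `trim_rts` corresponds to the kind of figure reported as the percentage of data removed.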

Discussion When included as an incorrect alternative, the biasing word (umpire) did not facilitate the processing of the ambiguous word (bat) or otherwise appear to influence meaning selection for this word. First, there is no evidence that the biasing word had a facilitative effect on RTs to the ambiguous word. If anything, there was a trend in precisely the opposite direction. There was also no garden-path effect at the first disambiguating word or in the disambiguating region. In fact, at disambiguation, the trend for both biased and control sentences suggested longer RTs for congruent sentences. As discussed above, this trend might be attributable to the fact that the meanings of the ambiguous words in incongruent sentences had slightly higher RMF values than their counterparts in congruent sentences. Thus, this might be taken as another indication that, in the absence of biasing information, the more frequent meaning of the ambiguous word tends to be selected. The important implication of these results is that the processing and interpretation of an ambiguous word do not appear to be influenced by mere exposure to a related word. These findings therefore present further problems for a passive, spreading activation account of the biasing effects observed in Experiments 1 and 2. Moreover, taken together with the results of these earlier experiments, the present set of findings also indicate that a biasing word only influences meaning selection for an ambiguous word when it is incorporated into the developing sentence structure. This last point of course suggests a possible objection to the argument made above -- namely that spreading activation might work very differently between words in a sentence than it does between words that are not connected in this way. 
Such an objection, however, would seem to underscore the idea that the biasing effects observed in the earlier experiments cannot be due to automatic spreading activation at the lexical level -- a process that, by definition, should take place regardless of sentence context.

EXPERIMENT 5 In this final experiment, we take up the question of whether the biasing effects in Experiments 1 and 2 are somehow a product of the restricted processing mode that the maze task encourages. To this end, it is valuable to see whether comparable effects can be observed with a more natural task, in which the subject is free to view the input without any constraint. An obvious methodology for such an investigation is eye tracking. Therefore, in this experiment, the same sentences as in Experiment 2 were tested using an eye-tracking task.

Method Participants. Thirty-seven (37) undergraduate psychology students at the University of Arizona participated in the experiment for course credit. All participants had normal or corrected-to-normal vision. Materials and design. The materials were the same as those in Experiment 2 (see Appendix A). As in this earlier experiment, four counterbalanced lists were created, each with 48 experimental items and 48 filler items. In order to encourage comprehension, yes-no questions followed 12 experimental items (three questions per condition) and 24 fillers. Procedure. Sentences were presented as single lines of text (with standard capitalization and punctuation) on a 21-inch CRT monitor. Participants were instructed to read the sentences silently at their natural reading speed, making sure to comprehend well enough to accurately answer occasional yes-no questions. Eye movements were recorded from the right eye using a Dr. Bouis Monocular Oculometer, at a sampling rate of 200 Hz. The distance from the participant’s eye to the monitor was approximately 60 cm, which allowed for single character resolution. A bite plate and headrest were used to minimize head movements. The tracker was calibrated at the beginning of the experiment and then recalibrated after every four trials. Each trial began with a fixation mark (an asterisk) near the left edge of the computer screen. The sentence was then displayed, with its first character one space to the left of the fixation point. Participants pushed a button under their right hand as soon as they finished reading the sentence, at which point it was removed from the screen. If the item was not followed by a comprehension question, a string of dashes appeared on
the screen, signaling the participant to proceed to the next trial when ready by pressing the right button. If the sentence was followed by a comprehension question, participants answered “Yes” with the right button or “No” with the button under their left hand. Items were presented in a different pseudo-random order for each participant, such that experimental items were not presented on successive trials. The experiment began with eight practice items.

Results The data from one participant with an error rate greater than 25% on the comprehension questions were not included in the analyses. The data from trials with major tracker losses were also excluded. This led to the removal of 3.30% of the data from experimental trials. Experimental items were separated into five regions for analysis. The analysis regions were the sections of the sentences that were directly relevant to the hypotheses under investigation -- the ambiguous word and the words immediately following it as well as the words at and after disambiguation. These regions are indicated in the sample item set in Table 5 and were defined as follows -- region 1: the region containing the ambiguous word; region 2: the words preceding the disambiguating region; region 3: the first disambiguating word; region 4: the two words immediately following disambiguation; and region 5: the remaining words in the sentence. (TABLE 5 ABOUT HERE) For each region, several measures were calculated: first fixation duration, gaze duration, go-past time, (first-pass) regression rate, (first-pass) skipping rate, and total reading time (RT). First fixation duration refers to the duration of the first fixation (of at least 50 ms) in a region, provided that the region was fixated on during the reader’s initial pass through the sentence. Gaze duration is the sum of the fixation durations in a region (again, on the initial pass through the sentence) before leaving that region in either direction. Go-past time is the sum of the first-pass fixation durations after entering a region, before moving out of that region to the right. This measure includes regressive fixations to previous regions of the sentence. Regression rate refers to the proportion of trials on which the reader had a regressive eye
movement from a given region to a previous region during the initial pass through the sentence. Skipping rate is the proportion of trials on which the region was skipped over (i.e., not fixated on) during first-pass reading. Total RT is simply the sum of all fixation durations in the region (before the subject terminated the display of the item). Participants' mean comprehension question accuracy was 90.1%. Table 5 presents the means for each of the dependent measures by condition and region. The analysis procedures for the reading time measures followed those of Experiments 1, 2, and 4. In the present experiment, however, the log transformation provided a better approximation of the normal distribution across these measures, so all reading time data were log transformed prior to analysis. Linear mixed-effects models were fitted to the data for the relevant measures in each region, with bias (biased, control) and congruence (congruent, incongruent) as fixed, repeated-measures factors, and with subject and item as random factors. In a preliminary analysis of each measure, an interaction term between the two fixed effects was included. If this effect was not significant (two-tailed p < .05, based on the z-distribution) or approaching significance (two-tailed p < .1), the interaction term was dropped from the model, and a new analysis that included only the main effects of the factors was carried out. Following the ER analysis in Experiment 3, the analyses of the binary variables -- regression rate and skipping rate -- used a binomial family to fit each model. Table 6 presents the results of these analyses. For ease of exposition, only results that were statistically significant or that approached significance are reported below.
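The eye-movement measures defined above can be made concrete with a short Python sketch that derives them for one region from a single trial's fixation sequence. The function name `region_measures`, the `(region, duration)` input format, and the example trial are hypothetical and are not taken from the authors' software. The sketch also assumes that fixations shorter than 50 ms are ignored for every measure, whereas the text states that threshold only for first fixation duration.

```python
from typing import Dict, List, Tuple

Fixation = Tuple[int, int]  # (region index, fixation duration in ms), in temporal order


def region_measures(fixations: List[Fixation], region: int, min_ms: int = 50) -> Dict[str, object]:
    """Compute per-trial reading measures for one analysis region."""
    # Assumption: fixations shorter than min_ms are dropped for all measures.
    fixes = [(r, d) for r, d in fixations if d >= min_ms]
    total = sum(d for r, d in fixes if r == region)

    first = next((i for i, (r, _) in enumerate(fixes) if r == region), None)
    # First pass: the region is fixated before any region to its right.
    first_pass = first is not None and all(r < region for r, _ in fixes[:first])
    if not first_pass:
        return {"skipped": True, "first_fixation": None, "gaze": None,
                "go_past": None, "regression": None, "total": total}

    # Gaze duration: consecutive fixations in the region before leaving it.
    i, gaze = first, 0
    while i < len(fixes) and fixes[i][0] == region:
        gaze += fixes[i][1]
        i += 1
    # First-pass regression: the first exit from the region goes leftward.
    regression = i < len(fixes) and fixes[i][0] < region

    # Go-past time: all fixations from first entry until a region to the right is fixated.
    j, go_past = first, 0
    while j < len(fixes) and fixes[j][0] <= region:
        go_past += fixes[j][1]
        j += 1

    return {"skipped": False, "first_fixation": fixes[first][1], "gaze": gaze,
            "go_past": go_past, "regression": regression, "total": total}


# Invented trial: the reader regresses from region 2 to region 1, returns, then moves on.
trial = [(1, 200), (2, 220), (2, 180), (1, 150), (2, 100), (3, 250), (2, 120)]
m = region_measures(trial, region=2)
print(m)
```

Averaging `regression` and `skipped` over trials gives the regression and skipping rates. The duration measures would be log transformed before model fitting, as described above.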
(TABLE 6 ABOUT HERE) At the ambiguous word (region 1), the only effect that approached significance was the main effect of bias for first fixation duration (t = 1.66, p = .097), indicating a trend toward shorter first fixations in biased sentences (biased: 234 ms, control: 240 ms). Comparably, in the region immediately following the ambiguous word (region 2), there was a significant main effect of bias for regression rate (z = 2.25, p = .025), with more first-pass regressions in control sentences than in biased sentences (biased: .105, control: .140). This was accompanied by a main effect of bias for go-past time that approached significance (t = 1.83, p = .067), revealing a trend toward shorter reading times for biased sentences
(biased: 436 ms, control: 460 ms). For first fixation duration, there was also a significant interaction of bias and congruence (t = 2.02, p = .043). Although this interaction is difficult to interpret, the other results in these regions suggest relative reading ease at and immediately after the ambiguous word in biased sentences. At the first disambiguating word (region 3), the effect of congruence was significant for first fixation duration (t = 1.97, p = .049) and approached significance for total RT (t = 1.87, p = .061). Although this effect indicated that incongruent sentences were generally read more slowly than their congruent counterparts (first fixation duration -- congruent: 238 ms, incongruent: 247 ms; total RT -- congruent: 334 ms, incongruent: 361 ms), this difference appeared to be stronger under the biased condition. Indeed, tests of the simple effect of congruence revealed that for total RT, there was a statistically reliable difference for biased sentences (t = 2.91, p = .004) (biased-congruent: 327 ms, biased-incongruent: 365 ms), but not for control sentences (t < 1) (control-congruent: 340 ms, control-incongruent: 357 ms). Comparably, for first fixation duration, only the difference between the biased sentences approached significance (t = 1.74, p = .082; control: t = 1.08) (biased-congruent: 236 ms, biased-incongruent: 246 ms, control-congruent: 240 ms, control-incongruent: 247 ms). Of course, because the interaction of bias and congruence was not significant for either of these measures, it is important to interpret these simple effects with some caution. Having said that, the inflated reading times for biased-incongruent sentences are nevertheless consistent with a garden-path effect at the first disambiguating word in these sentences. Two other trends at this word are also worthy of note.
The effect of bias approached significance for skipping rate (t = 1.75, p = .081), suggesting that the first disambiguating word tended to be skipped more often in biased sentences generally. The interaction of bias and congruence also approached significance for regression rate (t = 1.68, p = .093), but this trend was not in the predicted direction. In the region immediately following the first disambiguating word (region 4), the predicted interaction of bias and congruence was significant for go-past time (t = 2.51, p = .012), reflecting a 55 ms garden-path effect for biased-incongruent sentences (see Figure 5). Also in this region, there was a
significant main effect of congruence for skipping rate (z = 2.64, p = .008), with this region being skipped more often in congruent sentences than in incongruent sentences. The garden-path effect in this region also appeared to spill over into the final region of the sentence (region 5). In this final region, an interaction consistent with inflated reading times for the biased-incongruent sentences was statistically reliable for total RT (t = 2.31, p = .008) (garden-path effect: 54 ms) and approached significance for gaze duration (t = 1.65, p = .099) (garden-path effect: 40 ms). The latter of these findings is somewhat surprising in light of the fact that clear garden-path effects for gaze duration were not found either at the first disambiguating word (region 3) or in the immediately following region (region 4). In a follow-up analysis for this measure, the first disambiguating word was combined with the immediately following region to make a larger disambiguating region -- a region that was exactly the same as the disambiguating region analyzed for the maze tasks in Experiments 1, 2, and 4. In this region, the interaction of bias and congruence was significant (t = 2.08, p = 0.038; main effect of bias: t = 1.83, p = 0.067; main effect of congruence: t < 1), again indicating a garden-path effect for biased-incongruent sentences (biased-congruent: 520 ms, biased-incongruent: 578 ms, control-congruent: 562 ms, control-incongruent: 552 ms; garden-path effect: 68 ms) (see Figure 6). (FIGURE 5 ABOUT HERE) (FIGURE 6 ABOUT HERE)
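The 68 ms garden-path effect reported for the combined disambiguating region is consistent with an interaction contrast on the four condition means: the congruence cost in biased sentences minus the congruence cost in control sentences. The formula is not spelled out in the text, so this reading is an assumption, and the function name `garden_path_effect` is hypothetical. A minimal Python sketch using the gaze-duration means given above:

```python
from typing import Dict, Tuple

Condition = Tuple[str, str]  # (bias, congruence)


def garden_path_effect(means: Dict[Condition, float]) -> float:
    """Interaction contrast: congruence cost under bias minus congruence cost in controls."""
    biased_cost = means[("biased", "incongruent")] - means[("biased", "congruent")]
    control_cost = means[("control", "incongruent")] - means[("control", "congruent")]
    return biased_cost - control_cost


# Gaze-duration means (ms) for the combined disambiguating region, as reported in the text.
gaze_means = {
    ("biased", "congruent"): 520,
    ("biased", "incongruent"): 578,
    ("control", "congruent"): 562,
    ("control", "incongruent"): 552,
}
effect = garden_path_effect(gaze_means)
print(effect)  # (578 - 520) - (552 - 562) = 68
```

On this reading, the contrast is large mainly because incongruent disambiguation is costly only when a biasing word precedes the ambiguous word.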

Discussion The overall pattern of results from this experiment matches those obtained in the maze task experiments, Experiments 1 and 2. As in these earlier experiments, the results indicated (i) facilitated processing of the ambiguous word when the preceding context included a biasing word and (ii) garden-path effects when subsequent disambiguating information was incongruent with the biased meaning of the ambiguous word (i.e., in biased-incongruent sentences). These results therefore suggest that the biasing effects observed in Experiments 1 and 2 were not merely artifacts of the maze procedure, but rather that
they reflect more general processes involved in lexical ambiguity resolution during sentence comprehension. There are, however, some discrepancies in terms of the timing and localization of the effects among these experiments. First, recall that both maze experiments revealed shorter response times at the ambiguous word in biased sentences -- a result that was particularly robust in the G-maze experiment (Experiment 2). In this eye-tracking experiment, however, facilitated processing of the ambiguous word was not clearly indicated at this word itself (where there was only a trend indicating shorter first fixation durations in biased sentences), but rather at the region just after this word -- primarily by a lower rate of first-pass regressive eye movements in biased sentences. On the surface, this would also appear to conflict with previous eye-tracking studies that have shown facilitated reading times (in terms of first fixation duration and/or gaze duration) for balanced ambiguous words when preceded by constraining contexts (Binder & Morris, 1995; Duffy et al., 1988; Rayner & Frazier, 1989). However, several methodological differences make straightforward comparison with these results somewhat difficult. For instance, Binder and Morris (1995) and Duffy and colleagues (1988) found these reading time effects by comparing balanced ambiguous words against unambiguous control words, predicting no difference between these word types in constraining contexts. This contrasts with the present study in which the ambiguous word was used as its own control and in which reading differences for this word were predicted depending on the nature of the preceding context (biasing or control). 
A comparable design was used in Rayner and Frazier (1989), but in their study, the position of the balanced ambiguous word was very different depending on whether it was preceded by constraining context -- occurring sentence-medially when this context was available and sentence-initially when it was not. In the present experiment, however, the position of the ambiguous word was held constant, regardless of context type. With regard to the timing/localization of the garden-path effect at disambiguating information, the results of the eye-tracking experiment are comparable to those obtained in the L-maze task. As in Experiment 1, the clearest indication of this effect was revealed across the entire disambiguating region and, to some extent, from initial disambiguation until the end of the sentence. This pattern of results
contrasts with that of the G-maze experiment (Experiment 2), in which this garden-path effect was most strongly indicated at the first disambiguating word. In the eye-tracking experiment, a garden-path effect at this word was only suggested by the simple effects of congruence under the total time measure (i.e., total time: biased-incongruent > biased-congruent; control-incongruent ≈ control-congruent). It is important to note that a very similar pattern of results was found in Forster et al.’s (2009, Experiment 3) examination of lexical ambiguity using the G-maze task. Specifically, whereas Forster et al.’s experiment revealed a robust garden-path effect at the first disambiguating word, the Dopkins et al. (1992) eye-tracking experiment on which it was based revealed this effect only for total time, but not for first-pass reading time measures such as gaze duration. In fact, similar to the present eye-tracking investigation, Dopkins and colleagues’ experiment revealed a garden-path effect for first-pass reading time measures only across the entire disambiguating region of the sentences (2-7 words in length; for other eye-tracking studies showing distributed/delayed disambiguation effects for sentences involving lexical ambiguity, see Duffy et al., 1988; Frazier & Rayner, 1990; Rayner & Duffy, 1986; Rayner & Frazier, 1989; Sheridan et al., 2009). In light of these previous findings, it is perhaps not surprising that the present eye-tracking experiment revealed somewhat delayed and distributed effects compared with those obtained in its G-maze counterpart. Indeed, these findings appear to provide further confirmation that the G-maze task offers an especially sensitive measure of local processing costs associated with the incremental integration of words into developing sentence representations (see also Witzel et al., 2012).

GENERAL DISCUSSION Taken together, these experiments show that meaning selection for an ambiguous word can be biased by the prior presentation of a word with which it shares a strong co-occurrence relationship under one of its meanings, despite the absence of plausibility support for this biased interpretation. In the initial L-maze and G-maze experiments (Experiments 1 and 2, respectively), the interpretation of the ambiguous word was clearly influenced when a distributionally related biasing word appeared in the preceding context. In both experiments, RTs to the ambiguous word were faster in these biased sentences. There
were also inflated RTs to subsequent disambiguating words that conflicted with the biased interpretation of the ambiguous lexical item (i.e., to disambiguating words in biased-incongruent sentences). Crucially, and in contrast to previous studies, these biasing effects were obtained despite the absence of plausibility support for the biased meaning of the ambiguous word. Subsequent experiments further indicated that these effects are not likely due to a passive process of spreading-activation-based priming. Such priming effects largely depend on the prime and target being adjacent, and as shown in Experiment 3, recognition of the ambiguous word was not affected by the biasing word when an unrelated word intervened between them. Moreover, in Experiment 4, when the biasing word was included as an incorrect alternative in a G-maze task, it did not facilitate the processing of the ambiguous word or otherwise influence meaning selection for this word. Thus, even in a task that engages sentence-level processing, the mere presentation of the biasing word did not affect the processing of the ambiguous word -- a finding that again runs contrary to a spreading activation account of the biasing effects observed in this study. Finally, because the eye-tracking experiment (Experiment 5) revealed biasing effects comparable to those obtained in the corresponding maze task experiments (Experiments 1 and 2), it is unlikely that these effects are artifacts of the incremental mode of processing that is encouraged by the L-maze and that is required by the G-maze. It should also be emphasized that this study not only underscores the importance of intralexical connections in online lexical ambiguity resolution, but also proposes a mechanism whereby contextual information could be used quickly and reliably during sentence comprehension. As discussed above, this is something that is conspicuously absent from models of lexical ambiguity resolution in sentence contexts to date.
Specifically, it is proposed that the same co-occurrence-based lexical links that account for selective verification in the turple experiments (Forster, 2006; Forster & Hector, 2002) act as the basis for a first-pass, heuristic method of resolving lexical ambiguities. As detailed above, the turple effect in a semantic categorization task (i.e., nonword neighbors of category exemplars such as turple in an ANIMAL categorization task show interference, but nonword neighbors of non-exemplars such as tabric do not) indicates that some semantic information must be available very early in the word recognition
process. This information is assumed to be provided by lexical links to semantic fields based on lexical co-occurrence statistics (Forster, 2006). Applied to the problem of lexical ambiguity resolution in sentence contexts, this model posits that when an ambiguous word is accessed, its lexical entry is found to have links to distinct semantic fields. The lexical processor can then search the links associated with words in the preceding context in an attempt to find a matching link. When matching links are detected, the meaning for the ambiguous word in the semantic field to which they point becomes activated and available to higher-order comprehension systems, while other meanings remain relatively dormant. The results reported above are consistent with the basic predictions of this Links-based heuristic. Again, online meaning selection was clearly biased by the prior presentation of a word that is distributionally similar to the ambiguous word under one of its meanings. In terms of the proposed model, this can be interpreted to indicate that the comprehension system is indeed able to exploit shared links to co-occurrence-based semantic fields during lexical ambiguity resolution. Furthermore, these effects were obtained even though the biased interpretation of the ambiguous word was not supported by overall plausibility. This is again consistent with the proposed model, in that low-level intralexical connections appeared to facilitate meaning selection for ambiguous words independently of information provided by higher-order considerations of plausibility or pragmatics. One possible complication for this account is the fact that only words that have been integrated into the sentence structure appear to bias the meaning selection process for ambiguous words, as shown in Experiment 4.
This raises an interesting problem for a strictly modular system, in which the informationally encapsulated lexical module (e.g., Forster, 1979) knows nothing about sentence-level activities. Such a system could not possibly track which words had been integrated into the developing sentence structure. How then would a co-occurrence-based heuristic for lexical ambiguity resolution be able to consider links from only recently-integrated words? One possible answer to this question relates to necessary limits on the search space. The search for a matching link must be confined to those associated with a relatively small set of words (otherwise, it would quickly become useless), and it is reasonable to suppose that it is limited to elements that are actively stored in (verbal) working memory. Words that
have been recently integrated into the sentence must be stored in this memory system so that they can contribute to subsequent syntactic and semantic processing (see e.g., Lewis, Vasishth, & Van Dyke, 2006). In effect, the modular lexical processor can be conceived of as simply keeping track of the links for these words and making this data available so that higher-level systems can use that information during the meaning selection process. Another possible criticism of this model is that while we have attributed the relevant intralexical connections in this meaning selection heuristic to distributional similarities between words, we have not explicitly ruled out other sources of lexical association or semantic relatedness as the basis of these links. This focus on co-occurrence similarities was primarily driven by theoretical considerations. That is, these similarities not only provide easily quantifiable indicators of meaning relatedness, but they are also based on distributional information that would (arguably) be accessible to the lexical processor. Furthermore, as detailed above, this co-occurrence information has also been used to explain selective verification in semantic categorization under the Links model -- the model that provides the basis for the lexical ambiguity heuristic proposed in this study. For these conceptual reasons, distributional similarities between words were of particular interest. It remains to be determined empirically, however, whether this is the best way to model the lexical connections that appear to bias meaning selection for ambiguous words. Settling this issue would take us far beyond the scope of the present paper. Another potential criticism of this model is that despite our efforts to minimize any effect of relative plausibility, it is nevertheless the overall plausibility of the developing proposition that biases meaning selection for the ambiguous words in these sentences. 
To take a specific example, the proposal would be that in the sentence beginning The umpire tried to swallow the bat, the occurrence of the biasing word umpire in the subject of the sentence establishes the setting as that of a baseball game, and therefore bat is more likely to refer to a baseball bat than a flying bat. However, this explanation raises the question of why the biasing word in particular would set the parameters for all subsequent plausibility evaluations. Again with reference to the specific example above, it is unclear why the word umpire would force the system to interpret all the words that followed in terms of a specific baseball context and to
essentially ignore the plausibility considerations introduced by other words. Surely, the word swallow, in combination with its human (umpire or nurse) subject, would introduce its own suggestions for plausible continuations. Among those would be possible objects for this verb, which would presumably include things of suitable size, consistency, and type (e.g., food) for human consumption. A baseball bat would clearly not be included in this set of possible objects. Further experimentation is of course necessary in order to tease apart the influence of lexical connections and global plausibility, as well as to determine how these information sources interact during meaning selection for ambiguous words. It would be interesting for instance to examine sentences in which co-occurrence profiles suggest one interpretation, while plausibility suggests another. An example of such a sentence might be something like the following: The umpire tried to trap the bat, but it rolled/flew away. This type of experiment would provide indications of potential conflict between these sources of biasing information, as well as of the relative timing and strength of their influences on meaning selection. One indication of the enduring influence of lexical connections on this process comes from a sentence-completion experiment by Reder (1983). In this experiment, participants produced more errors on continuations of sentences like "The plumber, who repaired the sewer, lit his pipe..." than on neutral controls ("The groom, who took the message, lit his pipe..."). That is, subjects appeared to be led down the garden path by lexical connections despite local plausibility support (in this case, provided by the verb “lit”) for the appropriate meaning for the ambiguous word. Another important question to consider is whether co-occurrence-based links influence other aspects of lexical processing during sentence comprehension. 
In particular, it is necessary to consider whether this heuristic operates during meaning selection for unambiguous words as well. One possibility is that ambiguity is always present, since even unambiguous words may have several different senses (polysemy), and hence co-occurrence information is always relevant. Alternatively, it could be that polysemy is treated quite differently from homonymy (two words spelled the same way), and sense selection occurs much later, at a higher level (see e.g., Frazier & Rayner, 1990), in which case co-occurrence information would be irrelevant. Nevertheless, the proposed process that accesses these links
is a heuristic procedure, so it would likely search for matching pointers to semantic fields whenever the links for a word become available. It is important to note that this would confer little, if any, processing advantage on words that do not have multiple, distinct meanings. That is, if a link to only one semantic field is available, that link could be reliably followed regardless of whether a matching link was detected in the prior context. This model, therefore, would not predict facilitated processing across the board for words that are semantically or associatively related to words in the preceding context -- an idea that would seem to comport well with the generally weak and inconsistent indications of semantic or associative priming in sentence contexts (see e.g., Camblin, Gordon, & Swaab, 2007 and references therein).
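The Links-based heuristic described in this section can be illustrated with a toy Python sketch. Everything concrete in it is an assumption made for illustration: the `LINKS` table, the semantic field labels, the `window` parameter standing in for the limits of verbal working memory, and the function name `select_meaning` are all hypothetical. The proposal itself specifies no implementation, and the links are assumed to derive from co-occurrence statistics rather than from a hand-built table.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy lexicon: each word's links point to one or more semantic fields.
LINKS: Dict[str, Set[str]] = {
    "bat": {"BASEBALL", "ANIMAL"},
    "umpire": {"BASEBALL"},
    "nurse": {"MEDICINE"},
    "swallow": {"EATING"},
}


def select_meaning(word: str, integrated: List[str],
                   links: Dict[str, Set[str]] = LINKS, window: int = 4) -> Optional[str]:
    """Return the semantic field selected for `word`, or None if no bias is found.

    `integrated` holds the words already integrated into the sentence, most
    recent last; only the last `window` of them are searched (an assumption).
    """
    fields = links.get(word, set())
    if not fields:
        return None
    if len(fields) == 1:
        # A single link can be followed regardless of context, so context adds no advantage.
        return next(iter(fields))
    for prior in reversed(integrated[-window:]):
        shared = fields & links.get(prior, set())
        if shared:
            return sorted(shared)[0]  # matching link found: that meaning is activated
    return None  # no match: selection is left to other factors, e.g. meaning frequency


print(select_meaning("bat", ["umpire", "tried", "swallow"]))
print(select_meaning("bat", ["nurse", "tried", "swallow"]))
```

In this sketch a word that was merely displayed but never integrated, such as the incorrect alternative in Experiment 4, would not appear in `integrated` and so could not bias selection.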

Methodological implications Finally, it is important to note the potential methodological implications of the present study. Consistent with previous studies (Forster et al., 2009; Witzel et al., 2012), the maze task experiments reported above, and the G-maze version in particular, appear to provide precisely localized estimates of online processing time differences. Indeed, the G-maze experiment (Experiment 2) revealed the predicted effects of the biasing context at the ambiguous word itself and at the first disambiguating word. The eye-tracking experiment (Experiment 5), by contrast, showed a biasing effect just after the ambiguous word and a rather distributed garden-path effect in the disambiguating region. One interpretation of these delayed/distributed effects is that individual readers differ in the way they deal with ambiguity in normal reading. Some may pause as soon as an ambiguous item or disambiguating information is encountered, and proceed only when the necessary (re)analysis has been completed, which would produce very localized effects. This could be described as “careful” reading. Others may move past these words, perhaps looking for further information that may clarify the interpretation of the sentence. If no such information is found, a regressive eye movement may be triggered so that previous information can be reexamined. Because the point in the sentence at which this occurs may differ across individuals, a distributed effect is to be expected. This could be described as “risky” reading. Still other readers may not
recognize the problem posed by the ambiguous word and may continue to the end of the sentence without having developed a clear understanding of its content, but one that is nevertheless sufficient to answer a subsequent comprehension question correctly. Given these individual differences, it is not surprising that processing costs are often not tied to particular words. But in the maze task, and in the G-maze task in particular, these differences in reading style are brought under experimental control, and all readers are forced to adopt the same (or at least a more comparable) strategy of incremental parsing and semantic integration. Admittedly, for some readers this may be a highly unnatural strategy, but by standardizing the approach to reading, the maze task can isolate processing time differences precisely to predicted words/regions. Another factor worth considering is that the maze task does not require comprehension questions to ensure that sentences are being understood. The G-maze task in particular is almost impossible unless the reader is actively interpreting the sentence as it unfolds. While this may seem like a mere convenience, this task characteristic points to a more fundamental issue -- that the difficulty of comprehension questions is a relatively uncontrolled variable in eye-tracking and self-paced reading studies. Some experiments may have very difficult questions, which forces slow, careful processing of the sentence, while others may have very simple questions, which allows for shallower processing (for more on the influence of comprehension questions on self-paced reading times and eye-movement patterns, see e.g., Swets, Desmet, Clifton, & Ferreira, 2008; Wotschack & Kliegl, in press). Again, the maze task brings this variable under control. All readers have to process the sentence carefully enough to perform the task accurately.
And while there may be some variation in how carefully participants choose to process the sentence, this variable would appear to be better controlled across different experiments and laboratories with the maze procedure.

NOTE 1

The complete set of materials for this and all other experiments (including filler items) will be provided upon request.

REFERENCES

Baayen, R.H. (2008a). Analyzing linguistic data: A practical introduction to statistics. Cambridge, UK: Cambridge University Press.
Baayen, R.H. (2008b). languageR: Data sets and functions with "Analyzing Linguistic Data: A practical introduction to statistics". R package version 0.953.
Baayen, R.H., Davidson, D.J., & Bates, D.M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390-412.
Binder, K.S., & Morris, R.K. (1995). Eye movements and lexical ambiguity resolution: Effects of prior encounter and discourse topic. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 1186-1196.
Binder, K.S., & Rayner, K. (1998). Contextual strength does not modulate the subordinate bias effect: Evidence from eye fixations and self-paced reading. Psychonomic Bulletin and Review, 5, 271-276.
Burgess, C., & Lund, K. (1997). Modeling parsing constraints with high-dimensional context space. Language and Cognitive Processes, 12, 177-210.
Camblin, C.C., Gordon, P.C., & Swaab, T.Y. (2007). The interplay of discourse congruence and lexical association during sentence processing: Evidence from ERPs and eye tracking. Journal of Memory and Language, 56, 103-128.
Collins, A.M., & Loftus, E.F. (1975). A spreading-activation theory of semantic processing. Psychological Review, 82, 407-428.
Cousineau, D. (2005). Confidence intervals in within-subject designs: A simpler solution to Loftus and Masson’s method. Tutorials in Quantitative Methods for Psychology, 1, 42-45.
Deacon, D., Grose-Fifer, J., Hewitt, S., Nagata, M., Shelley-Tremblay, J., & Yang, C-M. (2004). Physiological evidence that a masked unrelated intervening item disrupts semantic priming: Implications for theories of semantic representation and retrieval models of semantic priming. Brain and Language, 89, 38-46.
Dopkins, S., Morris, R.K., & Rayner, K. (1992). Lexical ambiguity and eye fixations in reading: A test of competing models. Journal of Memory and Language, 31, 461-476.
Duffy, S.A., Kambe, G., & Rayner, K. (2001). The effect of prior disambiguating context on the comprehension of ambiguous words: Evidence from eye movements. In D. Gorfein (Ed.), On the consequences of meaning selection: Perspectives on resolving lexical ambiguity (pp. 27-43). Washington, DC: American Psychological Association.
Duffy, S.A., Morris, R.K., & Rayner, K. (1988). Lexical ambiguity and fixation times in reading. Journal of Memory and Language, 27, 429-446.
Fodor, J.A. (1983). Modularity of mind. Cambridge, MA: MIT Press.
Foltz, P.W., Kintsch, W., & Landauer, T.K. (1998). The measurement of textual coherence with Latent Semantic Analysis. Discourse Processes, 25, 285-307.
Forster, K.I. (1979). Levels of processing and the structure of the language processor. In W.E. Cooper & E.C.T. Walker (Eds.), Sentence Processing: Psycholinguistic essays presented to Merrill Garrett (pp. 27-85). Hillsdale, NJ: Erlbaum.
Forster, K.I. (2006). Early activation of category information in visual word recognition: More on the turple effect. The Mental Lexicon, 1, 35-58.
Forster, K.I. (2010). Using a maze task to track lexical and sentence processing. The Mental Lexicon, 5, 347-357.
Forster, K.I., & Bednall, E.S. (1976). Terminating and exhaustive search in lexical access. Memory and Cognition, 4, 53-61.
Forster, K.I., & Forster, J.C. (2003). DMDX: A Windows display program with millisecond accuracy. Behavior Research Methods, Instruments, and Computers, 35, 116-124.
Forster, K.I., Guerrera, C., & Elliot, L. (2009). The Maze Task: Measuring forced incremental sentence processing time. Behavior Research Methods, 41, 163-171.
Forster, K.I., & Hector, J. (2002). Cascaded versus noncascaded models of lexical and semantic processing: The turple effect. Memory and Cognition, 30, 1106-1116.
Frazier, L., & Rayner, K. (1990). Taking on semantic commitments: Processing multiple meanings vs. multiple senses. Journal of Memory and Language, 29, 181-200.
Gorfein, D.S. (1989). Resolving semantic ambiguity. New York: Springer-Verlag.
Gorfein, D.S. (2001). On the consequences of meaning selection: Perspectives on resolving lexical ambiguity. Washington, DC: American Psychological Association.
Hirst, G. (1988). Resolving lexical ambiguity computationally with spreading activation and polaroid words. In S.L. Small, G.W. Cottrell, & M.K. Tanenhaus (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence (pp. 73-108). San Mateo, CA: Morgan Kaufmann Publishers.
Hogaboam, T.W., & Perfetti, C.A. (1975). Lexical ambiguity and sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 14, 265-274.
Joordens, S., & Besner, D. (1992). Priming effects that span an intervening unrelated word: Implications for models of memory representation and retrieval. Journal of Experimental Psychology: Learning, Memory, and Cognition, 18, 483-491.
Joseph, H.S.S.L., Liversedge, S.P., Blythe, H.I., White, S.J., Gathercole, S.E., & Rayner, K. (2008). Children’s and adults’ processing of anomaly and implausibility: Evidence from eye movements. Quarterly Journal of Experimental Psychology, 61, 708-723.
Kintsch, W. (2000). Metaphor comprehension: A computational theory. Psychonomic Bulletin and Review, 7, 257-266.
Landauer, T.K., & Dumais, S.T. (1997). A solution to Plato's problem: The Latent Semantic Analysis theory of acquisition, induction, and representation of knowledge. Psychological Review, 104, 211-240.
Lewis, R.L., Vasishth, S., & Van Dyke, J.A. (2006). Computational principles of working memory in sentence comprehension. Trends in Cognitive Sciences, 10, 447-454.
Lund, K., & Burgess, C. (1996). Producing high-dimensional semantic spaces from lexical co-occurrence. Behavior Research Methods, Instruments, and Computers, 28, 203-208.
Lund, K., Burgess, C., & Atchley, R.A. (1995). Semantic and associative priming in high dimensional semantic space. In the Proceedings of the Seventeenth Annual Conference of the Cognitive Science Society (pp. 660-665). Hillsdale, NJ: Erlbaum.
Masson, M.E.J. (1995). A distributed memory model of semantic priming. Journal of Experimental Psychology: Learning, Memory, and Cognition, 21, 3-23.
Nicol, J.L., Forster, K.I., & Veres, C. (1997). Subject-verb agreement processes in comprehension. Journal of Memory and Language, 36, 569-587.
Onifer, W., & Swinney, D.A. (1981). Accessing lexical ambiguities during sentence comprehension: Effects of frequency of meaning and contextual bias. Memory and Cognition, 9, 225-236.
Pecher, D., Zeelenberg, R., & Wagenmakers, E.J. (2005). Enemies and friends in the neighborhood: Orthographic similarity effects in semantic categorization. Journal of Experimental Psychology: Learning, Memory, and Cognition, 31, 121-128.
Pinheiro, J.C., & Bates, D.M. (2000). Mixed-effects models in S and S-PLUS. New York: Springer.
Prather, P.A., & Swinney, D.A. (1988). Lexical processing and ambiguity resolution: An autonomous process in an interactive box. In S.L. Small, G.W. Cottrell, & M.K. Tanenhaus (Eds.), Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence (pp. 289-310). San Mateo, CA: Morgan Kaufmann Publishers.
R Development Core Team. (2009). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria.
Rayner, K., & Duffy, S.A. (1986). Lexical complexity and fixation times in reading: Effects of word frequency, verb complexity, and lexical ambiguity. Memory and Cognition, 14, 191-201.
Rayner, K., & Frazier, L. (1989). Selection mechanisms in reading lexically ambiguous words. Journal of Experimental Psychology: Learning, Memory, and Cognition, 15, 779-790.
Rayner, K., Warren, T., Juhasz, B.J., & Liversedge, S.P. (2004). The effect of plausibility on eye movements in reading. Journal of Experimental Psychology: Learning, Memory, and Cognition, 30, 1290-1301.
Reder, L.M. (1983). What kind of pitcher can a catcher fill? Effects of priming in sentence comprehension. Journal of Verbal Learning and Verbal Behavior, 22, 189-202.
Rodd, J.M. (2004). When do leotards get their spots? Semantic activation of lexical neighbors in visual word recognition. Psychonomic Bulletin and Review, 11, 434-439.
Rohde, D.L.T., Gonnerman, L.M., & Plaut, D.C. (2005). An improved method for deriving word meaning from lexical co-occurrence. Available on the web at http://tedlab.mit.edu/~dr/Papers/RohdeGonnermanPlaut-COALS.pdf.
Seidenberg, M.S., Tanenhaus, M.K., Leiman, J.M., & Bienkowski, M. (1982). Automatic access of the meanings of ambiguous words in context: Some limitations of knowledge-based processing. Cognitive Psychology, 14, 489-537.
Sheridan, H., Reingold, E.M., & Daneman, M. (2009). Using puns to study contextual influences on lexical ambiguity resolution: Evidence from eye movements. Psychonomic Bulletin and Review, 16, 875-881.
Simpson, G.B. (1981). Meaning dominance and semantic context in the processing of lexical ambiguity. Journal of Verbal Learning and Verbal Behavior, 20, 120-136.
Simpson, G.B. (1984). Lexical ambiguity and its role in models of word recognition. Psychological Bulletin, 96, 316-340.
Simpson, G.B. (1994). Context and the processing of ambiguous words. In M.A. Gernsbacher (Ed.), Handbook of Psycholinguistics (pp. 359-374). San Diego: Academic Press.
Simpson, G.B., & Burgess, C. (1985). Activation and selection processes in the recognition of ambiguous words. Journal of Experimental Psychology: Human Perception and Performance, 11, 28-39.
Small, S.L., Cottrell, G.W., & Tanenhaus, M.K. (1988). Lexical ambiguity resolution: Perspectives from psycholinguistics, neuropsychology, and artificial intelligence. San Mateo, CA: Morgan Kaufmann Publishers.
Swets, B., Desmet, T., Clifton, C., & Ferreira, F. (2008). Underspecification of syntactic ambiguities: Evidence from self-paced reading. Memory and Cognition, 36, 201-216.
Swinney, D.A. (1979). Lexical access during sentence comprehension: (Re)consideration of context effects. Journal of Verbal Learning and Verbal Behavior, 18, 645-659.
Tabossi, P., Colombo, L., & Job, R. (1987). Accessing lexical ambiguity: Effects of context and dominance. Psychological Research, 49, 161-167.
Tabossi, P. (1988). Accessing lexical ambiguity in different types of sentential contexts. Journal of Memory and Language, 27, 324-340.
Tanenhaus, M.K., Leiman, J.M., & Seidenberg, M.S. (1979). Evidence for multiple stages in the processing of ambiguous words in syntactic contexts. Journal of Verbal Learning and Verbal Behavior, 18, 427-440.
Tanenhaus, M.K., & Lucas, M.M. (1987). Context effects in lexical processing. Cognition, 25, 213-234.
Twilley, L.C., Dixon, P., Taylor, D., & Clark, K. (1994). University of Alberta norms of relative meaning frequency for 566 homographs. Memory and Cognition, 22, 111-126.
Van Petten, C., & Kutas, M. (1987). Ambiguous words in context: An event-related potential analysis of the time course of meaning activation. Journal of Memory and Language, 26, 188-208.
Witzel, N., Witzel, J., & Forster, K.I. (2012). Comparisons of online reading paradigms: Eye tracking, moving-window, and maze. Journal of Psycholinguistic Research, 41, 105-128.
Wotschack, C., & Kliegl, R. (in press). Reading strategy modulates parafoveal-on-foveal effects in sentence reading. Quarterly Journal of Experimental Psychology.

Appendix A. Items in Experiments 1, 2, and 5. (The italicized words were added for Experiments 2 and 5.)

(1) The umpire/nurse tried to swallow the bat but its handle got stuck in his/her throat.
    The umpire/nurse tried to swallow the bat but its wings got stuck in his/her throat.
(2) The arrow/fence punished the bow because its aim was not straight enough.
    The arrow/fence punished the bow because its knot untied much too easily.
(3) The cloak/stove scolded the cape’s long cloth because it was not thick enough.
    The cloak/stove scolded the cape’s long beaches because they were not nice enough.
(4) The poker player/taxi driver blew up the cards because the dealer was cheating very badly.
    The poker player/taxi driver blew up the cards because this birthday mail arrived too late.
(5) The lawyer/pilot complimented the case to sway the jury several weeks ago.
    The lawyer/pilot complimented the case to smash its lock several weeks ago.
(6) The plaster/metal jumped on the cast and the leg broke under its weight.
    The plaster/metal jumped on the cast and the actors groaned under its weight.
(7) The biologist/runner crashed into the cell but its thick membrane did not break easily.
    The biologist/runner crashed into the cell but its steel bars did not break easily.
(8) The caveman/toddler researched the club and found its handle to be quite sturdy.
    The caveman/toddler researched the club and found its drinks to be quite cheap.
(9) The treasure/painting blessed the chest of gleaming gold coins that the pirate buried.
    The treasure/painting blessed the chest with deep bleeding cuts that the surgeon healed.
(10) The corporation CEO/secret agent boiled the board until it raised salaries for the workers.
    The corporation CEO/secret agent boiled the board until its wood split down the middle.
(11) The expensive stock/wine belittled the bond because the yield was not very high.
    The expensive stock/wine belittled the bond because the glue was not very strong.
(12) The vacation/dictionary avoided the trip because more travel would have been expensive.
    The vacation/dictionary avoided the trip because another injury would have been painful.
(13) The common/lonely cat sang that the dog was rare but it was ordinary to all the pigs.
    The common/lonely cat sang that the dog was rare but it was overcooked to all the pigs.
(14) The clock/chair ignored the tick when attending to the time very carefully each day.
    The clock/chair ignored the tick when attending to other bugs very carefully each day.
(15) The amplifier/flashlight attempted to bite the speaker but its plastic was just too hard.
    The amplifier/flashlight attempted to bite the speaker but his podium was too far away.
(16) The envelope/backpack lectured to the seal but the glue was not listening at all.
    The envelope/backpack lectured to the seal but the animal was not listening at all.
(17) The catcher/soldier wanted to melt the pitcher but his curveball was good so he didn’t.
    The catcher/soldier wanted to melt the pitcher but its glass was thick so he didn’t.
(18) The faucet/mirror disdained the pipe because its leaking water was very bothersome.
    The faucet/mirror disdained the pipe because its foul smoke was very bothersome.
(19) The piano/shelves forgot to bring the organ so the concert was canceled yesterday.
    The piano/shelves forgot to bring the organ so the transplant was canceled yesterday.
(20) The tennis ball/canned fruit praised the court because its surface was perfectly smooth.
    The tennis ball/canned fruit praised the court because its judgments were perfectly correct.
(21) The hammer/textbook criticized the nail and the wood it was in without any mercy.
    The hammer/textbook criticized the nail and the finger it was on without any mercy.
(22) The toy/desk recommended the model because its parts were easy to assemble.
    The toy/desk recommended the model because her face was so very beautiful.
(23) The judge/hiker jumped around the panel because its members were famous and powerful.
    The judge/hiker jumped around the panel because its wood was beautiful and strong.
(24) The bomb/shrub flirted with the fuse before it was ignited by terrorists yesterday.
    The bomb/shrub flirted with the fuse before it shorted out quite suddenly yesterday.
(25) The freckle/brick argued with the mole so the doctor removed it from the skin.
    The freckle/brick argued with the mole so the exterminator removed it from its hole.
(26) The equipment/staircase hated the new gear because the hiker preferred it when camping.
    The equipment/staircase hated the new gear because the car’s transmission was still awful.
(27) The past/wisdom enjoyed the present but thought the future would be nicer in the end.
    The past/wisdom enjoyed the present but thought its ribbon was very ugly in the end.
(28) The eye/frog forgot about the pupil so its dilation came as a wonderful surprise.
    The eye/frog forgot about the pupil so his graduation came as a wonderful surprise.
(29) The church/willow pacified the temple so its worshippers were allowed to return.
    The church/willow pacified the temple so the headache went away very quickly.
(30) The stone/sofa could not stand the rock in the pile sitting just next to it.
    The stone/sofa could not stand the rock by the musician playing just next to it.
(31) The winter/park reminisced about the fall that the hot summer ruined two years ago.
    The winter/park reminisced about the fall that the old lady suffered two years ago.
(32) The projector/magazine worried about the screen because the movie theater was too unsafe.
    The projector/magazine worried about the screen because the window frame was too rusty.
(33) The sergeant/secretary shredded the drill because the training had failed miserably.
    The sergeant/secretary shredded the drill because its batteries had failed instantly.
(34) The loan/police officer mocked my interest from my investment due to my small profit.
    The loan/police officer mocked my interest in my hobby because he thought it boring.
(35) The weight/wit tried to wrestle the mass but the object was simply too large.
    The weight/wit tried to wrestle the mass but the priest was simply too strong.
(36) The contract/mattress fretted about the term but the provision turned out to be OK.
    The contract/mattress fretted about the term but the semester turned out to be OK.
(37) The thunder/vacuum ran away from the bolt but the lightning eventually caught up to it.
    The thunder/vacuum ran away from the bolt but the screw eventually caught up to it.
(38) The coma/freezer avoided the stroke because illnesses like that frightened it.
    The coma/freezer avoided the stroke because swimming like that frightened it.
(39) The pulley/purse failed to identify the crane because its cables were old and broken.
    The pulley/purse failed to identify the crane because its feathers were red and blue.
(40) The waiter/swimmer yelled angrily at the tip from the customers but not for very long.
    The waiter/swimmer yelled angrily at the tip of his finger but not at its base.
(41) The steel/carpet worried about the iron because it was mined in a dangerous place.
    The steel/carpet worried about the iron because its long cord often got tangled up.
(42) The weeping/singing congratulated the tear because the drop was big and nicely formed.
    The weeping/singing congratulated the tear because the rip was big and nicely formed.
(43) The crazy/pretty woman celebrated the nuts for their insanity and their rowdiness.
    The crazy/pretty woman celebrated the nuts for their saltiness and their sweetness.
(44) The calendar/pencil disparaged its date because that day was uneventful and boring.
    The calendar/pencil disparaged its date because her dress was unattractive and boring.
(45) The goose called them strange/complex and odd but the children were normal in every way.
    The goose called them strange/complex and odd but the numbers were even in the end.
(46) The statue/camera adored the red marble on the cathedral walls that the priest cleaned.
    The statue/camera adored the red marble and the purple ones that the boys tossed.
(47) The ballet/notebook despised the tap and jazz dance instructor because she was clumsy.
    The ballet/notebook despised the tap and bottled water served at the nice restaurant.
(48) The joke/time tried to disregard the gag but the funny story could not be ignored.
    The joke/time tried to disregard the gag but the loud choking could not be ignored.

Appendix B. Relative meaning frequency (RMF) values for the ambiguous words.

Item #  Ambiguous Word  RMF value for the meaning   RMF value for the meaning
                        in congruent sentences      in incongruent sentences
1       bat             0.53                        0.34
2       bow             0.35                        0.26
3       cape            0.52                        0.41
4       card            0.42                        0.47
5       case            0.47                        0.38
6       cast            0.34                        0.27
7       cell            0.40                        0.48
8       club            0.23                        0.17
9       chest           0.27                        0.51
10      board           0.07                        0.70
11      bond            0.23                        0.30
12      trip            0.65                        0.30
13      rare            0.46                        0.36
14      tick            0.39                        0.47
15      speaker         0.31                        0.58
16      seal            0.40                        0.50
17      pitcher         0.48                        0.49
18      pipe            0.33                        0.51
19      organ           0.47                        0.50
20      court           0.34                        0.50
21      nail            0.58                        0.35
22      model           0.39                        0.56
23      panel           0.36                        0.51
24      fuse            0.22                        0.48
25      mole            0.27                        0.49
26      gear            0.31                        0.55
27      present         0.24                        0.53
28      pupil           0.33                        0.67
29      temple          0.62                        0.31
30      rock            0.26                        0.46
31      fall            0.36                        0.55
32      screen          0.61                        0.30
33      drill           0.26                        0.59
34      interest        0.38                        0.48
35      mass            0.36                        0.24
36      term            0.43                        0.23
37      bolt            0.24                        0.46
38      stroke          0.29                        0.61
39      crane           0.44                        0.33
40      tip             0.29                        0.46
41      iron            0.49                        0.47
42      tear            0.32                        0.61
43      nut             0.30                        0.48
44      date            0.40                        0.43
45      odd             0.51                        0.46
46      marble          0.51                        0.36
47      tap             0.51                        0.38
48      gag             0.27                        0.49
                        M = .38                     M = .44
                        SD = .12                    SD = .12

Appendix C. Co-occurrence correlations for the biasing and control words with each ambiguous word.

Item #  Ambiguous Word  Biasing Word  Biasing / Ambiguous  Control Word  Control / Ambiguous
                                      Correlation                        Correlation
1       bat             umpire        0.19                 nurse         -0.02
2       bow             arrow         0.34                 fence         -0.02
3       cape            cloak         0.19                 stove         -0.04
4       cards           poker         0.10                 taxi          -0.01
5       case            lawyer        0.14                 pilot         0.02
6       cast            plaster       0.13                 metal         0.02
7       cell            biologist     0.11                 runner        -0.01
8       club            caveman       0.13                 toddler       -0.01
9       chest           treasure      0.12                 painting      -0.02
10      board           CEO           0.13                 agent         -0.08
11      bond            stock         0.19                 wine          0.02
12      trip            vacation      0.38                 dictionary    -0.01
13      rare            common        0.25                 lonely        0.00
14      tick            clock         0.23                 chair         -0.05
15      speaker         amplifier     0.33                 flashlight    -0.06
16      seal            envelope      0.21                 backpack      -0.04
17      pitcher         catcher       0.39                 soldier       0.02
18      pipe            faucet        0.15                 mirror        0.00
19      organ           piano         0.36                 shelves       -0.02
20      court           tennis        0.10                 canned        -0.01
21      nail            hammer        0.57                 textbook      -0.01
22      model           toy           0.10                 desk          -0.01
23      panel           judge         0.16                 hiker         0.00
24      fuse            bomb          0.12                 shrub         -0.01
25      mole            freckle       0.15                 brick         0.00
26      gear            equipment     0.33                 staircase     -0.01
27      present         past          0.18                 wisdom        0.01
28      pupil           eye           0.16                 frog          -0.01
29      temple          church        0.19                 willow        -0.01
30      rock            stone         0.20                 sofa          0.00
31      fall            winter        0.17                 park          -0.05
32      screen          projector     0.31                 magazine      -0.03
33      drill           sergeant      0.27                 secretary     0.01
34      interest        loan          0.20                 police        -0.02
35      mass            weight        0.20                 wit           -0.02
36      term            contract      0.08                 mattress      -0.01
37      bolt            thunder       0.25                 vacuum        -0.12
38      stroke          coma          0.20                 freezer       -0.01
39      crane           pulley        0.11                 purse         -0.02
40      tip             waiter        0.10                 swimmer       0.00
41      iron            steel         0.36                 carpet        -0.04
42      tear            weeping       0.14                 singing       0.00
43      nuts            crazy         0.44                 pretty        0.01
44      date            calendar      0.17                 pencil        -0.03
45      odd             strange       0.68                 complex       0.01
46      marble          statue        0.26                 camera        -0.01
47      tap             ballet        0.19                 notebook      -0.01
48      gag             joke          0.20                 time          -0.02
                                      M = 0.22                           M = -0.02
                                      SD = 0.12                          SD = 0.03
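
As a rough illustration of what a co-occurrence correlation of the kind listed in Appendix C measures, the sketch below computes a Pearson correlation between toy co-occurrence vectors. Everything in it is an assumption made for illustration: the context words, the counts, and the use of raw counts with a plain Pearson correlation are invented here and are not the corpus, normalization, or model used to derive the values in this appendix.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("vectors must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dev_x = [a - mean_x for a in x]
    dev_y = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dev_x, dev_y))
    denominator = sqrt(sum(a * a for a in dev_x) * sum(b * b for b in dev_y))
    if denominator == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return numerator / denominator


# Hypothetical counts of how often each target word occurs near five
# invented context words: ball, pitch, wing, cave, hospital.
bat = [8, 5, 6, 4, 0]
umpire = [9, 7, 0, 0, 0]
nurse = [0, 0, 0, 0, 9]

biasing_r = pearson(bat, umpire)  # positive for these toy counts
control_r = pearson(bat, nurse)   # negative for these toy counts
print(f"bat/umpire: {biasing_r:.2f}   bat/nurse: {control_r:.2f}")
```

In this toy setup the biasing word shares contexts with one meaning of the ambiguous word and therefore correlates positively with it, whereas the control word does not. This mirrors the qualitative contrast between the two correlation columns above, not their actual magnitudes.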

Table 1. Mean response times (in milliseconds) over the component frames of grammatical and scrambled fillers (Experiment 1).

Grammatical Fillers    Scrambled Fillers
753                    787

Table 2. Mean response times (in milliseconds) for the ambiguous word, disambiguating word, and disambiguating region in experimental sentences (Experiment 1).

                                                             Biased Sentences              Control Sentences             Garden Path
                                                             Cong (a)  Incong (b)  Diff    Cong (c)  Incong (d)  Diff    Effect
ambiguous word (bat)                                         798       792         --      807       826         --      --
disambiguating word (handle / wings)                         780       805         25      797       801         4       21
disambiguating region (handle got stuck / wings got stuck)   758       768         10      794       783         -11     21

Table 3. Mean response times (in milliseconds) for the ambiguous word, disambiguating word, and disambiguating region in experimental sentences (Experiment 2).

                                                             Biased Sentences              Control Sentences             Garden Path
                                                             Cong (a)  Incong (b)  Diff    Cong (c)  Incong (d)  Diff    Effect
ambiguous word (bat)                                         898       890         --      962       958         --      --
disambiguating word (handle / wings)                         907       999         92      1012      960         -52     144
disambiguating region (handle got stuck / wings got stuck)   976       1029        53      1010      1003        -7      60

Table 4. Mean response times (in milliseconds) for the ambiguous word, disambiguating word, and disambiguating region in experimental sentences (Experiment 4).

                                                             Biased Sentences              Control Sentences             Garden Path
                                                             Cong (a)  Incong (b)  Diff    Cong (c)  Incong (d)  Diff    Effect
ambiguous word (bat)                                         981       971         --      955       942         --      --
disambiguating word (handle / wings)                         995       975         -20     1009      985         -24     4
disambiguating region (handle got stuck / wings got stuck)   1037      1010        -27     1033      1003        -30     3
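
The Diff and Garden Path Effect columns in Tables 2 through 4 appear to follow simple arithmetic: each Diff is the incongruent mean minus the congruent mean, and the garden-path effect is the biased Diff minus the control Diff, that is, (b - a) - (d - c). This reading is inferred from the reported values rather than stated in the tables, and the function and variable names below are hypothetical. A minimal sketch that recomputes the effects from the reported cell means:

```python
def garden_path_effect(biased_cong, biased_incong, control_cong, control_incong):
    """Return (biased Diff) minus (control Diff), in milliseconds."""
    biased_diff = biased_incong - biased_cong
    control_diff = control_incong - control_cong
    return biased_diff - control_diff


# Cell means copied from Tables 2-4, in the column order (a), (b), (c), (d).
CELL_MEANS = {
    ("Experiment 1", "disambiguating word"): (780, 805, 797, 801),
    ("Experiment 1", "disambiguating region"): (758, 768, 794, 783),
    ("Experiment 2", "disambiguating word"): (907, 999, 1012, 960),
    ("Experiment 2", "disambiguating region"): (976, 1029, 1010, 1003),
    ("Experiment 4", "disambiguating word"): (995, 975, 1009, 985),
    ("Experiment 4", "disambiguating region"): (1037, 1010, 1033, 1003),
}

EFFECTS = {key: garden_path_effect(*means) for key, means in CELL_MEANS.items()}

for (experiment, region), effect in EFFECTS.items():
    print(f"{experiment}, {region}: {effect} ms")
```

For example, the disambiguating word in Experiment 2 gives (999 - 907) - (960 - 1012) = 92 - (-52) = 144 ms, matching the value in Table 3, whereas the corresponding Experiment 4 value is only 4 ms.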

Table 5. Mean reading times (in milliseconds) and regression and skipping rates (as proportions) for the regions of interest in experimental sentences (Experiment 5).

Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

Region 1

Region 2

Region 3

Region 4

Region 5

the bat

but its

handle

got stuck

in his throat.

the bat

but its

wings

got stuck

in his throat.

the bat

but its

handle

got stuck

in her throat.

the bat

but its

wings

got stuck

in her throat.

1st fixation duration Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

230 237 239 241

235 239 247 235

236 246 240 247

241 242 250 241

gaze duration Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

377 369 378 372

370 378 375 371

280 295 283 294

351 353 373 352

451 469 467 445

go-past time Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

433 438 437 435

431 441 446 474

341 340 331 353

399 423 443 412

469 482 464 459

regression rate Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

.10 .13 .11 .12

.11 .10 .12 .16

.17 .12 .14 .15

.11 .13 .13 .11

skipping rate Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

.06 .06 .03 .06

.05 .05 .06 .06

.13 .12 .10 .10

.06 .04 .06 .04

.02 .03 .03 .03

total RT Biased-Congruent Biased-Incongruent Control-Congruent Control-Incongruent

498 486 499 495

480 485 495 498

327 365 340 357

453 466 488 467

544 575 550 527

The umpire tried to swallow The umpire tried to swallow The nurse tried to swallow The nurse tried to swallow

Table 6. Results of the linear mixed-effects models for the regions of interest (Experiment 5). Region 1

Region 2

Region 3

Region 4

Region 5

ambiguous word

ambiguous word + 1

disambiguating word

disambiguating word + 1

disambiguating word + 2

t = 1.66 p = .097

t = 1.96 p = .050 t = 2.15 p = .032 t = 2.02 p = .043

t