International Journal of Mass Emergencies and Disasters, November 1993, Vol. 11, No. 3, pp. 437-452.


Methodological Issues in Studying Response to the

Browning Prediction of a New Madrid Earthquake:

A Researcher's Cautionary Tale*

Christopher G. Wetzel, Edward Hettinger, Robert McMillan,

Monroe Rayburn, and Andrew Nix

Rhodes College

Memphis, TN 38112

We examined five methodological issues which could contaminate research on people's reactions to the Browning quake prediction: sample biases, self-selection artifacts, historical event artifacts, self-report inconsistency over time, and reactive testing effects. We found some evidence for self-selection biases in mail survey return rates, especially during the one-week period before the predicted quake. People who were threatened by the prediction were less likely to complete these surveys. There was a self-selection artifact associated with the TV movie "The Big One" and with the "Unsolved Mysteries" episode on Browning. These two shows attracted people who were already concerned about quakes and believed the prediction. After the prediction was disconfirmed, a control group of participants showed large declines in the perceived likelihood of future quakes, suggesting that either some historical artifact or simple knowledge that a scientist's prediction was wrong caused a disillusionment or "cry wolf" effect. People who lived near the New Madrid fault also manifested this decline, but it was significantly smaller than the one for the control group. Consistency of self-reports about preparation for the predicted earthquake and the consistency of self-reports about watching the "Big One" and the "Unsolved Mysteries" episode were very high. We found no pretest sensitization effects, but we did uncover an unusual reactivity effect for surveys given after the failed prediction. We compared people surveyed immediately after the prediction ...

* We want to thank Marsha Walton for her comments on earlier versions of this manuscript, the members of the Rhodes College community for their cooperation in completing our surveys, Arch Johnston and Jill Stevens for their assistance in designing the survey, Mary Vaines for her help in collecting the Corning sample, Sara Hodges and Tim Wilson for their help in collecting data at the University of Virginia, and David England and David Levenbach for letting us survey their Arkansas State students. The first author received a Rhodes College faculty development grant to conduct this research.


the damage zone of any New Madrid quake and unlikely to have taken protective action.

The third issue involved the consistency of self-reports about what actions were taken for the predicted quake. People's memories decay over time, as do motivations to misrepresent their behaviors. Although we could not check their self-reported behaviors against the physical evidence (e.g., did people who said they strapped in their water heater actually have it strapped in), we did determine whether people's self-reports about what they did to get ready for the predicted quake remained consistent over the six-week period immediately following the failed prediction.

When researchers survey participants twice, the first survey might influence responses to the second one, perhaps by sensitizing the participants to what is expected of them on the second survey or by making them react differently to events which happen between the pre- and post-surveys (Lana 1969). Some people may have given the Browning prediction little thought until they had to complete a pre-prediction survey asking for their reactions. By comparing post-prediction responses of people who did and did not fill out earlier surveys, we can assess the reactivity of completing a survey.
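The reactivity check just described is, in essence, a between-groups comparison on the post-prediction measures. A minimal sketch of such a comparison follows; this is not the authors' code, and the ratings are invented for illustration.

```python
# Sketch of a pretest-sensitization (reactivity) check: compare post-prediction
# responses of participants who did vs. did not complete a pre-prediction
# survey. All numbers below are illustrative, not the study's data.
from scipy.stats import ttest_ind

# Hypothetical 7-point belief-in-future-quake ratings from the post survey
pretested = [4, 3, 5, 4, 2, 4, 3, 5, 4, 3]      # completed a pre-survey
not_pretested = [4, 4, 3, 5, 3, 4, 2, 4, 5, 3]  # first surveyed after Dec. 3

t_stat, p_value = ttest_ind(pretested, not_pretested)
# A non-significant difference is consistent with no pretest sensitization.
print(f"t = {t_stat:.2f}, p = {p_value:.3f}")
```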

Method

Participants

The main sample consisted of approximately 442 students, faculty, and staff of Rhodes College, a liberal arts college located in Memphis, around 30 miles southeast of the southern portion of the New Madrid fault. Participants completed the surveys anonymously, using only the last four digits of their social security number as an identification code. We also had a control group of 215 students taking introductory social psychology classes at the University of Virginia. For some analyses, we had additional data gathered from 218 students at Memphis State and Arkansas State Universities, 48 adults from Memphis, and 70 adults who lived nearer the fault in Corning, Trumann, or Jonesboro, Arkansas. Because these additional samples were not designed to be representative of their corresponding populations and were not randomly assigned to survey conditions (as were the Rhodes participants), their data were employed only when it was not critical to assume that different waves of survey dates were initially equivalent or when sample representativeness was not critical.


they did not know the survey topic when they agreed to participate, and students rarely refuse to do an experimental task once they commit). The return rates for the remote-pre, immediate-pre, immediate-post, and remote-post survey dates were 85.0, 64.6, 76.4, and 78.0%. Because the two post-prediction rates were nearly identical and were based on smaller numbers than the other rates, they were combined into a single post-prediction rate. Tests of proportions showed that the immediate-pre rate was significantly lower than the remote-pre rate, Z = 3.31, p < .001, and the post-prediction rate, Z = 1.97, p < .05. Thus, people were less willing to complete the survey one week before the predicted quake than they were a month before the dreaded date or one to seven weeks after the failed prediction. Faculty and staff response rates showed similar declines during this immediate-pre survey period.

There are a number of possible causes for the reluctance to complete the survey during the immediate-pre period. Perhaps some event (unrelated to the quake) distracted participants. However, the authors are unaware of any unusual events or duties which occurred during that period and which could lower both faculty and staff response simultaneously. The lower return rate may reflect a "defensive avoidance" response. As the quake date loomed near, people may not have wanted to think about the quake because it would have heightened their anxiety. If so, our sample would be biased in the skeptical direction because the believers in the prediction would be most anxious and hence under-represented. A third possibility is that those skeptical about the prediction became so disgusted with the hype and hysteria associated with the prediction that they wanted nothing to do with the survey. This would produce a sample with believers over-represented. We have anecdotal evidence for both types of reactions to the survey, but we have no way of determining which type of response was dominant. However, given that both reactions occurred (possibly canceling each other), and given the high response rate even in the immediate-pre period, the amount of bias in our sample is probably quite small. Whether it would be larger in other studies or with phone interview collection methods is an important issue for researchers to consider.

A second form of sample bias can be detected when examining pre-post change scores. Of the 374 people we planned to survey twice, 275 (73.5%) completed both surveys. We wondered whether those who responded only once held different views about the quake than did those who responded twice (two-timers). We compared the pre-prediction responses of two-timers with the pre-prediction responses of the 56 people who responded to our pre-survey but not to the post-survey. This analysis yielded no significant differences between dropouts and two-timers.
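For concreteness, the kind of two-proportion test used above can be sketched as follows. This is not the authors' code; only the return rates are taken from the text, and the per-wave sample size of roughly 110 is an assumption made for illustration, since the exact ns are not reported in this passage.

```python
# A hand-rolled two-sided z test for the difference between two proportions,
# applied to the remote-pre (85.0%) and immediate-pre (64.6%) return rates.
# The wave size of 110 is an assumption for illustration.
from math import erf, sqrt

def proportions_ztest(success1: int, n1: int, success2: int, n2: int):
    """Two-sided z test for p1 - p2 using the pooled standard error."""
    p1, p2 = success1 / n1, success2 / n2
    pooled = (success1 + success2) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_two_sided = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))  # normal tails
    return z, p_two_sided

z, p = proportions_ztest(round(0.850 * 110), 110, round(0.646 * 110), 110)
print(f"z = {z:.2f}, p = {p:.4f}")
```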


We correlated media viewing reports on the post-prediction surveys with whether participants had checked on the remote-pre survey that the prediction would come true. Since the remote-pre survey was completed prior to the TV program "The Big One" and the "Unsolved Mysteries" segment on Browning, there would be no significant chi-square unless a self-selection factor is operating. In fact, the chi-squares were significant for "The Big One," χ²(1) = 5.23, p < .022, and for "Unsolved Mysteries," χ²(1) = 5.06, p < .024. Believers in the prediction were more likely to have watched these shows than were skeptics, 37.2% vs. 20.8% for "The Big One," and 16.3% vs. 5.8% for "Unsolved Mysteries." Another way to view these relationships is to examine inverse conditional probabilities: 41.2% of the eventual viewers of "Unsolved Mysteries" checked that the prediction would come true weeks before the segment aired, but only 18.1% of non-viewers had thought the prediction would come true. For "The Big One," 30.8% of the viewers versus 16.5% of the non-viewers had thought that the prediction would come true.

The second way to assess self-selection is to divide the post-surveys into viewers versus non-viewers of each media event and to compare their remote-pre survey responses (completed before the programs were aired). Both programs had similar self-selection biases. When compared to non-viewers and expressed as a point-biserial correlation (df = 181), "Unsolved Mysteries" and "The Big One" viewers worried more about quakes, r = .253 and .174, disagreed more with the statement that earthquake risk was exaggerated, r = -.218 and -.222, and believed that defensive actions (such as leaving town, skipping work, or withdrawing extra cash) were more wise, r = .131 and .163. Viewers of "Unsolved Mysteries" believed that the chances of a quake occurring in the next 10 years were higher than did non-viewers, r = .156, and viewers of "The Big One" believed that the chances of a quake occurring in early December were higher than did the non-viewers, r = .144. In addition, viewers of "The Big One" talked more with peers about quakes (r = .151) and estimated that a larger percentage of their peers believed in the prediction, r = .205, than did the non-viewers, all before they even watched the program.

There are a variety of possible causes for this artifact. A cognitive dissonance interpretation would be that the viewers watched the programs (which were advertised as supporting the prediction) as a means of validating or verifying their beliefs. Another possibility is that viewers were people who watched a lot of television and hence were more likely to catch these two shows. Since high-volume television viewers differ in a variety of ways from infrequent viewers (Bower 1985; Conway and Rubin 1991), these differences may produce the effects we uncovered on the remote-pre ...
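Both self-selection analyses above follow standard recipes. A minimal sketch appears below; the counts and ratings are invented, chosen only to mimic the reported pattern, and this is not the study's raw data.

```python
# Sketch of the two self-selection analyses: a chi-square test relating prior
# belief in the prediction to later viewing, and a point-biserial correlation
# between viewing (0/1) and a remote-pre rating. Counts/ratings are invented.
from scipy.stats import chi2_contingency, pointbiserialr

# 2x2 table: rows = checked "prediction will come true" on remote-pre survey,
# columns = later watched "The Big One" (viewer, non-viewer)
table = [[16, 27],    # believers
         [29, 111]]   # skeptics
chi2, p, dof, _ = chi2_contingency(table, correction=False)
print(f"chi2({dof}) = {chi2:.2f}, p = {p:.3f}")

# Point-biserial r between viewing and a 7-point quake-worry rating
viewed = [1, 1, 0, 0, 1, 0, 0, 1, 0, 0]
worry = [6, 5, 3, 4, 7, 2, 3, 5, 4, 3]
r, p = pointbiserialr(viewed, worry)
print(f"r = {r:.3f}, p = {p:.3f}")
```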


Historical Artifacts

In order to detect the effects of any historical events on reactions to the prediction, we surveyed a student control group: students at the University of Virginia who were located outside the damage area projected for a New Madrid quake and were unlikely to be exposed to as much of the media circus surrounding the prediction. They were surveyed on their beliefs about New Madrid quakes and quakes in general during the same periods as the Memphis students. The first question we examined was whether the control group responses were stable over time. If not, we then wanted to know whether the control group changes were comparable to those for the New Madrid area respondents. We tested whether responses were constant across the two pre-survey dates and across the two post-survey dates (both between-groups analyses), and whether responses were constant within the pre-to-post-survey dates (within-subjects, repeated-measures analysis).

Across the two pre-survey and the two post-survey dates there were no significant differences, indicating that responses were stable from late October to late November, and from the second week of December to the last week of January. However, there were significant changes during the pre-to-post prediction period. Belief in the chances of a New Madrid quake during the next ten years (assessed on a 7-point scale) declined from M = 4.76 at the pre-survey to M = 4.08 on the immediate-post survey, F(1,52) = 25.5, p < .001. The estimated probability of a quake within the next ten years also declined from a pre-survey value of 7.55 (equivalent to a .35 probability) to an immediate-post value of 6.14 (equivalent to a .21 probability), F(1,52) = 7.56, p < .001. Finally, the percentage of peers believing in the prediction was estimated to be 43.9% before December third, but it was retrospectively re-estimated on the immediate-post survey (3-7 days later) to be only 29.4%, F = 15.80, p < .001. The actual percentage of the control group sample who believed the prediction before December third was 36%, and, after December third, their retrospective self-reported pre-prediction belief rate was only 22%.

Thus, between the pre-survey dates and the immediate-post survey date 1-5 weeks later, belief in the long-term chances of a New Madrid quake declined, as did the belief that others and the self had originally expected the prediction to come true. The exact cause of these declines is difficult to determine. Specifying what non-quake-related events could cause such specific effects in such a short time is beyond the creativity of the authors. We suspect that the knowledge of the failed prediction (which did receive national press) accounts for the declines through a disillusionment or "cry wolf" effect. The students might think that since a scientist's prediction of ...
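The pre-to-post comparisons above are two-level repeated-measures tests. Since a within-subjects F with one degree of freedom in the numerator equals the square of the paired-samples t, the analysis can be sketched with a paired t test. The ratings below are invented, not the Virginia control data.

```python
# Sketch of the within-subjects pre-to-post comparison on a 7-point scale.
# For a two-level repeated-measures factor, F(1, n-1) = t**2.
from scipy.stats import ttest_rel

pre = [5, 6, 4, 5, 5, 6, 4, 5, 4, 6]   # pre-prediction ratings (illustrative)
post = [4, 5, 4, 4, 3, 5, 4, 4, 3, 5]  # immediate-post ratings (illustrative)

t_stat, p_value = ttest_rel(pre, post)
f_stat = t_stat ** 2  # the equivalent repeated-measures F(1, 9)
print(f"t(9) = {t_stat:.2f}, F(1,9) = {f_stat:.2f}, p = {p_value:.3f}")
```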


Although consistency was fairly high, it is important to determine which of two types of inconsistencies is most prevalent: initially reporting having watched a program and then denying later that one had seen it (false negatives) versus initially reporting having not seen a program and then later claiming to have viewed it (false positives). In classifying the self-report inconsistencies this way, we are assuming that the initial report is most accurate because it occurred closest in time to the actual event. The two types of errors have different consequences for researchers who are trying to estimate the actual viewing rate of media events, and they have different implications for how people reacted to the prediction. Researchers who ignore false positives will over-estimate media viewing rates, and those who ignore false negatives will under-estimate viewing rates. False negatives would occur more often than false positives if people were motivated, after the prediction failed, to appear as having been unconcerned about the prediction. If people heard about the media events because they were discussed by associates, or because they talked with others about the media hype concerning the prediction, they may confuse what they heard about with what they actually saw. Thus, false positives would occur more often than false negatives, as an "honest mistake" as opposed to the need to downplay one's concern about the prediction.

In order to determine whether there was a stronger tendency on the second survey to falsely report viewing a media event than there was to deny seeing one, we summed each type of error on the media questions, with false positives being coded a +1 and false negatives a -1. We omitted the newspaper story item because stories about quakes and Browning did appear between the dates we surveyed. Tests of the grand mean for the Rhodes sample sum yielded a significant positive recall bias, t(141) = 3.05, p < .006. Thus false positives were more common than false negatives on the second survey. People were more likely to first report not viewing the shows and later to report seeing them than they were to first report viewing and later to deny watching them.

Given this recall bias, we examined whether the degree of bias varied according to the amount of time between survey waves and the subsample composition. A 3 by 3 ANOVA (Rhodes students, faculty, and staff by three different combinations of survey waves) demonstrated that the three survey wave groups varied in the degree of bias, F(2,133) = 4.90, p < .009 (see Table 1), and the three sample groups marginally differed in their degree of bias, F(2,133) = 2.96, p < .055. The degrees of recall bias for the Rhodes staff, students, and faculty were .32, .17, and .00, with only the faculty's mean being non-significantly different from zero. These means indicate, for example, that the staff was 32% more likely to commit a false ...
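The +1/-1 scoring scheme above, and the test of the grand mean against zero, can be sketched as follows. The viewing reports are invented; this is not the authors' code.

```python
# Sketch of the recall-bias score: +1 per false positive (denied viewing on
# the first survey, claimed it on the second), -1 per false negative, summed
# over a person's media items, then tested against a mean of zero.
from scipy.stats import ttest_1samp

def recall_bias_score(first: list, second: list) -> int:
    """Signed count of inconsistencies (1 = reported viewing, 0 = did not)."""
    score = 0
    for t1, t2 in zip(first, second):
        if t1 == 0 and t2 == 1:
            score += 1   # false positive: later claimed viewing
        elif t1 == 1 and t2 == 0:
            score -= 1   # false negative: later denied viewing
    return score

# One row per person: the same media items answered on the two surveys
wave1 = [[0, 0, 1], [0, 1, 1], [0, 0, 0], [1, 0, 1], [0, 0, 1]]
wave2 = [[1, 0, 1], [0, 1, 1], [1, 0, 0], [1, 0, 1], [0, 1, 1]]

scores = [recall_bias_score(a, b) for a, b in zip(wave1, wave2)]
t_stat, p_value = ttest_1samp(scores, 0)  # positive t => false positives dominate
print(f"mean = {sum(scores) / len(scores):+.2f}, t = {t_stat:.2f}, p = {p_value:.3f}")
```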


between the second and the first survey does increase recall bias. This suggests that memory decay can play a role, provided there is no overt response immediately after the failed prediction (as there was in the immediate-post, remote-post group) to terminate the distortion process. Thus, it appears as if the prediction's disconfirmation initially created the positive reporting bias, which then may have been exacerbated by memory decay.

This interpretation suggests that people tended to over-report viewing media events when they were asked after the prediction was disconfirmed. If this reasoning is correct, then the immediate-post, remote-post participants' responses to media viewing questions on their first post-survey should be higher than the two immediate-pre groups' pre-survey viewing rates, even though the two surveys were administered only a week apart. Random assignment to these three groups should have made their media viewing habits initially equivalent, but then the recall bias would have pulled the immediate-post rate up, whereas the two immediate-pre survey rates would not yet manifest the bias. The immediate-post rate was indeed higher than the two immediate-pre rates for "The Big One," TV news specials, and the Rhodes letter, but only the Rhodes letter difference was significant, t(139) = 2.07, p < .040.

Given our findings, researchers who rely on retrospective self-reports of media viewing after the failed prediction may over-estimate the extent to which the media events were actually viewed. Furthermore, since it seems that the falsification of the prediction produced this bias, researchers who make viewer/non-viewer comparisons may inadvertently attribute viewing effects on their dependent variables to the media programs themselves instead of to the effect of the failed prediction on memory error.

For the immediate-post, remote-post group we also checked the consistency of reports about earthquake preparation activities. Participants checked which of 13 activities (such as purchasing insurance or planning escape routes from buildings) they had done in preparation for the predicted quake, and they reported in what ways they altered their normal routine on or around December third. Consistency of self-reports was above 87% for all behaviors except "reading a safety pamphlet," which had a consistency rate of 70%. There was no significant false positive recall bias on the summed preparation activities, indicating little tendency to distort memories about quake preparation, at least from the period immediately after the prediction failed until late January.
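The consistency rates reported above are simple match proportions across the two post-prediction surveys. A minimal sketch with invented checklist answers (not the study's data):

```python
# Sketch of a per-item consistency rate: the share of respondents who gave
# the same yes/no answer about a preparation activity on both post surveys.

def consistency_rate(first: list, second: list) -> float:
    """Proportion of respondents whose answers match across the two waves."""
    matches = sum(a == b for a, b in zip(first, second))
    return matches / len(first)

# 1 = reported doing the activity, 0 = did not; one entry per respondent
strapped_heater_w1 = [0, 1, 0, 0, 1, 0, 1, 0, 0, 0]
strapped_heater_w2 = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]

rate = consistency_rate(strapped_heater_w1, strapped_heater_w2)
print(f"consistency = {rate:.0%}")  # 90% for this illustrative item
```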


prediction before the show was even aired. This finding would lead to the conclusion that the program actually decreased belief in the Browning prediction, counteracting the pre-existing differences.

Researchers who find pre-post change in New Madrid residents' quake beliefs might be tempted to attribute the change to actions they took to prepare for the quake. In fact, we found that New Madrid fault area participants changed their beliefs less than did the control group participants, suggesting that their preparatory efforts may have inhibited belief change about future quakes. Finally, long-term follow-ups which survey people more than once after the failed prediction may give the false impression that people are less concerned about quakes over time. Our data demonstrated that it was the earlier survey itself which induced the decline on the latter one.

Our findings raise the question of how generalizable our artifacts would be across different populations, types of questions or dependent variables, and survey methods (phone, face-to-face, etc.). Across our different subsamples (including community citizens, college students, and Ph.D.'s), we found similar artifacts, as indicated by few subsample-by-artifact interactions. Thus, the artifacts are not restricted to a particular population. Given that the artifacts were present across survey questions with different formats and contents, we do not think they are restricted to the particular type or style of questions we asked.

The good news is that these artifacts were not very large in terms of their effect size; they did not impact large numbers of dependent variables; and some of them do not apply to some research situations, such as studies surveying people once. The extent to which researchers should worry about these artifacts depends on the effect sizes of their research findings (if their effects are extremely large, the artifacts we discovered are unlikely to have caused them), their needs for precise parameter estimates, and their preference for simple interpretations of their data.

The best news is that our research revealed some interesting processes which people use to cope with a potential disaster. These processes suggest that people actively protect themselves against threats (such as by declining to complete an immediate-pre survey), seek information which supports their beliefs (the media self-selection effect), and continue to think about the implications of an earthquake prediction long after it fails (the post-prediction survey sensitization effect). Designing a study and conducting the analyses that allow for detection of methodological artifacts may require substantial time and energy, but the rewards are the discovery of subtle effects which give us richer interpretations of the data.