Measuring alcohol consumption and alcohol-related problems

Addiction (2001) 96, 459–471

RESEARCH REPORT

Measuring alcohol consumption and alcohol-related problems: comparison of responses from self-administered questionnaires and telephone interviews

LUDWIG KRAUS & RITA AUGUSTIN

IFT Institute for Therapy Research, Munich, Germany

Abstract

Aims. Compared with surveys using self-administered questionnaires, telephone interviews generally yield higher coverage rates, have a lower proportion of missing values and result in fewer inconsistencies. Meta-analyses, however, show that responses to sensitive questions by telephone tend to be biased by social expectations. The aim of the study is to examine whether responses on alcohol consumption and alcohol-related problems differ with respect to mode of administration (self-administered vs. telephone).

Design and participants. Data were analysed from the 1995 self-administered survey among 6427 subjects and from telephone surveys conducted annually between 1994 and 1996 yielding a pooled sample of 6193 subjects.

Measurements. Alcohol consumption within the last 30 days was measured using a beverage-specific quantity–frequency index. For a summary measure responses were converted into pure alcohol (ethanol) per day and categorized into no alcohol consumption (0 g), non-hazardous consumption (≤ 20 g for females and ≤ 40 g for males) and hazardous consumption (> 20 g for females and > 40 g for males). Alcohol-related problems were assessed using the CAGE questionnaire with a cut-off point of at least two positive responses.

Findings. Using (cumulative) logistic regression, a significant mode effect was found for both alcohol consumption and alcohol-related problems. Lower beverage-specific prevalences in the telephone mode were found to be responsible for the difference in the distribution of the summary consumption measure.

Conclusions. Results indicate that patterns of drinking and alcohol-related problems are more easily reported in self-administered questionnaires than in telephone interviews.

Correspondence to: Dr Ludwig Kraus, IFT Institut für Therapieforschung, Arbeitsgruppe Soziale Epidemiologie, Parzivalstr. 25, D-80804 München, Germany. Tel: +49 (0) 89 360804 30; fax: +49 (0) 89 360804 49; e-mail: [email protected]
Submitted 5th January 2000; initial review completed 12th May 2000; final version accepted 6th September 2000.
ISSN 0965–2140 print/ISSN 1360–0443 online/01/030459–13 © Carfax Publishing, Taylor & Francis Limited DOI: 10.1080/0965214002005428
Society for the Study of Addiction to Alcohol and Other Drugs

Introduction

The German National Survey on Psychoactive Substances (NSPS), commissioned by the German Federal Ministry of Health and conducted for the fifth time since 1980, has always used self-administered questionnaires for research on consumption patterns and addiction (Kraus, Bauernfeind & Bühringer, 1998; Kraus & Bauernfeind, 1998). In other fields telephone survey methodology using the advantages of modern computer-assisted interviewing laboratories has already become an attractive alternative to both mail and face-to-face inquiry (Frey, 1989). Compared to other survey modes, telephone interviews have shorter field times, are less expensive and allow for more efficient data collection. Given comparable data quality, telephone interview methods are preferable for measuring drug and alcohol consumption in national or local samples of the general population.

Most previous research on mode differences has compared results from telephone questionnaires with those conducted face to face. Investigations into differences between mail and either telephone or face-to-face interview mode are far fewer (De Leeuw, 1992). The likelihood of differences between the two interview modes seems smaller than between mail and the two interview methods, as the latter rely on aural communication as opposed to the visual information of mail questionnaires. Comparisons of response outcome appear to corroborate this assumption. A meta-analysis of studies on differences in data quality concerning health-related behaviour (including alcohol consumption) from mail, telephone and personal interviews found that comparable conclusions can be drawn from well-conducted personal and telephone interviews (De Leeuw, 1992; De Leeuw & Collins, 1997). The most consistent finding in studies on mode effects on alcohol- and drug-related behaviour comparing both interview methods is the lack of differences (Mangione, Hingson & Barrett, 1982; Sykes & Collins, 1998; Groves, 1989; Johnson, Hougland & Clayton, 1989). On the other hand, comparisons of telephone interviews, personal interviews and self-administered answer sheets within personal interviews showed substantial differences in favour of self-administered questionnaires. Prevalence rates of alcohol and illicit drug use were highest in personal interviews with answer sheets, lower in face-to-face interviews and lowest in telephone surveys (Aquilino & LoSciuto, 1990; Aquilino, 1992, 1994; Gfroerer & Hughes, 1992).
The findings of recent studies on self-reported alcohol and drug use which compared mail to face-to-face surveys are not always consistent with the above conclusion. For example, a Swiss study, testing the effects of personal interview vs. postal questionnaire in a between-subjects design, found higher reported alcohol consumption in personal interviews (Rehm & Spuhler, 1993). Although the two methods in that study differed with regard to sampling technique, another study based on a within-subjects design yielded similar results (Rehm, 1994). Since no sampling differences were involved in the latter study, differences of sampling techniques as a possible explanation for differences in the between-subjects design study could be ruled out, and thus differences could be attributed to assessment mode (Rehm & Arminger, 1996). A recent Dutch study also compared mail questionnaires and personal interviewing on self-reported alcohol use and problem drinking. In the between-subjects design, the sampling frame was the same for the two assessment modes. No notable mode differences between alcohol measures were reported (Bongers & Van Oers, 1998).

Relatively little research has been conducted on mode comparisons between mail and telephone surveys. A recent study using data from the Swiss Health Survey, in which respondents were first interviewed by telephone and later followed up by a self-administered mail questionnaire, reported significantly more drinkers, more heavy drinkers and higher volume of drinking through self-administered questionnaires (Gmel, 2000). This result supports the hypothesis that an administration mode which guarantees more privacy in responding to sensitive questions performs better. Since mail surveys offer greater anonymity, more privacy and confidentiality than both interview modes, it is hypothesized that mail survey self-reports on sensitive topics show less social desirability bias (Hochstim, 1967; Sudman & Bradburn, 1974; Groves, 1990; Schwarz et al., 1991; De Leeuw, 1992). Often the presence of interviewers leads to a reluctance to reveal characteristics believed to be socially negative (De Leeuw & Van der Zouwen, 1988; Aquilino, 1994). Groves (1990) pointed out that socially desirable tendencies in interview situations are a function of confidentiality rather than of social distance. In telephone interviews, however, the credibility of the researchers to guarantee confidentiality is more difficult to establish.
The lack of theoretical concepts apart from the social desirability hypothesis as well as the focus on response differences alone has been criticized by Dillman and colleagues (1996). They point out three major differences between mail and telephone mode: first, presence or absence of interviewer; secondly, dependence on visual or aural communication; and thirdly, interviewer or respondent control of pace and information sequence. Influencing mechanisms may be the consciousness of social norms, the context of responding (sequential vs. simultaneous availability of information), time pressure, memory limitations and cognitive processing. With the exception of research on social desirability, evidence of the existence of consistent and predictable differences in responses to mail and telephone surveys is rather sparse (Dillman & Tarnai, 1991; Tarnai & Dillman, 1992; Dillman et al., 1996).

Much more evidence is provided concerning mode differences such as response validity, item non-response and similarity of responses (De Leeuw, 1992). Mode differences in non-response rates are well documented with the lowest rates in personal interviews, higher rates in telephone surveys and the highest in mail surveys (De Leeuw, 1992; Hox & De Leeuw, 1994). Some studies have found item non-response to be slightly higher in telephone than in face-to-face surveys (Groves & Kahn, 1979). Others have reported no differences between personal and telephone interviews with respect to missing data, but found both modes to be superior to mail or self-administered surveys (Hochstim, 1967; Dillman, 1978). Another crucial issue examined was access to households due to different sampling frames. The exclusion of households without telephones in telephone surveys may be a potential bias. Aquilino (1992), for instance, found higher rates of alcohol, drug and tobacco users among respondents without telephones.

The literature on mode effects regarding health-related questions is consistent for personal and telephone interviews. Conflicting results exist when mail surveys are compared with face-to-face interviews. According to the social desirability hypothesis, one should expect that the greatest pressure occurs in face-to-face surveys and the least in mail surveys.
With outcomes on alcohol use from telephone interviews being quite similar to those from personal interviews, one would predict higher rates of alcohol use and alcohol-related problems in mail surveys. The aim of the present study is to examine differences in the effects of mail and telephone administration of a national survey of alcohol use and alcohol-related problems in the German general population.

Methods

Samples

The data for the present analysis come from four cross-sectional national surveys of the German general population conducted between 1994 and 1996. The survey samples were designed to represent the German-speaking population aged 18–59 years living in private households. The telephone samples differed from the mail sample by representing only households with phone service.

The self-administered questionnaire (SAQ) was delivered by an interviewer and collected later or was sent back by the respondent by mail. Sampling of the SAQ survey was based on a multi-stage probability sampling design (Kraus et al., 1998; Kraus & Bauernfeind, 1998). In the first stage constituencies were stratified according to region and selected at random. Within selected constituencies a random sample was drawn based on a random route procedure: a point on the city map was chosen at random. Starting from this point every third household was selected. In the last stage the respondent chosen in each household was the individual with the most recent birthday. The overall response rate was 65%, resulting in 7833 respondents.

The telephone sample was also derived through a multi-stage probability sampling design. In the first stage communities were stratified according to region and selected at random. Within selected communities a random sample of telephone numbers proportional to population size was drawn from telephone books, where 85–90% of telephone numbers are listed (Marhenke, 1997). In the last stage the respondent interviewed in each household was the individual with the most recent birthday. Each telephone sample comprises n = 2500 respondents. Response rates decreased slightly from 76% in 1994 to 73% and 69% in 1995 and 1996, respectively.

After pooling of data, samples for both modes were almost the same size. The basic assumption was that the 2-year time-span over which the surveys were conducted was not long enough to allow major changes in consumption prevalence and alcohol-related behaviour to occur.
In this sense, the three telephone surveys could be regarded as constituting one larger survey occurring over the period of 1994–96. With the mail survey conducted in 1995, data collection fell exactly in the middle of the reference period of the three telephone surveys. Nevertheless, the three datasets were compared for similarity on frequency of responses to alcohol questions such as life-time use, consumption in the past 12 months, beverage-specific frequency and quantity and the CAGE questionnaire. With the exception of the CAGE indicator which remained nearly constant, all outcome measures decreased over time. After measures were controlled for age and gender, this tendency was still found. This systematic effect may be due to increasing non-response rates in the observation period. The pooled dataset was therefore considered an average with 1995 in the middle of the reference period, and no major bias with regard to the relevant outcome measures was expected.

A possible source of bias could be an increase from 75% to over 90% in the availability of telephone service in Eastern Germany between 1994 and 1996, 4–6 years after the reunification. Although procuring telephone service in Eastern Germany has not depended on income status (Statistisches Bundesamt, 1997), the present data analysis is based on the sample from Western Germany and Berlin only, where availability of telephone service has remained constant. This results in a final telephone sample of n = 6193 respondents and a final SAQ sample of n = 6427 respondents.

Measures

Alcohol consumption was measured by means of beverage-specific quantity–frequency questions. For each beverage (beer, wine/champagne, spirits) respondents were asked for quantity and frequency of consumption within the last 30 days. Consumption was measured separately for each beverage by multiplying the responses to the following two questions: (i) “During the last 30 days, on how many days did you drink beer (wine, spirits)?” (ii) “On average, on a day when you drank beer (wine, spirits), how many glasses of beer (wine, spirits) did you drink?” Responses to the last question were coded according to the most widely used units of consumption: for beer either 0.3 l or 0.5 l glasses or bottles, for wine 0.25 l glasses and for spirits either 0.02 l or 0.04 l glasses. While respondents in the mail survey could report the consumption of both small and big glasses of beer or spirits, respondents in the telephone survey had to decide on the size of the glasses.

Quantities of consumed beverages were converted into pure alcohol per day and summed up. For converting volume of consumed beer, wine and spirits into grams of ethanol the following average ethanol contents for 1 litre were used: 40 g, 92 g and 320 g ethanol, respectively. The alcohol measures in grams of ethanol were recoded into three drinking status groups: no alcohol consumption (0 g), non-hazardous consumption (≤ 20 g for females and ≤ 40 g for males) and hazardous consumption (> 20 g for females and > 40 g for males). The limits of 20 g and 40 g for females and males, respectively, are used frequently in epidemiological research (Saunders et al., 1993; Edwards et al., 1994; Gmel, 2000). Item non-response was used as a fourth category. This category also included inconsistent responses. Apart from the summary measure, past month prevalence of beer, wine and spirits as well as beverage-specific frequency, quantity per drinking day and natural logarithms of beverage-specific mean quantity per day were analysed.

Alcohol-related problems were measured by the following four items of the CAGE questionnaire (Mayfield, McLeod & Hall, 1974; Ewing, 1984): (1) “Have you ever felt you ought to cut down on your drinking?”, (2) “Have people annoyed you by criticising your drinking?”, (3) “Have you ever felt bad or guilty about your drinking?” and (4) “Have you ever had a drink first thing in the morning to steady your nerves or get rid of a hangover?” The CAGE questions were asked with respect to an occurrence ever in life. Summary scores were calculated across responses and two or more positive answers were taken as cut-off point for the definition of alcohol abuse and dependence (Ewing, 1984). Since in the mail survey only past year drinkers had to answer the CAGE questions, the analysis of the CAGE questionnaire was restricted to this subgroup.

Except for the difference in the quantity measure with regard to size of glasses, wording and response categories in both questionnaires were identical. For the purpose of interviewing by telephone, a modified short version of the self-administered questionnaire was reformatted and programmed into a computer-assisted telephone interview (CATI).
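The scoring rules described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' code: the function and constant names are hypothetical, and dividing the 30-day total by 30 to obtain a daily mean is an assumption, since the text states only that quantities were "converted into pure alcohol per day". The ethanol contents, the sex-specific limits and the CAGE cut-off are taken from the text.

```python
# Ethanol content per litre of beverage (g), as given in the Measures section.
ETHANOL_G_PER_LITRE = {"beer": 40.0, "wine": 92.0, "spirits": 320.0}

# Upper limits of non-hazardous consumption (g ethanol per day), by sex.
HAZARD_LIMIT_G = {"female": 20.0, "male": 40.0}


def grams_per_day(days, glasses_per_day, glass_litres, beverage, period_days=30):
    """Mean daily ethanol (g) for one beverage from a quantity-frequency pair.

    Assumption: the 30-day total is averaged over all 30 days.
    """
    litres = days * glasses_per_day * glass_litres
    return litres * ETHANOL_G_PER_LITRE[beverage] / period_days


def drinking_status(total_g_per_day, sex):
    """Map the summed daily ethanol intake to the three drinking status groups."""
    if total_g_per_day == 0:
        return "no consumption"
    if total_g_per_day <= HAZARD_LIMIT_G[sex]:
        return "non-hazardous"
    return "hazardous"


def cage_positive(answers):
    """True if at least two of the four CAGE items were answered 'yes'."""
    return sum(bool(a) for a in answers) >= 2


# Hypothetical respondent: beer on 10 days (two 0.5 l glasses each),
# wine on 4 days (one 0.25 l glass each).
total = grams_per_day(10, 2, 0.5, "beer") + grams_per_day(4, 1, 0.25, "wine")
status = drinking_status(total, "female")
```

For this hypothetical respondent the beer contributes 400 g and the wine 92 g over 30 days, so the daily mean stays below the 20 g limit for females.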

Data analysis

Data from both surveys were weighted to fit the bivariate distribution of age group and gender, and the distribution of community size of each Western German federal state as well as Berlin at 31 December, 1995 (Statistisches Bundesamt, 1997). Weights were calculated with the SPSS 6.1.3 procedure GENLOG using the iterative proportional fitting algorithm (Agresti, 1990). This re-weighting was necessary to exclude effects caused by different weighting algorithms of the field institutes.

According to Groves (1989), there are two approaches to the analysis of mode effects: the first aims at identifying inherent properties of the aural mode which might produce differences in the survey results. The second is guided by the question of whether a telephone survey obtains the same results as a mail survey in spite of the differences in the way the surveys are conducted. We adopted the second approach, which has two implications in the data analysis: first, non-telephone households in the mail sample were included and secondly, the analysis was based on valid cases.

To analyse whether mode of data collection has a substantial influence, separate regression models were run using the dependent variables of consumption of beer, wine and spirits, frequency and quantity per drinking day by beverage type, alcohol consumption in grams ethanol per day and CAGE items. The independent variables included in the models were gender, age group (18–29, 30–39, 40–49, 50–59 years) and mode of administration. Two-factor interactions were also included. The independent variables were dummy-coded with the reference categories female, age group 50–59 years and telephone mode. The overall summary alcohol consumption measure was divided into the categories of no consumption, non-hazardous and hazardous drinking.
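The iterative proportional fitting step used for the re-weighting can be illustrated with a small, self-contained sketch. It is a simplified two-margin version written for illustration only: the study fitted a bivariate age-by-gender distribution plus community size within each federal state using SPSS GENLOG, whereas the function names, the 2 x 2 table and the target margins below are hypothetical. The sketch assumes strictly positive cell counts and target margins with equal grand totals.

```python
def ipf(table, row_targets, col_targets, max_iter=100, tol=1e-9):
    """Rake a 2-D table of positive counts until its margins match the targets."""
    fitted = [[float(v) for v in row] for row in table]
    for _ in range(max_iter):
        # Scale every row so that it sums to its target total.
        for i, target in enumerate(row_targets):
            row_sum = sum(fitted[i])
            fitted[i] = [v * target / row_sum for v in fitted[i]]
        # Scale every column so that it sums to its target total.
        for j, target in enumerate(col_targets):
            col_sum = sum(row[j] for row in fitted)
            for row in fitted:
                row[j] *= target / col_sum
        # Columns are exact after the column step; stop once rows agree too.
        if all(abs(sum(fitted[i]) - t) < tol for i, t in enumerate(row_targets)):
            break
    return fitted


def cell_weights(sample, fitted):
    """Weight per cell: fitted population count divided by sample count."""
    return [[f / s for f, s in zip(f_row, s_row)]
            for f_row, s_row in zip(fitted, sample)]


# Hypothetical sample counts (rows: two age groups, columns: two genders)
# raked to hypothetical population margins.
sample = [[30, 20], [10, 40]]
fitted = ipf(sample, row_targets=[40, 60], col_targets=[50, 50])
weights = cell_weights(sample, fitted)
```

Each respondent would then receive the weight of the cell he or she falls into, so that weighted sample margins reproduce the population margins.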
We refrained from a linear regression model with the uncategorized summary measure as the proportion of missing values would be much higher: a respondent with hazardous quantities of one beverage and missing values on another type can be included in the categorical regression, but must be excluded from the linear regression. On the other hand, females reporting a mean consumption of less than 10 g ethanol per day and males reporting less than 20 g ethanol per day were categorized as non-hazardous drinkers even if their data were incomplete, thus assuming that beverages which are not reported are not consumed, or only consumed in small amounts.

The categories of overall alcohol consumption, frequency, quantity and glasses per drinking day were treated as ordinal and a cumulative logistic regression model was applied. Responses to frequency were combined to create 12 categories for beer and wine and nine frequency categories for spirits; quantity per day was combined into seven categories for beer and six categories for both wine and spirits. Responses to the CAGE questions, and to the consumption of beer, wine and spirits were dichotomous and analysed with logistic regression. The natural log of the mean consumption of beer, wine and spirits per day was analysed with ANOVA. All regressions were calculated with SUDAAN 7.5 (Shah, Barnwell & Bieler, 1997).

The cumulative logistic regression models the probabilities of, e.g. “no consumption” and “no consumption/non-hazardous consumption” simultaneously. Estimates for the other events of “non-hazardous consumption” and “hazardous consumption” can then be derived easily. Apart from providing a parsimonious model, the cumulative model avoids the problem of multiple tests, i.e. actual significance levels which may exceed the nominal significance level. This problem would occur if, for example, the probabilities of “no consumption” and “hazardous consumption” were modelled separately. Furthermore, separate models do not take into account that the dependent variable is ordinally scaled.

As reference category for the dependent variables, the last category, e.g. hazardous drinking, was used in the cumulative logistic model. In the logistic regression the first category, e.g. less than two positive answers in the CAGE questionnaire, was chosen as reference category. This affects the interpretation of the beta coefficients of the two models. In the logistic model, a positive beta coefficient, e.g. for the main effect “mode”, indicates a higher prevalence rate of, for instance, alcohol-related problems in the mail survey, whereas a positive beta coefficient in the cumulative logistic model implies, for instance, a lower level of alcohol consumption in the mail survey.
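The sign convention described above can be made explicit with a worked formulation. The following is a sketch under assumptions, not a statement of the exact SUDAAN parameterization: it assumes the usual proportional-odds form, with the categories ordered 1 = no consumption, 2 = non-hazardous, 3 = hazardous (reference), intercepts written as \alpha_j and the dummy-coded covariate vector as x.

```latex
% Cumulative logit (proportional odds) model, assumed parameterization
\operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{1 - P(Y \le j \mid x)}
  = \alpha_j + x^{\top}\beta , \qquad j = 1, 2 .

% Probabilities of the remaining events follow by subtraction
P(Y = 2 \mid x) = P(Y \le 2 \mid x) - P(Y \le 1 \mid x), \qquad
P(Y = 3 \mid x) = 1 - P(Y \le 2 \mid x).

% Binary logistic model for, e.g., the CAGE cut-off (reference: fewer than two positive answers)
\operatorname{logit} P(Z = 1 \mid x) = \gamma_0 + x^{\top}\gamma .
```

Under this formulation a positive coefficient for mail mode in the cumulative model raises P(Y \le j), i.e. shifts respondents towards the lower consumption categories, while a positive coefficient in the binary model raises the probability of the event itself, which matches the interpretation given in the text.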

Results

Mode differences in alcohol consumption

In the mail survey, item non-response to the questions on frequency and quantity of alcohol consumption was always higher than in the telephone mode. In the summary measure of ethanol intake, responses on frequency and quantity were combined. Non-response resulted from non-response to either one of the variables or to both. Otherwise a missing value resulted from inconsistencies where respondents reported a valid frequency but a quantity of zero or vice versa. Most of the missing values for the summary measure from the mail survey and almost all from the telephone survey were due to inconsistencies. As shown in Table 1, non-response rates to overall alcohol consumption in the mail survey ranged from 4.9% (30–39-year-old males) to 10.1% (50–59-year-old females), while those in the telephone survey were much lower, ranging from 0.5% (18–29-year-old males) to 1.6% (40–49-year-old males).

Table 1. Past month alcohol consumption (%) by administration mode, age and gender

                                 Male                             Female
Age (years)  Consumption         Self-administered  Telephone     Self-administered  Telephone
                                 (n = 2918)         (n = 2795)    (n = 3509)         (n = 3398)
18–29        Missing              6.6                0.5           7.5                0.9
             No consumption      15.5               16.2          26.4               30.4
             Non-hazardous       64.6               77.5          57.8               63.5
             Hazardous           13.3                5.7           8.3                5.2
30–39        Missing              4.9                1.3           6.7                1.3
             No consumption       9.6               14.4          20.4               25.8
             Non-hazardous       68.2               76.5          62.8               66.8
             Hazardous           17.3                7.8          10.1                6.1
40–49        Missing              7.2                1.6           9.2                0.7
             No consumption      11.1               15.5          20.9               23.6
             Non-hazardous       62.4               70.9          55.5               66.3
             Hazardous           19.3               12.0          14.5                9.4
50–59        Missing              7.3                0.9          10.1                0.7
             No consumption      12.8               15.5          23.5               33.8
             Non-hazardous       60.9               71.7          55.5               57.0
             Hazardous           19.0               12.0          11.0                8.4

As expected, the level of overall alcohol consumption was significantly higher among men than among women. Both in the telephone and the mail survey the proportion of hazardous male drinkers exceeded the proportion of hazardous female drinkers in all age groups. On the other hand, in all age groups abstinence rates of males were lower than those of females. More respondents of the mail than of the telephone survey fell in the low alcohol consumption and hazardous alcohol consumption category, leading to a significant main effect “mode” in the cumulative logistic regression model. The interaction effect “mode by gender” is not significant although in each age group the observed mode effect for prevalence rates of hazardous drinking is larger for males than for females (Table 2).

Table 2. Logistic regression for consumption of beer, wine and spirits among past month drinkers, and the CAGE score among 12-month drinkers(a); cumulative logistic regression for past month alcohol consumption(a)

Columns of the table, each reporting b and p-value: alcohol consumption (total sample, n = 12620); beer consumption, wine consumption and spirits consumption (past month drinkers, n = 10079); CAGE (past 12-month drinkers, n = 11263).

                              Wine consumption      CAGE
Variable                      b        p-value      b        p-value
Mode
  Mail                         0.60    0.00          0.96    0.00
Age (years)
  18–29                       −0.07    0.57          0.31    0.16
  30–39                       −0.07    0.51          0.40    0.05
  40–49                        0.07    0.57          0.46    0.04
Sex
  Male                        −1.13    0.00          1.35    0.00
Mode × age
  Mail, 18–29                  0.10    0.46         −0.45    0.02
  Mail, 30–39                  0.03    0.80         −0.09    0.62
  Mail, 40–49                 −0.02    0.90         −0.03    0.88
Mode × sex
  Mail, male                   0.08    0.39         −0.14    0.28
Age × sex
  18–29, male                 −0.05    0.75         −0.13    0.53
  30–39, male                 −0.04    0.74         −0.29    0.13
  40–49, male                 −0.08    0.57         −0.27    0.17

(a) Dummy-coded variables.
[The b and p-values of the alcohol consumption, beer consumption and spirits consumption columns cannot be assigned to rows reliably from the source layout.]

The observed significant differences between the samples lead to the question of whether the lower prevalence of hazardous drinkers in the telephone interview is simply a result of the higher abstinence rate or whether beverage-specific drinking patterns of drinkers differ between the modes. Thus, beverage-specific analyses were based on drinkers only. With regard to the influence of the assessment mode the results shown in Tables 2 and 3 resemble those for the overall summary measure. The consumption of each beverage type was reported significantly more often in the mail than in the telephone survey. Mode differences in the consumption of beer and spirits were significantly larger for males, whereas for wine consumption the interaction effect “mode by gender” was not significant. Clearly, more males than females reported the consumption of beer and spirits in the past month, but neither age effects nor gender by age interactions were found to be significant.

Table 3. Proportion (%) of beer, wine and spirits consumption of past month drinkers by administration mode, age and gender

                                 Male                             Female
Age (years)  Consumption         Self-administered  Telephone     Self-administered  Telephone
                                 (n = 2557)         (n = 2357)    (n = 2716)         (n = 2433)
18–29        Beer                91.8               87.0          54.0               51.5
               Missing            0.9                0.4           2.5                0.5
             Wine                65.4               47.5          83.7               74.0
               Missing            0.9                0.1           2.5                0.1
             Spirits             53.8               33.5          41.3               28.9
               Missing            0.9                0.9           2.5                0.6
30–39        Beer                93.3               88.9          61.5               51.0
               Missing            0.4                0.6           2.4                0.5
             Wine                64.8               46.5          82.0               74.8
               Missing            0.4                0.4           2.4                0.2
             Spirits             53.1               29.4          37.2               26.5
               Missing            0.4                0.8           2.4                0.4
40–49        Beer                92.8               86.2          61.5               52.1
               Missing            1.3                0.5           2.0                0.8
             Wine                64.2               50.6          85.1               75.5
               Missing            1.3                0.6           2.0                0.4
             Spirits             59.9               33.2          44.1               21.5
               Missing            1.3                0.4           2.0                0.4
50–59        Beer                92.4               88.9          59.4               51.6
               Missing            1.2                0.4           4.0                0.4
             Wine                66.3               48.8          80.9               76.5
               Missing            1.2                1.3           4.0                0.2
             Spirits             56.4               31.4          38.1               28.6
               Missing            1.2                1.3           4.0                0.2

However, separate comparisons of frequency and quantity per drinking day by drinkers of each beverage type showed a rather different pattern (Table 4). While gender remained a significant factor with males reporting higher beverage-specific frequencies and quantities compared to females, mode was no longer significant. Compared to the oldest age group the frequencies of each beverage type were significantly lower in the younger age group, whereas quantities were significantly higher. Except for quantity of spirits consumption no significant interaction effects “mode by age” were found. Only for wine consumption was a significant interaction effect “gender by age” observed. Additional comparisons of the natural logarithms of mean overall quantity of beer, wine and spirits consumption per day produced similar results. Males reported significantly larger quantities than females, but no significant “mode” and “mode by age” effects were found (data not shown).

Mode differences in alcohol-related problems

While in the telephone interviews nearly all respondents who reported consumption in the last 12 months answered the CAGE questions appropriately, item non-response in the mail survey ranged from 4% (30–39-year-old males) to 10.8% (50–59-year-old females). In all groups notably more females than males failed to answer these questions. There were, however, only minor differences between the age groups (Table 5).

Surprisingly, despite the fact that the CAGE items reflect life-time prevalence of alcohol-related problems, younger age groups sometimes exhibited higher prevalence rates than older groups. The highest prevalence rates were always found in the age group 40–49 years. Again, significantly more males than females reported alcohol problems and prevalence rates were significantly higher in the mail than in the telephone survey (Table 2). Neither the interaction effect “mode by gender” nor “age by gender” was significant. There was, however, one significant interaction effect between age group and mode. In the youngest age group prevalence rates between self-administered questionnaires and telephone interviews differed less than in other age groups.

Table 4. Cumulative logistic regression for past month frequency of beer, wine and spirits consumption(a), and quantity per drinking day for beer, wine and spirits(a)

Columns of the table, each reporting b and p-value: frequency (past month) for beer drinkers (n = 7324), wine drinkers (n = 6897) and spirits drinkers (n = 3905); quantity (past month) for beer drinkers (n = 7324), wine drinkers (n = 6897) and spirits drinkers (n = 3905).
Rows of the table: Mode (Mail); Age (years) 18–29, 30–39, 40–49; Sex (Male); Mode × age (Mail, 18–29; Mail, 30–39; Mail, 40–49); Mode × sex (Mail, male); Age × sex (18–29, male; 30–39, male; 40–49, male).
(a) Dummy-coded variables.
[The individual b and p-values of this rotated table cannot be assigned to cells reliably from the source layout.]


Table 5. Alcohol-related problems (CAGE) of past 12-month drinkers by administration mode, age and gender (%)

                               Male                               Female
Age (years)   CAGE             Self-administered   Telephone      Self-administered   Telephone
                               (n = 2568)          (n = 2524)     (n = 2849)          (n = 2877)
18–29         Missing           6.1                 0.2            8.5                 0.0
              One or none      76.2                86.6           85.4                95.1
              Two or more      17.8                13.2            6.1                 4.9
30–39         Missing           4.0                 0.3            9.6                 0.1
              One or none      74.5                86.1           79.9                96.0
              Two or more      21.6                13.6           10.5                 3.9
40–49         Missing           5.5                 0.3            9.6                 0.3
              One or none      70.1                85.8           79.4                94.7
              Two or more      24.4                13.9           11.0                 5.0
50–59         Missing           5.6                 0.0           10.8                 0.0
              One or none      72.2                89.0           82.6                96.0
              Two or more      22.2                11.0            6.6                 4.0
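As a rough illustration of the size of the mode difference in Table 5, the percentages of respondents with two or more positive CAGE responses can be converted into odds and an unadjusted odds ratio (self-administered vs. telephone) for each age and gender group. This is only a sketch: it works from the rounded percentages in the table, counts missing responses as not positive, and is not the logistic regression model reported in the paper. The function and variable names are illustrative.

```python
# Percent of past 12-month drinkers with two or more positive CAGE
# responses, copied from Table 5 as (self-administered, telephone).
CAGE_POSITIVE = {
    ("male", "18-29"): (17.8, 13.2),
    ("male", "30-39"): (21.6, 13.6),
    ("male", "40-49"): (24.4, 13.9),
    ("male", "50-59"): (22.2, 11.0),
    ("female", "18-29"): (6.1, 4.9),
    ("female", "30-39"): (10.5, 3.9),
    ("female", "40-49"): (11.0, 5.0),
    ("female", "50-59"): (6.6, 4.0),
}


def odds(percent):
    """Convert a percentage into odds p / (1 - p)."""
    p = percent / 100.0
    return p / (1.0 - p)


def mode_odds_ratio(self_admin_pct, telephone_pct):
    """Unadjusted odds ratio, self-administered relative to telephone."""
    return odds(self_admin_pct) / odds(telephone_pct)


for (sex, age), (mail, phone) in CAGE_POSITIVE.items():
    print(f"{sex:6s} {age}: OR = {mode_odds_ratio(mail, phone):.2f}")
```

Every ratio computed this way exceeds 1, and for both sexes the smallest ratio is found in the 18–29 group, which matches the reported interaction between age group and mode.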

Discussion

In this paper, mode differences in data collection on drinking behaviour and alcohol-related problems were assessed. Differences in responses between self-administered questionnaires and telephone interviews were in the expected direction, but were larger than reported in related survey research (Aquilino, 1994; Gmel, 2000). Mode differences in abstinence rates indicate that fewer respondents admit alcohol consumption when questioned by telephone than when completing a questionnaire. Differences in the proportion of non-hazardous and hazardous drinkers raise the question of whether the lower prevalence of hazardous drinkers in the telephone interview is simply a result of the higher abstinence rate or whether beverage-specific drinking patterns of drinkers differ between the modes. While the prevalence rates of beer, wine and spirits consumption among drinkers were significantly lower in the telephone survey than in the mail survey, the main mode effect was no longer significant when beverage-specific frequencies and quantities per drinking day were compared. Thus, the findings indicate different response patterns, in that drinkers appear to admit the consumption of more than one beverage less frequently in telephone interviews. Only 13.7% of past month drinkers reported the consumption of all three beverages in the present telephone survey compared to 43.2% of those in the mail survey. Since overall quantity of alcohol consumption is an additive function of beverage-specific quantities, under-reporting of each beverage type results in lower quantities of overall drinking and thus in lower prevalence rates of hazardous drinking.

The significant “mode by gender” interaction with regard to all measures on beer and spirits consumption reveals a larger influence of the assessment mode on males compared to females. The larger gender differences for quantities per drinking day in the mail survey compared to the telephone survey, however, may result partly from differences between the survey questions on beer and spirits: in the mail survey 3.6% of male beer drinkers and 2.8% of male spirits drinkers reported the consumption of both small and big glasses compared to 1.9% of female beer drinkers and 2.1% of female spirits drinkers. In the telephone survey respondents had to decide on the size of the glasses.

However, these findings may be biased by differences in non-response between the two survey modes. According to Groves (1989), non-response bias is a function of response rate and the difference between respondents and non-respondents with respect to relevant outcome measures. Follow-up studies of non-respondents do not support the common expectation that non-response will seriously affect survey estimates (Lemmens, Tan & Knibbe, 1988; Caspar, 1992). In most of these studies differences found in self-reported alcohol consumption measures were non-significant. In a recent study on the effect of non-response in Switzerland, Gmel (2000) found slightly but not significantly higher mean alcohol consumption in non-respondents and concluded that non-response rates affect survey estimates less than mode of administration.

Item non-response also appears to be correlated with mode of administration. A high proportion of missing values and inconsistencies was encountered in the self-administered questionnaire, while telephone interviewing produced almost no missing data and only a few inconsistencies. This is not surprising, given that telephone interviews are highly structured and guided by a professional interviewer. However, missing values cannot account for the lower estimates in the telephone survey, since its missing value rate is lower than in the self-administered version (see Tables 1 and 5). Investigation into patterns of missing values in the self-administered questionnaire revealed that among current drinkers (past 12 months) a substantial proportion of missing data found in the CAGE items was produced by those respondents who reported past-month abstinence (between 17% and 40%, depending on age and gender). Only a small proportion of missing values was related to heavy drinking. This finding indicates that item non-response and inconsistencies in self-administered questionnaires are caused by carelessness or overlooking of certain items rather than by deliberate refusal.

Since there is a growing tendency toward use of mixed-mode surveys, the question of the influence of non-telephone households emerges. Although over 95% of Western German households had telephones, the proportion of households without a telephone varied with household size. In the mail survey 6% of households reported having no telephone and among drinkers this proportion was 5%. Comparisons of heavy drinkers and CAGE scores (two or more positive) showed higher rates among respondents without telephones both for males and females.
Given the low proportion of households without telephones, however, the impact on the full sample is negligible.

The present results agree strongly with recent findings showing differences between self-administered and telephone modes for various alcohol measures (Gmel, 2000). They are also in line with research reporting differences on alcohol- and drug-related behaviour in favour of self-completed questionnaires within face-to-face interviews, indicating that admission of alcohol-related behaviour is facilitated by self-administration rather than by the interviewer–respondent interaction (Aquilino & LoSciuto, 1990; Aquilino, 1992; 1994; Gfroerer & Hughes, 1992). Rehm & Arminger (1996) found significantly higher responses to drinking in personal interviews compared to self-administered questionnaires and offered the hypothesis that since drinking is a social event, the “social event” of an interview should also produce more reporting of alcohol-related behaviour. Bongers & van Oers (1998), however, could not find support for this alternative hypothesis in their recent study, where results pointed towards higher rates of alcohol-related behaviour in the self-administration mode.

In most of the literature on mode differences, researchers emphasize the social desirability hypothesis with the notion that absence of interviewers facilitates response willingness. However, as pointed out by Dillman and colleagues (1996), other mechanisms (e.g. time pressure, memory limitations) are expected to have an effect on answers in the telephone format that are not expected to occur in the self-administration format. The research literature is also consistent about the inter-relationship between socio-demographic variables such as sex, age and income status and survey response (Goyder, 1987; Groves, 1989). Unravelling these mechanisms, their interactions and the impact of socio-demographic traits requires further research, especially since the question of interest has shifted from that of comparability to that of whether and how different modes of administration can be combined (e.g. Tarnai & Dillman, 1992; Rehm & Arminger, 1996).

It has to be kept in mind that since survey estimates underestimate sales data by between 40 and 60%, the mode yielding higher estimates is generally considered the more valid one. Following the common “the more, the better” argument, evidence has been collected in support of the self-administration mode in survey research.
It appears that beverage-specific drinking and alcohol-related problems are more likely to be reported in situations where the respondent is not interacting with an interviewer, where he or she has control over the context as well as over the pace of responding, and where information is taken in visually without time pressure. Although personal and telephone interview modes may have the advantage of higher overall response rates and less item non-response, self-administered questionnaires seem to be less affected by social desirability and interviewer effects.
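The additive argument made in the Discussion (overall ethanol intake is the sum of beverage-specific quantities, so leaving out a beverage lowers the total and can change the consumption category) can be sketched as follows. The category limits are those given in the Abstract (hazardous: more than 20 g per day for females, more than 40 g for males). The respondent's beverage-specific gram values are invented purely for illustration and are not survey data, and the function names are hypothetical.

```python
# Daily ethanol limits (grams) separating non-hazardous from hazardous
# consumption, as stated in the paper's Abstract.
HAZARDOUS_LIMIT_G = {"female": 20.0, "male": 40.0}


def total_ethanol(grams_by_beverage):
    """Overall intake is the sum of the beverage-specific daily quantities."""
    return sum(grams_by_beverage.values())


def consumption_category(grams_per_day, sex):
    """Assign the summary category used for the consumption measure."""
    if grams_per_day == 0:
        return "no alcohol consumption"
    if grams_per_day <= HAZARDOUS_LIMIT_G[sex]:
        return "non-hazardous"
    return "hazardous"


# Hypothetical male respondent; gram values are illustrative assumptions.
full_report = {"beer": 25.0, "wine": 10.0, "spirits": 8.0}
# The same respondent admitting only one beverage.
partial_report = {"beer": 25.0}

for label, report in [("all beverages", full_report), ("beer only", partial_report)]:
    grams = total_ethanol(report)
    print(f"{label}: {grams:.0f} g/day -> {consumption_category(grams, 'male')}")
```

In this invented case the frequency and quantity reported for beer are identical in both reports, yet dropping wine and spirits moves the respondent from the hazardous to the non-hazardous category. This is the pattern the Discussion proposes for the telephone mode.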

Acknowledgements

This research was supported by the German Federal Ministry of Health, which also funded all waves of the German National Survey on Psychoactive Substances (NSPS), a repeated cross-sectional survey used in the present analysis.

References

AGRESTI, A. (1990) Categorical Data Analysis (New York, John Wiley and Sons, Inc.).
AQUILINO, W. S. & LOSCIUTO, L. A. (1990) Effects of interview mode on self-reported drug use, Public Opinion Quarterly, 54, 362–395.
AQUILINO, W. S. (1994) Interview mode effects in surveys of drug and alcohol use: a field experiment, Public Opinion Quarterly, 58, 210–240.
AQUILINO, W. S. (1992) Telephone versus face-to-face interviewing for household drug use surveys, International Journal of the Addictions, 27, 71–91.
BONGERS, I. M. B. & VAN OERS, J. A. M. (1998) Mode effects on self-reported alcohol use and problem drinking: mail questionnaires and personal interviewing compared, Journal of Studies on Alcohol, 59, 280–285.
CASPAR, R. (1992) Follow-up of nonrespondents in 1990, in: TURNER, C. F., LESSLER, J. T. & GFROERER, J. C. (Eds) Survey Measurement of Drug Use: methodological studies, DHHS Pub. No. (ADM) 92–1929, pp. 155–173 (Washington DC, US Government Printing Office).
DE LEEUW, E. D. & COLLINS, M. (1997) Data collection methods and survey quality: an overview, in: LYBERG, L., BIEMER, P., COLLINS, M., DE LEEUW, E., DIPPO, C., SCHWARZ, N. & TREWIN, D. (Eds) Survey Measurement and Process Quality, pp. 199–220 (New York, John Wiley and Sons, Inc.).
DE LEEUW, E. D. & VAN DER ZOUWEN, J. (1988) Data quality in telephone and face-to-face surveys: a comparative analysis, in: GROVES, R. M., BIEMER, P. P., LYBERG, L. E., MASSEY, J. T., NICHOLLS II, W. L. & WAKSBERG, J. (Eds) Telephone Survey Methodology, pp. 283–299 (New York, John Wiley and Sons, Inc.).
DE LEEUW, E. D. (1992) Data Quality in Mail, Telephone, and Face to Face Surveys (Amsterdam, TT-Publikaties).
DILLMAN, D. A. & TARNAI, J. (1991) Mode effects of cognitively designed recall questions: a comparison of answers to telephone and mail surveys, in: BIEMER, P. P., GROVES, R. M., LYBERG, L. E., MATHIOWETZ, N. H. & SUDMAN, S. (Eds) Measurement Errors in Surveys, pp. 73–93 (New York, John Wiley and Sons, Inc.).
DILLMAN, D. A. (1978) Mail and Telephone Surveys (New York, John Wiley and Sons, Inc.).

DILLMAN, D. A., SANGSTER, R. L., TARNAI, J. & ROCKWOOD, T. H. (1996) Understanding differences in people’s answers to telephone and mail surveys, in: BRAVERMANN, M. & SLATER, J. (Eds) Advances in Survey Research, pp. 45–61 (San Francisco, Jossey-Bass Publishers).
EDWARDS, G., ANDERSON, P., BABOR, T. F., CASSWELL, S., FERRENCE, R., GIESBRECHT, N., GODFREY, C., HOLDER, H. D., LEMMENS, P., MÄKELÄ, K., MIDANIK, L. T., NORSTRÖM, T., ÖSTERBERG, E., ROMELSJÖ, A., ROOM, R., SIMPURA, J. & SKOG, O. J. (1994) Alcohol Policy and the Public Good (Oxford, Oxford University Press).
EWING, J. (1984) Detecting alcoholism: the CAGE questionnaire, Journal of the American Medical Association, 252, 1905–1907.
FREY, J. H. (1989) Survey Research by Telephone (London, Sage Publications).
GFROERER, J. C. & HUGHES, A. L. (1992) Collecting data on illicit drug use by phone, in: TURNER, C. F., LESSLER, J. T. & GFROERER, J. C. (Eds) Survey Measurement of Drug Use: methodological studies, DHHS Pub. No. (ADM) 92–1929, pp. 277–295 (Washington DC, US Government Printing Office).
GMEL, G. (2000) The effect of mode of data collection and of non-response on reported alcohol consumption: a split-sample study in Switzerland, Addiction, 95, 123–134.
GOYDER, J. (1987) The Silent Minority (Cambridge, Blackwell).
GROVES, R. M. & KAHN, R. L. (1979) Surveys by Telephone: a national comparison with personal interviews (New York, Academic Press).
GROVES, R. M. (1989) Survey Errors and Survey Costs (New York, John Wiley and Sons, Inc.).
GROVES, R. M. (1990) Theories and methods of telephone surveys, Annual Review of Sociology, 16, 221–240.
HOCHSTIM, J. R. (1967) A critical comparison of three strategies of collecting data from households, Journal of the American Statistical Association, 62, 976–989.
HOX, J. J. & DE LEEUW, E. D. (1994) A comparison of non-response in mail, telephone, and face-to-face surveys, Quality and Quantity, 28, 329–344.
JOHNSON, P. T., HOUGLAND, J. G. & CLAYTON, R. R. (1989) Obtaining reports of sensitive behavior: a comparison of substance use reports from telephone and face-to-face interviews, Social Science Quarterly, 70, 174–183.
KRAUS, L. & BAUERNFEIND, R. (1998) Repräsentativerhebung zum Konsum psychotroper Substanzen bei Erwachsenen in Deutschland 1997 [Population survey on the consumption of psychoactive substances in the German adult population], Sucht, 44 (Sonderheft 1), S3–S82.
KRAUS, L., BAUERNFEIND, R. & BÜHRINGER, G. (1998) Epidemiologie des Drogenkonsums in Deutschland. Ergebnisse aus Bevölkerungssurveys 1990 bis 1996 [Epidemiology of Drug Consumption in Germany: results from population surveys 1990–1996] (Baden-Baden, Nomos).
LEMMENS, P. H. H. M., TAN, E. S. & KNIBBE, R. H. (1988) Bias due to non-response in a Dutch survey on alcohol consumption, British Journal of Addiction, 83, 1069–1077.
MANGIONE, T. W., HINGSON, R. & BARRETT, J. (1982) Collecting sensitive data: a comparison of three survey strategies, Sociological Methods and Research, 10, 337–346.
MARHENKE, W. (1997) Telefonanschlussdaten als Auswahlgrundlage [Sampling based on telephone directories], in: GABLER, S. & HOFFMEYER-ZLOTNIK, J. H. P. (Eds) Stichproben in der Umfragepraxis [The Practice of Survey Sampling], pp. 207–220 (Opladen, Westdeutscher Verlag).
MAYFIELD, D., MCLEOD, G. & HALL, P. (1974) The CAGE questionnaire: validation of a new alcoholism screening instrument, American Journal of Psychiatry, 131, 1121–1123.
REHM, J. & ARMINGER, G. (1996) Alcohol consumption in Switzerland 1987–93: adjusting for differential effects of assessment techniques in the analysis of trends, Addiction, 91, 1335–1344.
REHM, J. & SPUHLER, T. (1993) Measurement error in alcohol consumption: the Swiss Health Survey, European Journal of Clinical Nutrition, 47 (suppl. 2), 25–30.
REHM, J. (1994) Reliabilität und Stabilität des Indikators für Alkoholkonsum in der Schweizerischen Gesundheitsbefragung [Reliability and stability of the indicator for alcohol consumption in the Swiss Health Survey], Drogalkohol, 18, 3–8.


SAUNDERS, J. B., AASLAND, O. G., BABOR, T. F., DE LA FUENTE, J. R. & GRANT, M. (1993) Development of the Alcohol Use Disorders Identification Test (AUDIT): WHO collaborative project on early detection of persons with harmful alcohol consumption, Addiction, 88, 791–804.
SCHWARZ, N., STRACK, F., HIPPLER, H.-J. & BISHOP, G. (1991) The impact of administration mode on response effects in survey measurement, Applied Cognitive Psychology, 5, 193–212.
SHAH, B., BARNWELL, B. & BIELER, G. (1997) SUDAAN User’s Manual, release 7.5 (Research Triangle Park NC, Research Triangle Institute).
STATISTISCHES BUNDESAMT (Ed.) (1997) Statistisches Jahrbuch 1997 [Annual Statistics 1997] (Stuttgart, Metzler-Poeschel).
SUDMAN, S. & BRADBURN, N. (1974) Response Effects in Surveys: a review and synthesis (Chicago, Aldine).
SYKES, W. & COLLINS, M. (1988) Effects of mode of interview: experiments in the UK, in: GROVES, R. M., BIEMER, P. P., LYBERG, L. E., MASSEY, J. T., NICHOLLS II, W. L. & WAKSBERG, J. (Eds) Telephone Survey Methodology, pp. 301–320 (New York, John Wiley and Sons, Inc.).
TARNAI, J. & DILLMAN, D. A. (1992) Questionnaire context as a source of response differences in mail and telephone surveys, in: SCHWARZ, N. & SUDMAN, S. (Eds) Context Effects in Social and Psychological Research, pp. 115–129 (New York, Springer).