Annu. Rev. Public Health 1993. 14:69-93
Copyright © 1993 by Annual Reviews Inc. All rights reserved

Annu. Rev. Public. Health. 1993.14:69-93. Downloaded from www.annualreviews.org by NORTH CAROLINA STATE UNIVERSITY on 09/07/11. For personal use only.

EXPOSURE MEASUREMENT ERROR: Influence on Exposure-Disease Relationships and Methods of Correction

Duncan Thomas, Daniel Stram, and James Dwyer
Department of Preventive Medicine, University of Southern California, Los Angeles, California 90033-9987

KEY WORDS: exposure measurement, dose-response, biostatistics, diet, radiation

INTRODUCTION

Epidemiology, being primarily an observational rather than experimental science, has often encountered problems with measurement of the study variables, which are not under the control of the investigator. Difficulties arise in part because the variables under study are often subjective and must often be ascertained from subjects' self-reports or from records of dubious quality, because biological variability and laboratory error can occur, and because studies are often conducted retrospectively, thus requiring information on events that occurred long ago. Epidemiologists have long recognized that measurement errors have been among the major weaknesses of their studies and have gone to great lengths to try to assess the magnitude of such errors and their likely impact on the conclusions. This problem has spurred much methodologic research, initially on understanding the effects of measurement errors on exposure-response relationships and more recently on developing methods to correct for such errors. The results of the former line of research have often been used qualitatively in the interpretation of epidemiologic findings, but the quantitative predictions have seldom been applied. Thus, the latter line of research appears to be particularly promising in terms of practical applications. Because most of this work is relatively recent, there have been few successful applications to date, but we hope that this review stimulates an interest in the wider application of these methods.

Terminology and Notation

The terms "measurement error," "misclassification," and "errors in variables" refer to any discrepancy between the true value of a variable x and its measured value z, although misclassification is more often used with categorical variables, and measurement error with continuous variables. Most of the epidemiologic literature on this topic has been concerned with a binary disease outcome d or censored survival times (t, d), although many of the same principles apply to continuous outcomes. Except where it is necessary to distinguish a particular type of outcome, we denote the dependent variable as y.

Errors can be "systematic" or "random." Systematic errors are those for which the measured values are not distributed randomly around the true value; for example, coding a variable so that all exposed persons were classified as unexposed and vice versa, or underestimating everyone's exposure level by a factor of two. The effects of such errors on exposure-response relationships are usually easy to predict. In the former case, the estimated odds ratio (OR) would be the inverse of the true OR; in the latter case, the slope of the exposure-response relationship would be double the true slope. Random errors pose a more difficult problem.

Systematic and random errors can be either "differential" or "nondifferential," depending upon whether the errors in that variable are related to y. If the misclassification of the outcome depends upon exposure status (e.g. because the pathologist confirming the diagnosis of mesothelioma was aware of the subjects' asbestos exposure histories) or the misclassification of the exposure depends upon the disease status (e.g. because of recall bias in a case-control study), then the errors are differential. Such errors can introduce serious and sometimes unpredictable bias into a study, but they can often be avoided by good study design, e.g. by blinded assessment of the study variables.

However, even nondifferential random errors can bias associations, and this is the primary focus of this paper. Loosely speaking, exposure errors are nondifferential if their distribution is independent of the outcome, i.e. Pr(z − x | y = 1) = Pr(z − x | y = 0), but this is not a very rigorous definition. If a case and a noncase have the same value of z, the case will generally tend to have had a larger value of x because x, not z, is the true risk factor; thus, Pr(x | z, y = 1) ≠ Pr(x | z, y = 0). A more precise definition of nondifferential error is that Pr(z | x, y = 1) = Pr(z | x, y = 0), e.g. no recall bias. This is formally equivalent to requiring that Pr(y | x, z) = Pr(y | x), i.e. that the risk of disease depends only on the true exposure x and, given x, the measured exposure does not add any information.

In most studies, it is helpful to regard the measurement error problem as consisting of three conceptually distinct modeling tasks (6): the specification of the disease model, Pr(y | x); the measurement model, Pr(z | x); and the distribution of true exposure, Pr(x). For such probabilities as Pr(x) to have meaning, we have taken the "structural" view of the measurement error problem, in which the unknown x for each individual is regarded as a random variable having a distribution Pr(x) among the population of interest. Thus, the characterization of Pr(x), rather than the direct estimation of each x, is dealt with in the course of analysis. This contrasts with the "functional" approach to measurement errors, in which each subject's value of x is treated as a fixed but unknown parameter to be estimated jointly with the parameters in the disease model. The structural approach is generally more amenable to analysis than the functional, because it yields a large reduction in the number of nuisance parameters to be estimated: Instead of estimating each individual's x, only a few parameters in a prespecified form of Pr(x) are needed. Of course, uncertainties about the proper form of Pr(x) are rarely, if ever, fully resolvable.

Overview of the Effects of Measurement Error

In the next sections, we discuss the following effects of measurement error in a variety of settings and describe some currently available methods for correcting for these effects in epidemiological studies:

1. Changes in the observed mean structure, E(y | z) compared with E(y | x). This generally means that there is an attenuation of the observed compared with the true exposure-response relationship.
2. Changes in the observed variance structure, V(y | z) from that of V(y | x). In general, y is more variable ("overdispersed") given z than given x.
3. Distortion of associations and interactions between covariates and outcomes. For example, two outcomes that are independent given x may appear to be associated given z, and the main effect or interaction of a covariate w can be distorted by measurement error in x.

Classical and Berkson Error Models

When E(y | x) is linear in x, a generally applicable approach to providing an unbiased assessment of the regression parameters is to replace the unknown xs in the disease model with their expectations E(x | z). Thus, in many problems, the primary goal of the statistician is to learn enough about the measurement model Pr(z | x) and the true exposure distribution Pr(x) that this expectation can be calculated from

$$E(x \mid z) = \frac{\int x \,\Pr(z \mid x)\,\Pr(x)\,dx}{\int \Pr(z \mid x)\,\Pr(x)\,dx}.$$

72

THOMAS, STRAM & DWYER

In fact, the calculation of such expected values of "true given observed" can be said to form the basis of modern approaches to the measurement error problem, beginning with a classic paper by Cochran (7).
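As a rough illustration of how the expectation of "true given observed" might be evaluated in practice, the sketch below approximates the two integrals in the formula for E(x | z) with a midpoint rule. The normal measurement model, the normal exposure distribution, the numeric values, and the function names `normal_pdf` and `expected_true_exposure` are all assumptions made for this example, not part of the review; the normal-normal case is used only because its weighted-average closed form is well known and gives something to check against.

```python
import math


def normal_pdf(v, mean, sd):
    """Density of N(mean, sd^2) evaluated at v."""
    return math.exp(-0.5 * ((v - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def expected_true_exposure(z, prior_pdf, error_sd, lo, hi, steps=4000):
    """Approximate E(x | z) with the midpoint rule on [lo, hi].

    Numerator:   integral of x * Pr(z | x) * Pr(x) dx
    Denominator: integral of     Pr(z | x) * Pr(x) dx
    Pr(z | x) is assumed (for illustration only) to be N(x, error_sd^2).
    """
    h = (hi - lo) / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        x = lo + (k + 0.5) * h
        weight = normal_pdf(z, x, error_sd) * prior_pdf(x)
        num += x * weight
        den += weight
    return num / den  # the step width h cancels between numerator and denominator


# Illustrative values only: x ~ N(10, 2^2), z | x ~ N(x, 1^2), observed z = 14.
MU, SIGMA, OMEGA = 10.0, 2.0, 1.0


def prior(x):
    """Assumed population distribution of true exposure, Pr(x)."""
    return normal_pdf(x, MU, SIGMA)


numeric = expected_true_exposure(14.0, prior, OMEGA, MU - 10 * SIGMA, MU + 10 * SIGMA)

# Closed form for the normal-normal case: a weighted average of z and the mean.
shrink = SIGMA ** 2 / (SIGMA ** 2 + OMEGA ** 2)
closed_form = shrink * 14.0 + (1.0 - shrink) * MU

if __name__ == "__main__":
    print(f"numeric E(x|z) = {numeric:.4f}, closed form = {closed_form:.4f}")
```

Passing a different `prior_pdf` (for example a skewed density) is the reason for writing the integral numerically: the same routine then applies where no closed form is available.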

Some of the statistical literature distinguishes between measurement models in which Pr(z | x) can be specified and situations in which it may be more appropriate to specify Pr(x | z) (1). The classic idealization of the latter situation is an experiment that uses a machine for delivering doses x, which are randomly distributed around the "dial-setting" z on the machine, the experimenter being free to vary the dial. Such an idealized experiment is often called a "Berkson error" model (3) if the machine has the property that E(x | z) = z. Then, if the disease model is also linear in x, no correction for errors in the measured variable is required. In observational epidemiology, exposure measurements sometimes also have this Berkson property, particularly when individuals are classified into groups and all members of group w are assigned a value z that is the mean of x within the group. An example might be an occupational study in which workers with the same job title w are assigned an exposure z based on the mean of area measurements; the true exposures x for individuals with the same job title would differ, but on average would tend to equal the area mean. In this type of study, linear regression of y on z would again lead to an unbiased estimate of the slope parameter if the true dose-response was linear in x.

On the other hand, in the "classical error" model, the observed value z is assumed to be distributed around the true value x with E(z | x) = x. As shown below, this model leads to the familiar bias toward the null. In general, the two relations E(x | z) = z and E(z | x) = x cannot both be satisfied except in the trivial situation when there is no measurement error.

Although the unbiasedness of the Berkson model is appealing, the model has limited use for several reasons. First, this property applies only in linear disease models. Second, the other features of the measurement error problem (changes in V(y | z), artifactual relationships between multiple outcomes given z, and the residual confounding of covariates given z) all occur for Berkson, as well as classical, measurement error. Finally, in any observational setting, E(x | z) always depends upon the underlying exposure distribution Pr(x); for example, researchers studying different occupational populations with different distributions of true exposure, but using the same system of job categories, will make different assignments of z to the same job categories, because each will have to average over different underlying distributions of x.
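The contrast between the two error models can be checked with a small simulation. The sketch below is only an illustration under assumed values (normal exposures and errors, a true slope of 2, unit variances); the function names `ols_slope` and `simulate_slopes` are hypothetical. Under classical error the slope of y on z should shrink toward the null, whereas under Berkson error it should stay close to the true slope.

```python
import random


def ols_slope(u, v):
    """Least-squares slope from regressing v on u."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    sxx = sum((a - mean_u) ** 2 for a in u)
    return sxy / sxx


def simulate_slopes(n=50000, beta=2.0, sigma=1.0, omega=1.0, seed=1):
    """Return (slope under classical error, slope under Berkson error).

    All distributions and parameter values are assumptions for illustration.
    """
    rng = random.Random(seed)

    # Classical error: the measurement scatters around the truth, z = x + e.
    x_c = [rng.gauss(0.0, sigma) for _ in range(n)]
    z_c = [x + rng.gauss(0.0, omega) for x in x_c]
    y_c = [beta * x + rng.gauss(0.0, 1.0) for x in x_c]

    # Berkson error: the truth scatters around the assigned value, x = z + e.
    z_b = [rng.gauss(0.0, sigma) for _ in range(n)]
    x_b = [z + rng.gauss(0.0, omega) for z in z_b]
    y_b = [beta * x + rng.gauss(0.0, 1.0) for x in x_b]

    return ols_slope(z_c, y_c), ols_slope(z_b, y_b)


if __name__ == "__main__":
    classical, berkson = simulate_slopes()
    print(f"classical slope = {classical:.3f}, Berkson slope = {berkson:.3f}")
```

With equal exposure and error variances, the classical slope would be expected to fall near half the true value, while the Berkson slope remains near 2; only the residual scatter of y around the fitted line grows in the Berkson case.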

Assessment of Measurement Error Distributions

To put any of these predictions to practical use, an investigator must have an estimate of the distribution of measurement errors. To the extent that subgroups differ in their error distributions, being able to quantify these differences is also helpful. Estimates of error distributions can come from several types of studies, including "validation" and "reproducibility" studies and "pathway uncertainty analysis." Hatch & Thomas (27) provide an extensive discussion of measurement error assessment methods, which we briefly summarize here.

In validation studies, one compares a "gold standard" measurement of x against the flawed measurement z to be used in the main study. This provides a direct estimate of the error distribution, but because the gold standard method is generally prohibitively expensive, it can only be done on a subsample. For example, in an occupational study, personal monitors might serve as the gold standard, whereas a room air sampler is the exposure method used in the main study. The results of validation studies of categorical exposure variables are easily summarized in terms of the misclassification probabilities Pr(z | x), which can be used directly in the methods of adjustment described below. For continuous variables, results can be summarized in terms of the variance of the deviations z − x or z/x, depending on the form of the distribution, possibly broken down by important study variables.

If a gold standard is unavailable or infeasible, one can still often obtain indirect information on error distributions by means of a reproducibility study, in which two or more separate assessments of z are performed on the same individuals, e.g. two or more diet assessments at different times. For example, if z₁ and z₂ are independent given x, zₙ ~ N(x, ω²) and x ~ N(μ, σ²), then σ² can be estimated by Cov(z₁, z₂) and ω² by V(z) − σ̂². For categorical variables, let m be the matrix of misclassification probabilities, m_ij = Pr(z = j | x = i), P be a diagonal matrix of population exposure probabilities [P_ii = Pr(x = i), 0 elsewhere], and Z be the observed matrix of reproducibility data, Z_jk = Pr(z₁ = j, z₂ = k).
Again, if z₁ and z₂ are independent given x, then Z = mᵀPm, and m and P can be estimated as the solution to this matrix equation. Key to both results is the conditional independence assumption. If z₁ and z₂ represent responses to the same question on surveys conducted on consecutive days, then a subject's response on the second day is likely to be influenced by his recollection of the previous day's response, in addition to his true value of x. Because the observed correlation between z₁ and z₂ is thus a combination of the correlation induced by the common Pr(z | x) and the partial correlation in Pr(z₁, z₂ | x), at most one can say that this provides a means to estimate a lower bound on the misclassification probabilities.

In some studies, exposure estimates are derived by using pathway analysis, in which the doses are computed over multiple pathways, each of which may consist of several steps, and each step is subject to various uncertainties. The final dose assignment is then the expectation of this sum over the joint distribution of all the component uncertainties, and a variance of the dose assignment can be similarly computed as the variance over all the component uncertainties. For example, in the studies of cancer in residents downwind of the Nevada Test Site (53), each subject's dose from various radionuclides was computed by summing contributions through external, inhalation, milk, and vegetable pathways from more than 100 detonations. The milk pathway, for example, involved consideration of the yield of each radionuclide from each test, radioactive decay rates, environmental transport and deposition, uptake by vegetation, farming practices, intake by cows and goats, transfer to milk, milk distribution patterns, the family's source of milk (dairy or backyard cow), the subject's (or mother's) consumption, breastfeeding rates and transfer through the mother's breast milk, and finally metabolic processes to the thyroid gland. Each step of this process was characterized by probability distributions, based on available data, expert judgment, the subject's own report, and reliability surveys. Because of the complexity of the dose assignment algorithm, analytic calculation of the expectation and variance of the dose was not feasible, so a Monte Carlo simulation was conducted, which generated 100 dose estimates for each subject by using randomly chosen values of the various parameters and then summarized the distributions in terms of their means and variances. The unique feature of this approach is that it provides not only a dose estimate for each subject, but also a separate uncertainty estimate for each individual's dose. These individualized uncertainties could then be used to correct dose-response relationships for measurement errors in a way that essentially gives greater weight to the subjects with more precisely estimated doses, as described below.
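The general shape of such a Monte Carlo dose calculation can be sketched as follows. This is a toy, single-pathway illustration and not the dosimetry system used in the Nevada Test Site studies: the three-factor multiplicative model, the lognormal uncertainty distributions, every numeric value, and the function name `simulate_subject_dose` are assumptions invented for the example. It shows only how repeated random draws yield both a dose estimate and a subject-specific uncertainty.

```python
import math
import random
import statistics


def simulate_subject_dose(reported_intake, intake_log_sd, rng, n_draws=100):
    """Return (mean, variance) of n_draws simulated doses for one subject.

    Hypothetical pathway: dose = deposition * transfer-to-milk * milk intake.
    intake_log_sd expresses how uncertain this subject's reported intake is.
    """
    draws = []
    for _ in range(n_draws):
        deposition = rng.lognormvariate(math.log(5.0), 0.2)   # shared, uncertain
        transfer = rng.lognormvariate(math.log(0.01), 0.2)    # shared, uncertain
        intake = reported_intake * rng.lognormvariate(0.0, intake_log_sd)
        draws.append(deposition * transfer * intake)
    return statistics.mean(draws), statistics.variance(draws)


def relative_variance(mean, variance):
    """Squared coefficient of variation, a scale-free uncertainty summary."""
    return variance / mean ** 2


if __name__ == "__main__":
    rng = random.Random(42)
    # Two hypothetical subjects: one with a well-documented intake, one without.
    precise = simulate_subject_dose(1.0, 0.1, rng)
    vague = simulate_subject_dose(1.0, 1.0, rng)
    print("precise subject:", precise, relative_variance(*precise))
    print("vague subject:  ", vague, relative_variance(*vague))
```

The subject whose intake is poorly known ends up with a much larger relative variance, which is the kind of individualized uncertainty that could later be used to down-weight that subject in a dose-response analysis.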

EFFECTS OF EXPOSURE MEASUREMENT ERRORS

Effects on Mean Structures

CATEGORICAL EXPOSURE VARIABLES The effects of exposure errors are most easily seen for categorical exposure variables, and the early epidemiologic literature was mainly concerned with this situation. The essential points can be made in the case of binary exposure variables x, z = 0, 1 with nondifferential misclassification. Hence, following the model specification format outlined above, let

p: p_i = Pr(x = i), true prevalence of exposure
r: r_i = Pr(y = 1 | x = i), risk of disease in true exposure group i
m: m_ij = Pr(z = j | x = i), misclassification probabilities

where Σ_i p_i = 1 and Σ_j m_ij = 1. Then the observed disease risks classified by z are

$$R_j = \Pr(y = 1 \mid z = j) = \sum_i \Pr(y = 1 \mid x = i)\,\Pr(x = i \mid z = j) = \sum_i r_i M_{ji}$$

where M_ji = m_ij p_i / P_j and P_j = Pr(z = j) = Σ_i m_ij p_i. Simple algebra shows that any measure of epidemiologic effect (e.g. RR, RD, OR) expressed in terms of the R_j will be biased toward the null compared with the corresponding measure expressed in terms of the r_i (8, 20). Intuitively, this arises because each of the measured exposure groups is contaminated by individuals from the other group. For analogous results in matched studies, see Greenland (22).

However, this argument applies only when individuals are classified in terms of their own exposures, z. In ecologic studies, groups are classified in terms of some other characteristic w, such as place of residence, and the observed disease rates E(y | w) are regressed on the group mean exposures E(x | w). It is easy to see that this regression yields unbiased estimates for linear disease models when exposure is not misclassified. It is also true, but less obvious, that regression on measured group means E(z | w) will yield unbiased estimates under a classical error model. If E(z | x) = x and the measurement error distribution is unaffected by w [i.e. Pr(z | x, w) = Pr(z | x)], then E(z | x, w) = x and hence E(z | w) = E(x | w). Consequently, if E(y | x) = α₀ + α₁x, then E(y | w) = α₀ + α₁E(x | w) = α₀ + α₁E(z | w). Thus, even though classical error produces a bias toward the null in individual studies, this does not necessarily happen in ecologic studies. In the Berkson error case, a similar argument shows that regression of E(y | w) on E(z | w) is also unbiased, provided one can assume that E(x | z, w) = z. Brenner et al (4) pointed out that if the exposure variable is binary, then neither a classical nor a Berkson error structure can apply, because neither E(z | x) nor E(x | z) can equal 0 or 1, the possible values for individuals. Furthermore, it can be shown that the group means E(z | w) will show less variability than E(x | w). Because the group disease rates are not affected by how the group exposure means are characterized, the resulting slope estimate will be inflated if E(z | w) is used on the x-axis instead of E(x | w).

Returning to analytic studies, the situation is more complex with multilevel categorical exposure variables. Dosemeci et al (11) showed that if three or more groups follow a gradient in disease risk by true exposure, then after nondifferential misclassification, the ORs by measured exposure will not necessarily show a monotonic relationship. This happens because the OR comparing exposure groups 1 and 2, for example, may be corrupted by subjects from group 3; if group 1 is small compared with group 2, the influx of high-risk group 3 subjects will have a proportionally bigger effect on group 1 than on group 2, thus making it appear at higher risk than group 2. The authors further showed that tests of trend can be reversed. In a subsequent paper (55), they showed as a corollary that by collapsing a nondifferentially misclassified 2 × K table, the resulting 2 × 2 table can show differential misclassification, with ORs biased away from or beyond the null.

CONTINUOUS EXPOSURE VARIABLES If E(y | x) is linear in x, E(y | x) = α + βx, then the observable exposure-response relationship E(y | z) can be derived by taking expectations over x, so that E(y | z) = α + βE(x | z). Consider the special case in which both the population exposure and measurement error models are normal, x ~ N(μ, σ²) and z | x ~ N(x, ω²). Then, the expectation becomes

$$E(x \mid z) = \frac{z/\omega^2 + \mu/\sigma^2}{1/\omega^2 + 1/\sigma^2} = cz + (1 - c)\mu, \qquad (1)$$

$$c = \frac{\sigma^2}{\sigma^2 + \omega^2}, \qquad (1a)$$

so that E(x | z) is a weighted average of the measured exposure and the overall mean. Because this is a linear function of z, the observed exposure-response relationship will still be linear, but its slope will be biased toward the null:

$$E(y \mid z) = \alpha + \beta(1 - c)\mu + \beta c z = \alpha' + \beta' z, \qquad (2)$$

where α′ = α + β(1 − c)μ and β′ = βc. The intuitive explanation of this effect is that measurement error produces overdispersion of the exposure axis because V(z) = V(x) + V(z | x). By stretching the exposure axis while leaving the disease axis unchanged, the slope of the relationship is reduced. However, this is true only in the special case of an additive model with normal errors, normal x, and a linear relationship for E(y | x). More generally, the shape of the exposure-response relationship will also be affected by measurement errors. For example, for linear-quadratic models, E(y | x) = α + β₁x + β₂x², the mean structure for y | z becomes E(y | z) = α + β₁E(x | z) + β₂E(x² | z). Thus, the linear-quadratic form of relationship is preserved, but the variables in the regression become the conditional moments E(xⁱ | z). Berkson error models illustrate the same properties with the slight simplification that E(x | z) = z and E(x² | z) = z² + V(x | z).
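A short numeric sketch may help fix the roles of the attenuation factor and the conditional moments in the normal case discussed above. The function names and the example values (σ² = 4, ω² = 1, μ = 10) are illustrative assumptions. The conditional variance used here, V(x | z) = cω², is the standard result for the normal-normal model and is stated as an assumption because the text does not write it out.

```python
def attenuation_factor(sigma2, omega2):
    """c = sigma^2 / (sigma^2 + omega^2): the weight placed on the measurement z."""
    return sigma2 / (sigma2 + omega2)


def conditional_moments(z, mu, sigma2, omega2):
    """Return (E(x|z), V(x|z), E(x^2|z)) for x ~ N(mu, sigma2), z|x ~ N(x, omega2)."""
    c = attenuation_factor(sigma2, omega2)
    mean = c * z + (1.0 - c) * mu      # weighted average of z and the overall mean
    var = c * omega2                   # standard normal-normal result (assumed here)
    return mean, var, mean ** 2 + var


def observed_mean_linear(z, alpha, beta, mu, sigma2, omega2):
    """E(y|z) for a linear disease model E(y|x) = alpha + beta * x."""
    mean, _, _ = conditional_moments(z, mu, sigma2, omega2)
    return alpha + beta * mean


def observed_mean_quadratic(z, alpha, beta1, beta2, mu, sigma2, omega2):
    """E(y|z) for a linear-quadratic model E(y|x) = alpha + beta1*x + beta2*x^2."""
    mean, _, second = conditional_moments(z, mu, sigma2, omega2)
    return alpha + beta1 * mean + beta2 * second


if __name__ == "__main__":
    c = attenuation_factor(4.0, 1.0)
    print("attenuation factor c =", c)
    print("moments at z = 14:", conditional_moments(14.0, 10.0, 4.0, 1.0))
    # The slope of the observed linear relationship is beta * c, not beta.
    step = (observed_mean_linear(15.0, 0.0, 0.5, 10.0, 4.0, 1.0)
            - observed_mean_linear(14.0, 0.0, 0.5, 10.0, 4.0, 1.0))
    print("observed slope for beta = 0.5:", step)
```

With these values a true slope of 0.5 would be observed as roughly 0.4, and the quadratic term picks up the extra contribution V(x | z) through E(x² | z).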

Effects on Variance Structures

Consider a response variable y following a statistical model of the form

$$y = \alpha + \beta x + \text{error}. \qquad (3)$$

For a subject with estimated exposure z, Equation 3 can be reexpressed as

$$y = \alpha + \beta E(x \mid z) + \beta[x - E(x \mid z)] + \text{error} = \alpha + \beta E(x \mid z) + \text{error}^{*}, \qquad (3a)$$

where V(error*) = β²V(x | z) + V(error).

Thus, for example, if y | x is a Poisson random variable with conditional mean μ = E(y | x) = α + βx, then the conditional variance of y | z is V(y | z) = μ′ + β²V(x | z), where μ′ = E(y | z) = α + βE(x | z). Thus, y | z takes the form of an overdispersed Poisson random variable, with variance larger than that of a true Poisson, for which V(y) = E(y). Maximum likelihood techniques consist of iteratively reweighted least squares that use weights w = 1/V(y | z). Thus, efficient estimation of the parameters of the disease model requires us to estimate V(x | z) to compute these weights.

Note that, in the important case of binomial proportions, with mean p and denominator n, Equation 3a needs to be modified so that

$$V(\text{error}^{*}) = \frac{n - 1}{n}\,\beta^2 V(x \mid z) + V(\text{error}),$$

where V(error) = p(1 − p)/n. The modification is needed because, in this case, the binomial error in y is not independent of x − E(x | z). The implication is most obvious for binary outcomes (i.e. n = 1), where the factor (n − 1)/n vanishes: measurement error changes the mean, but does not change the variance structure for y.

For disease models that are not linear in x, the calculation of E(y | z) and V(y | z) becomes more complicated, although it is clear that y | z is generally more variable than y | x. In principle, V(y | z) may be calculated from V(y | z) = V[E(y | x) | z] + E[V(y | x) | z]. For models in which E(y | x) is a polynomial function of x, both the mean and variance of y | z involve linear combinations of E(xⁱ | z), and it is again computationally feasible to consider the estimation of the parameters in the mean function E(y | x) as an application of iteratively reweighted least squares regression. More complex methods are required for the estimation of nonlinear disease models in the presence of exposure measurement error, as we describe below.
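The overdispersion of a Poisson outcome given the measured exposure can be illustrated by simulation. The sketch below is written under assumed values (normal x and classical normal error with unit variances, α = 1, β = 2, mean exposure 10), and the helper names `poisson_draw` and `overdispersion_check` are hypothetical. It compares the average squared deviation of y from E(y | z) with the pure Poisson variance and with the prediction that adds β²V(x | z).

```python
import math
import random


def poisson_draw(lam, rng):
    """Draw a Poisson variate by Knuth's multiplication method.

    Adequate for the moderate means used in this illustration.
    """
    limit = math.exp(-lam)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def overdispersion_check(n=40000, alpha=1.0, beta=2.0, mu=10.0,
                         sigma=1.0, omega=1.0, seed=7):
    """Return (observed variance about E(y|z), Poisson variance, predicted variance)."""
    rng = random.Random(seed)
    c = sigma ** 2 / (sigma ** 2 + omega ** 2)
    var_x_given_z = c * omega ** 2          # normal-normal conditional variance
    squared_residuals = 0.0
    mean_total = 0.0
    for _ in range(n):
        x = rng.gauss(mu, sigma)                         # true exposure
        z = x + rng.gauss(0.0, omega)                    # classical measurement error
        y = poisson_draw(max(alpha + beta * x, 1e-9), rng)
        mean_given_z = alpha + beta * (c * z + (1.0 - c) * mu)
        squared_residuals += (y - mean_given_z) ** 2
        mean_total += mean_given_z
    observed = squared_residuals / n
    poisson_only = mean_total / n                        # variance if y|z were Poisson
    predicted = poisson_only + beta ** 2 * var_x_given_z
    return observed, poisson_only, predicted


if __name__ == "__main__":
    observed, poisson_only, predicted = overdispersion_check()
    print(f"observed = {observed:.2f}, Poisson = {poisson_only:.2f}, "
          f"predicted = {predicted:.2f}")
```

In a weighted regression, the reciprocal of the predicted variance, rather than of the Poisson variance alone, would be the natural weight for each observation.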

Induced Associations Between Outcomes, Residual Confounding, and Interactions Between Covariates

One topic of interest in the study of the atomic bomb survivors has been whether the presence of early effects of radiation sickness among the survivors, such as loss of hair, bleeding of the mouth and gums, nausea, and vomiting, is predictive of an inherent radiation sensitivity to both early and late effects of radiation exposure. For example, Neriishi et al (38) found that survivors who reported a history of epilation showed a significant twofold higher risk of leukemia than those without epilation at the same estimated dose, which suggests a radiosensitivity shared by early and late effects. If the true doses were observed for each subject, then this would be a simple inference to draw, but because only estimated doses are available, measurement error bias must be considered. Suppose two outcome variables, y₁ and y₂, each have a linear dose-response and are independent given true exposure. Then, the observed covariance between y₁ and y₂ given estimated dose z is Cov(y₁, y₂ | z) = β₁β₂V(x | z). Thus, measurement error induces an artifactual association between response variables that would be seen to be independent if true exposure x were available.

In a similar way, the joint effects of two risk factors x and w can be distorted by measurement errors in either or both of them. Such distortion can arise either because of the association between x and w or because the degree of measurement error in one variable, say x, depends upon the other variable. In the case of categorical exposures, Greenland (21) showed that misclassification of a confounder can lead to partial loss of control for that variable, e.g. if the crude OR is 4 and the OR adjusted for the true confounder is 2, then the OR adjusted for the measured confounder might be 3. Furthermore, he showed that misclassification of a modifier can mask or introduce spurious effect modification, e.g. if the true OR is 3 in both strata of the modifier, the misclassified data might show ORs of 2 and 4. Such apparent interactions can result from misclassification of either the exposure or the modifying variable. Fung & Howe (19) have examined similar issues in the case where both exposure and confounder were polytomous.

A similar phenomenon was reported for continuous variables by Dwyer et al (13) in a reanalysis of data on coronary heart disease from the Framingham study. By using multivariate adjustment techniques described below, they found that the effect of age (not misclassified) was biased away from the null in the uncorrected analyses as a result of underestimation of the effects of serum cholesterol and blood pressure, which were subject to measurement error. Appropriate adjustment in the multivariate case must allow for the correlation in errors between exposure variables. Suppose instead that, in the study of the atomic bomb survivors, the precision of the radiation dose estimates depended upon age. Then the slope of the dose-response relationship would appear to be modified by age (spurious interaction), and the magnitude of the age effect might be distorted (residual confounding).
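The induced association between two outcomes can be seen in a simple simulation, sketched below under assumed values (normal exposure and classical normal error with unit variances, slopes β₁ = 1 and β₂ = 2). The function names are hypothetical. Two outcomes are generated independently given the true exposure x; their residual covariance is then computed after adjusting for x and, separately, after adjusting for the error-prone z.

```python
import random


def residuals(u, v):
    """Residuals of v after a simple least-squares regression on u."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    slope = (sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
             / sum((a - mean_u) ** 2 for a in u))
    return [b - mean_v - slope * (a - mean_u) for a, b in zip(u, v)]


def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / (n - 1)


def induced_covariance(n=40000, beta1=1.0, beta2=2.0, sigma=1.0, omega=1.0, seed=3):
    """Return (Cov given x, Cov given z, predicted beta1*beta2*V(x|z))."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sigma) for _ in range(n)]
    z = [xi + rng.gauss(0.0, omega) for xi in x]        # classical error
    y1 = [beta1 * xi + rng.gauss(0.0, 1.0) for xi in x]
    y2 = [beta2 * xi + rng.gauss(0.0, 1.0) for xi in x]  # independent of y1 given x
    cov_given_x = covariance(residuals(x, y1), residuals(x, y2))
    cov_given_z = covariance(residuals(z, y1), residuals(z, y2))
    var_x_given_z = sigma ** 2 * omega ** 2 / (sigma ** 2 + omega ** 2)
    return cov_given_x, cov_given_z, beta1 * beta2 * var_x_given_z


if __name__ == "__main__":
    given_x, given_z, predicted = induced_covariance()
    print(f"Cov given x = {given_x:.3f}, Cov given z = {given_z:.3f}, "
          f"predicted = {predicted:.3f}")
```

Adjusting for the true exposure should leave essentially no covariance, whereas adjusting for z leaves a covariance near β₁β₂V(x | z), which in real data could be mistaken for a shared susceptibility.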

METHODS OF CORRECTING FOR MEASUREMENT ERRORS

In this section, we consider a variety of methods that have been suggested for adjusting exposure-response relationships for measurement error and how these methods relate to one another. After briefly reviewing methods for categorical exposures, we contrast several basic approaches for the continuous case:

1. Correction of the parameter estimates from the naive regression of y on z for the anticipated degree of bias.
2. A two-stage approach in which x̂_i = E(x_i | z_i) (and similar quantities as needed) are computed first and then used in a simple regression of y on x̂_i.
3. Structural equations approaches in which the marginal relationships y | z can be fitted directly in certain restricted situations, such as where all variables are normal and linearly related.
4. Full likelihood treatment, integrating over the unknown xs.
5. Nonparametric specification of Pr(x) in approaches 2 and 4.
6. Use of additional information w to build a refined model for Pr(x | w).

Categorical Exposure Variables

Suppose we observe R, where R_jk = Pr(y = k | z = j), and also have an independent estimate of the posterior classification probabilities M, M_ji = Pr(x = i | z = j) defined earlier, and let r denote the true exposure-response probabilities r_ij = Pr(y = j | x = i). Then, the observed exposure-response relationship can be written in terms of the matrix equation R = Mr, and Greenland & Kleinbaum (26) have shown that r can be reconstructed simply as r = M⁻¹R. Thus, the basic idea is to replace the counts in the observed tabulation of y by z by the expectations of the counts in the tabulation of y by x, and then compute ORs or other measures of effect on this table of expected counts. This parallels at a grouped level the approach described below for continuous variables, where E(y | z) is computed for each individual in terms of quantities like E(x | z). Greenland (24) further shows how valid interval estimates on effect measures can be derived by incorporating the uncertainties in the estimates of M.

Now suppose instead that we have reproducibility data on all subjects, i.e. for each subject we have y, z₁, and z₂. A simple technique proposed by Marshall & Graham (35) is to limit the analysis to the subjects for whom z₁ = z₂ and treat that value as x. This only partially eliminates misclassification bias, and there can be a substantial loss of power because of the smaller sample size. Maximum likelihood methods (14, 28) could also be used to fit the observed three-dimensional table to a model involving the OR for the association between x and y, the true exposure prevalence p, and the misclassification probabilities m, assuming that z₁ and z₂ are conditionally independent given x and subject to the same misclassification probabilities. A simple way to fit the same model is via the E-M algorithm (9), by using trial estimates of the model parameters to compute the expectations of the numbers of subjects in the four-way table with x as the fourth dimension, conditional on the observed three-way table, and then treating these as known to estimate the model parameters. By using either fitting method, variance estimates are obtained from the Fisher information.
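The matrix relationship R = Mr and its inversion can be traced through a small worked case. The sketch below uses a binary exposure and, for brevity, tracks only the y = 1 column of the tables (the disease risks); the prevalence, misclassification, and risk values, and all function names, are illustrative assumptions rather than figures from any study.

```python
def posterior_matrix(p, m):
    """M[j][i] = Pr(x = i | z = j) = m[i][j] * p[i] / P[j], P[j] = sum_i m[i][j] * p[i]."""
    k = len(p)
    marginal = [sum(m[i][j] * p[i] for i in range(k)) for j in range(k)]
    return [[m[i][j] * p[i] / marginal[j] for i in range(k)] for j in range(k)]


def observed_risks(M, r):
    """R[j] = sum_i M[j][i] * r[i]: disease risk by measured exposure."""
    return [sum(M[j][i] * r[i] for i in range(len(r))) for j in range(len(M))]


def invert_2x2(A):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]


def odds_ratio(risk_unexposed, risk_exposed):
    """Odds ratio comparing the exposed with the unexposed group."""
    return ((risk_exposed / (1.0 - risk_exposed))
            / (risk_unexposed / (1.0 - risk_unexposed)))


# Assumed inputs: 30% truly exposed; m[i][j] = Pr(z = j | x = i); true risks r.
p = [0.7, 0.3]
m = [[0.9, 0.1],
     [0.2, 0.8]]
r = [0.1, 0.3]

M = posterior_matrix(p, m)
R = observed_risks(M, r)                       # what the study would observe
M_inv = invert_2x2(M)
r_recovered = [sum(M_inv[i][j] * R[j] for j in range(2)) for i in range(2)]

true_or = odds_ratio(r[0], r[1])
observed_or = odds_ratio(R[0], R[1])

if __name__ == "__main__":
    print("observed risks R:", R)
    print("true OR:", true_or, "observed OR:", observed_or)
    print("recovered true risks r:", r_recovered)
```

The observed odds ratio is pulled toward the null relative to the true one, and applying the inverse of M to the observed risks returns the true risks exactly, because M is treated here as known without error.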


Continuous Exposure Variables

CORRECTION OF ATTENUATED REGRESSION COEFFICIENTS The above discussion of the effects of measurement errors on mean structures suggests a simple method of correction. Suppose $E(y|x)$ is linear in $x$ and both $x$ and $z|x$ are normal. Then Equation 2 shows that $E(y|z)$ is also linear in $z$ with expected slope coefficient given by $\beta' = \beta c$, where the attenuation factor $c$ is given by Equation 1a. Thus, an obvious correction method is simply to estimate $\beta'$ by regressing $y$ on $z$ and then estimate $\beta$ as $\hat\beta = \hat\beta'/\hat c$. By the delta method, the variance of $\hat\beta$ is then $V(\hat\beta')/c^2 + (\hat\beta'/c^2)^2 V(\hat c)$.

If both the population exposure variance $\sigma^2$ and the measurement error variance $\omega^2$ were known, $c$ could be simply computed from Equation 1a, but in general $c$ must be estimated from validation or reproducibility studies. For validation substudy data, Rosner et al (50) suggest simply regressing $x$ on $z$, because Equation 1 shows that the coefficient of $z$ in this regression is $c$. In expectation, this is equivalent to estimating $\hat\sigma^2 = V(x)$ and $\hat\omega^2 = V(z|x)$ from the substudy data and computing $\hat c = \hat\sigma^2/(\hat\sigma^2 + \hat\omega^2)$, although this approach appears to produce somewhat smaller variance in $\hat\beta$ than the regression-based approach. Neither approach uses the marginal variance of $z$ in the main study or the relationship between $y$ and $z$ in the estimation of these variances. The structural equations and maximum likelihood approaches described below would therefore be expected to be more efficient.

Note also that the delta-method variance estimate given above assumes that the validation study and main studies are independent. If the validation study is a subset of the main study, the variance estimate can be easily corrected to allow for this dependence, but the full structural equations or maximum likelihood treatment described below would make more efficient use of all the data.

The Rosner et al approach is easily generalized to the multivariate case (48). If $x = (x_1, \ldots, x_p)$ is a vector of true exposures to be related to $y$ and $z$ is a corresponding vector of measured values, one then regresses $x$ on $z$ to obtain a matrix of coefficients $C$ and then estimates $\hat\beta = \hat\beta' C^{-1}$. The regression approach depends upon having direct observations of $x$ available on at least a substudy. If instead two independent and replicate measurements $z_1$ and $z_2$ were available on all subjects, one would estimate $\sigma^2$ from $\mathrm{Cov}(z_1, z_2)$ and $\omega^2$ from $V(z) - \sigma^2$, and then proceed as described above.
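The univariate correction can be illustrated with a minimal simulation sketch. It assumes a classical additive normal error model and an independent validation substudy; the function names (`ols_slope`, `corrected_slope`) and all numerical values are hypothetical choices for illustration, not taken from the studies cited here.

```python
import numpy as np

def ols_slope(u, v):
    """Slope of the least-squares regression of v on u, with its estimated variance."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    uc, vc = u - u.mean(), v - v.mean()
    slope = (uc @ vc) / (uc @ uc)
    resid = vc - slope * uc
    return slope, (resid @ resid) / (len(u) - 2) / (uc @ uc)

def corrected_slope(y_main, z_main, x_val, z_val):
    """Divide the naive slope by the estimated attenuation factor c.

    The variance uses the delta-method formula and assumes the main and
    validation samples are independent.
    """
    b_naive, v_b = ols_slope(z_main, y_main)   # beta' from y on z (main study)
    c_hat, v_c = ols_slope(z_val, x_val)       # c from x on z (validation substudy)
    beta_hat = b_naive / c_hat
    var_beta = v_b / c_hat**2 + (b_naive / c_hat**2) ** 2 * v_c
    return beta_hat, var_beta, b_naive, c_hat

# Simulated data; every numerical value below is hypothetical.
rng = np.random.default_rng(0)
sigma, omega, beta_true = 1.0, 1.0, 2.0        # implies c = 1 / (1 + 1) = 0.5
x = rng.normal(0.0, sigma, 20000)
z = x + rng.normal(0.0, omega, x.size)
y = 1.0 + beta_true * x + rng.normal(0.0, 1.0, x.size)
x_val = rng.normal(0.0, sigma, 2000)           # independent validation sample
z_val = x_val + rng.normal(0.0, omega, x_val.size)

beta_hat, var_beta, b_naive, c_hat = corrected_slope(y, z, x_val, z_val)
print(f"naive {b_naive:.3f}, c {c_hat:.3f}, "
      f"corrected {beta_hat:.3f} (SE {var_beta ** 0.5:.3f})")
```

In this setup the naive slope should be close to half the true slope, and the corrected estimate should recover it with a larger standard error, reflecting the added uncertainty in the estimate of $c$.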






REPLACEMENT OF Z BY E(x|z) In linear/normal models, Rosner et al's approach is formally equivalent to regressing $y$ on $\hat x = E(x|z) = c_0 + cz$, where $c$ is obtained from the regression of $x$ on $z$ in the substudy data. The beauty of this approach is that it is easily generalized to more complex models. For example, for polynomial disease models, one simply replaces $x$ by $E(x|z)$, $x^2$ by $E(x^2|z)$, etc., where the various expectations can either be computed analytically from knowledge of the underlying distributions of $x$ and $z|x$ or estimated empirically from regressions of $x^n$ on $z$. In the multivariate case, the corresponding generalization of Equation 1 is

$$E(x|z) = \mu + \Sigma(\Sigma + \Omega)^{-1}(z - \mu) \tag{4}$$

where $\Omega$ is the covariance matrix of the measurement errors, and $\Sigma$ is the covariance matrix of the population distribution of true exposures.

Of course, even for the simplest linear models, efficient estimation requires a weighted regression analysis, so that $V(y|z)$, which involves $E(x^2|z)$ as well as $E(x|z)$, is also needed to provide the regression weights. However, there are several special cases where the calculation of $E(x|z)$ is all that is needed to perform an efficient analysis: the standard additive error model where $y$, $x$, and $z$ are jointly multivariate normal; a binary outcome $y$ with a mean that is linear in $x$; and survival pairs $(t, d)$ where the hazard rate model is linear in $x$ (40, 43). For other types of models, the key step in the correction for measurement error is the calculation of $E(y|z)$ and $V(y|z)$. In general, this is complicated because the functional form of $E(y|z)$ and $V(y|z)$ is inherently nonlinear. For example, in an exponential model, one would need $e_n = E[\exp(\beta x_n)|z_n]$, which is a function of the current estimate of $\beta$, as well as $\mu$ and $\sigma$.
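As a minimal sketch of the replacement step, the following assumes the classical additive model with $x \sim N(\mu, \Sigma)$ and $z|x \sim N(x, \Omega)$, for which the standard normal-theory results are $E(x|z) = \mu + \Sigma(\Sigma + \Omega)^{-1}(z - \mu)$ and $\mathrm{Cov}(x|z) = \Sigma - \Sigma(\Sigma + \Omega)^{-1}\Sigma$. The function names and numerical inputs are hypothetical.

```python
import numpy as np

def calibrate(z, mu, Sigma, Omega):
    """Return E(x|z) for each row of z and the common Cov(x|z).

    Assumes x ~ N(mu, Sigma) and z | x ~ N(x, Omega) (classical additive error).
    """
    z = np.atleast_2d(np.asarray(z, dtype=float))
    mu = np.asarray(mu, dtype=float)
    Sigma = np.asarray(Sigma, dtype=float)
    Omega = np.asarray(Omega, dtype=float)
    C = Sigma @ np.linalg.inv(Sigma + Omega)   # matrix analogue of the attenuation factor
    mean = mu + (z - mu) @ C.T                 # row-wise E(x|z)
    cov = Sigma - C @ Sigma                    # Cov(x|z), the same for every subject
    return mean, cov

def second_moments(mean, cov):
    """E(x_j^2 | z) = E(x_j|z)^2 + V(x_j|z), as needed for a quadratic disease model."""
    return mean**2 + np.diag(cov)

# Univariate check with hypothetical values: sigma^2 = 4, omega^2 = 1, so c = 0.8.
mean1, cov1 = calibrate([15.0], [10.0], [[4.0]], [[1.0]])

# Bivariate illustration with hypothetical covariance matrices.
mean2, cov2 = calibrate([[2.0, 1.0], [0.0, 3.0]], [1.0, 1.0],
                        [[1.0, 0.5], [0.5, 1.0]], [[0.5, 0.0], [0.0, 2.0]])
ex2 = second_moments(mean2, cov2)
```

The calibrated values `mean2` (and `ex2` for squared terms) would then stand in for the unobserved exposures in the disease regression.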

STRUCTURAL EQUATIONS APPROACHES The approaches described above are somewhat ad hoc in that different pieces of information are used in separate estimations of the different parameters. Structural equations attempt to fit the model in a single stage, by estimating all the parameters simultaneously using all relevant information. The basic idea is to compute the "marginal model" that describes the joint distribution of all the observable variables as a function of the model parameters, after integrating over the unknown $x$s. This is generally possible only for certain restricted classes of models, the most important of which is where all variables are normally distributed and linearly related to each other. To illustrate the approach, consider again the situation in which two measurements $z_1$ and $z_2$, together with $y$, are available on all subjects. Suppose the conditional relationships were given by the general set of equations

$$y = \alpha + \beta x + \varepsilon \tag{5}$$

$$z_j = \gamma_j + \lambda_j x + \eta_j \quad \text{for } j = 1, 2. \tag{6}$$

The classical error model $E(z|x) = x$ discussed above is a special case of this model where $\gamma_j = 0$, $\lambda_j = 1$, $E(\eta_j) = 0$, $\mathrm{Cov}(x, \eta_j) = 0$, and $\mathrm{Cov}(x, \varepsilon) = 0$. Obviously, some constraints would be needed to identify the parameters in the general model 5 and 6 uniquely. One such set of assumptions is that all covariances involving the measurement errors $\eta_j$ are zero, and that $\lambda_1 = \lambda_2 = 1$. By including $y$ in the model, one can show that the assumption $\lambda_2 = 1$ can be relaxed. Assumptions can be further relaxed in the case where more than two measurements $z$ or more than one $x$ are available, thus allowing estimation of covariances among measurement errors (12). With these constraints, the corresponding marginal model would then be given by

$$V(z_j) = \lambda_j^2 V(x) + V(\eta_j)$$
$$\mathrm{Cov}(z_1, z_2) = \lambda_1 \lambda_2 V(x)$$
$$V(y) = \beta^2 V(x) + V(\varepsilon)$$
$$\mathrm{Cov}(y, z_j) = \beta \lambda_j V(x).$$

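The model is then fitted by finding the values of the parameters for which the predicted covariance matrix $\hat\Sigma$ of the observed data is closest to the observed covariance matrix $S$. This is generally done by maximizing the multivariate normal likelihood, which is equivalent to minimizing the discrepancy function

$$L = \ln[\det(\hat\Sigma)] - \ln[\det(S)] + \mathrm{tr}(S\hat\Sigma^{-1}) - p \tag{7}$$

where $p$ is the number of observed variables.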
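The fit function of Equation 7 can be sketched for the two-replicate model with $\lambda_1 = \lambda_2 = 1$ and uncorrelated errors. This is an illustration only: instead of a full numerical optimization it plugs in simple moment-matching values, which are natural starting values for a maximum likelihood fit, and compares the resulting discrepancy with that obtained when the naive attenuated slope is used. The function names and simulated values are hypothetical.

```python
import numpy as np

def implied_cov(v_x, v_eta1, v_eta2, beta, v_eps):
    """Model-implied covariance matrix of (z1, z2, y) with lambda_1 = lambda_2 = 1."""
    return np.array([
        [v_x + v_eta1, v_x,          beta * v_x],
        [v_x,          v_x + v_eta2, beta * v_x],
        [beta * v_x,   beta * v_x,   beta**2 * v_x + v_eps],
    ])

def fit_function(Sigma_hat, S):
    """Discrepancy ln det(Sigma_hat) - ln det(S) + tr(S Sigma_hat^{-1}) - p."""
    p = S.shape[0]
    _, logdet_model = np.linalg.slogdet(Sigma_hat)
    _, logdet_obs = np.linalg.slogdet(S)
    return logdet_model - logdet_obs + np.trace(S @ np.linalg.inv(Sigma_hat)) - p

# Simulated data; all parameter values are hypothetical.
rng = np.random.default_rng(0)
n = 5000
x = rng.normal(0.0, 1.0, n)
z1 = x + rng.normal(0.0, 1.0, n)
z2 = x + rng.normal(0.0, 1.0, n)
y = 0.5 * x + rng.normal(0.0, 1.0, n)
S = np.cov(np.vstack([z1, z2, y]))             # observed 3 x 3 covariance matrix

# Moment-matching values derived from the marginal model equations.
v_x = S[0, 1]                                  # Cov(z1, z2) = V(x)
beta = 0.5 * (S[0, 2] + S[1, 2]) / v_x         # Cov(y, z_j) = beta V(x)
theta = (v_x, S[0, 0] - v_x, S[1, 1] - v_x, beta, S[2, 2] - beta**2 * v_x)
f_moment = fit_function(implied_cov(*theta), S)

# Same model, but forcing beta to equal the naive slope of y on z1.
naive = S[0, 2] / S[0, 0]
theta_naive = (v_x, S[0, 0] - v_x, S[1, 1] - v_x, naive, S[2, 2] - naive**2 * v_x)
f_naive = fit_function(implied_cov(*theta_naive), S)

print(f"beta (moments) {beta:.3f}, F {f_moment:.5f}; "
      f"beta (naive) {naive:.3f}, F {f_naive:.5f}")
```

The discrepancy is zero when the implied matrix reproduces $S$ exactly and grows as the implied covariances depart from the observed ones, so the naive slope should give a visibly worse fit here. A general-purpose optimizer applied to `fit_function` would yield the ML estimates.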

When different subsets of variables are observed in different samples, it is desirable to estimate separately those parameters that are identifiable in each sample. The covariance matrices from each sample can then be pooled in the ML estimation, thus constraining parameters to be equal across the multiple samples. For example, if $x$ and $z$ are measured in a validation sample and $y$ and $z$ in the main sample, all the covariance information from both the validation and main samples can be used for estimation. Fitting these multiple population models can be accomplished with available software in the continuous (31) and categorical (36) cases.

FULL LIKELIHOOD APPROACHES More generally, estimation approaches may be based on maximizing a likelihood obtained by integrating out the unknown $x$s,

$$L(\beta) = \prod_{n=1}^{N} \int L(y_n|x_n; \beta)\, \Pr(z_n|x_n)\, \Pr(x_n)\, dx_n \tag{8}$$

where $L(y|x)$ is the usual likelihood contribution if the $x$s were known. This expression is maximized by setting the score statistic, $S_z(\beta) = \partial \ln L/\partial\beta$, to zero and solving for $\beta$. The score statistic generally has the form $S_z(\beta) = E_{x|y,z}[S_x(\beta)]$, where $S_x(\beta)$ is the score statistic for $\beta$ if $x$ were available. A similar expression for the observed information $I_z(\beta)$ is provided by Louis (34). In principle, the likelihood and score statistic can be extended to include data from validation studies, so that parameters in the measurement error model and the distribution of true exposure can be simultaneously estimated together with $\beta$.

SPECIFICATION OF THE DISTRIBUTION OF TRUE EXPOSURE The most difficult part of the specification of a measurement error model is the distribution of true exposures in the population. Because $x$ is not observed, it is difficult to know which form to choose for $\Pr(x)$. A practical approach is to inspect the distribution of $z$ and choose a parametric form that appears appropriate. Once a measurement model $\Pr(z|x)$ is fixed, it is quite possible to consider maximum likelihood estimation of parameters in a model for $\Pr(x)$ on the basis of the $z$ data alone. Alternatively, the estimation of parameters in the distribution of $x$ may be done simultaneously with that of $\beta$ from the likelihood (8). The calculation of $E(x^n|z)$ can be done by numerical integration once models for both $\Pr(x)$ and $\Pr(z|x)$ are selected.

An alternative to estimating the parameters in a fixed distributional form for $\Pr(x)$ is a nonparametric approach that uses density estimation methods. One assumes that the distribution of $x$ is concentrated at a finite number $M$ of support points with corresponding masses, where $M < N$, the number of subjects. Methods for finding the maximum likelihood value of $M$ and the positions and masses of the support points have been provided (5, 10, 32, 39, 54). Although the nonparametric approaches to estimating $\Pr(x)$ are potentially of great interest, it is not known whether they lead to consistent estimates of $\beta$, because $M$ generally increases with $N$, thereby apparently violating the basic asymptotic conditions for the validity of maximum likelihood. The functional approach, where $\beta$ is estimated by maximizing the likelihood with respect to each $x$ rather than integrating over a prior distribution, gives inconsistent estimates in many measurement error problems (18).
The obvious solution is to fix the number of potential support points to be substantially smaller than $N$. Simulation studies (54) have shown that the bias then declines with increasing $N$, holding $M$ fixed. This approach was used in the analysis of dose-response relations in the Utah fallout studies (54). In a case-control study of leukemia, doses were assigned based on residential histories abstracted from Mormon church records, because it was not possible to contact the individuals for more detailed exposure information. The calculated uncertainties on the dose estimates allowed only for variation in deposition rates and incompleteness in some individuals' residence histories. The mean uncertainty was a geometric standard deviation (GSD) of about 1.5 with relatively little variation between individuals. Correction for these errors increased the slope estimate by 12%, but also increased the standard error by 53%, slightly decreasing the significance of the dose-response relation. In the thyroid cohort study, a much more complex dose assignment algorithm was used, and the uncertainty analysis was more comprehensive. The mean uncertainty was a GSD of about 2.7 and varied more between individuals. Correction for these errors increased the slope estimate for thyroid neoplasms about 3-fold and the standard error about 4.6-fold.

MODELING EXPOSURE So far, we have considered the situation in which the information available on $x$ is one or more flawed measurements $z$. Another important class of information relates to determinants of true exposure, $w$, which we wish to consider in the exposure model $\Pr(x|w)$. A recent example comes from a study of childhood leukemia and residential electromagnetic fields (EMF) (33). Here, $x$ is some summary of long-term EMF exposure, $z$ is a short-term measurement of EMF, and $w$ represents various characteristics of the electrical wiring in the neighborhood that is the main source of the magnetic fields. A reasonable modeling strategy is to use the marginal model for $z|w$ to estimate the parameters in $E(x|w)$, and then use these expected values in fitting the disease model. For example, Jiang (30) assumed $x_n \sim N(w_n'\gamma, \sigma^2)$ and $z_n \sim N(x_n, \omega^2)$ (where $x$ and $z$ are now on a logarithmic scale) and fitted the regression model $z_n \sim N(w_n'\gamma, \sigma^2 + \omega^2)$. For exposure-response modeling, the expectation of the true dose is then simply $E(x_n|w_n) = w_n'\hat\gamma$. In this approach, the measurements $z$ are used only to develop a prediction model and not to fit the disease model. This is clearly not fully efficient, but one could use $E(x|z, w)$ instead. Thus, in the EMF example,

[Equation 9]

where $V_n = V(w_n'\hat\gamma)$, which is the obvious weighted average of the predictions and the measurements. In Jiang's analysis, a considerably better prediction of $z$ was obtained in the regression on $w$ than for the Wertheimer-Leeper (WL) wiring code that had previously been associated with leukemia risk. However, neither $E(x|w)$ nor $E(x|z, w)$ was associated with leukemia, which suggests that measurement error alone could not explain the lack of association between leukemia and measured EMF. Alternative explanations for the WL-leukemia association, including selection bias or causal effects of some other aspect of EMF than mean intensity, are being explored.

This process is essentially what is done in most occupational mortality studies in which $w$ represents job titles and $z$ area measurements. The means of $z$ by $w$ are computed to develop a "job-exposure matrix," which is then used to compute predicted exposure histories $x_n(t)$ for each of the study subjects from their job histories $w_n(t)$. These predicted exposure histories are then treated as the true exposure histories in standard epidemiologic analyses. This process provides an unbiased estimate of the true exposure-response relationship if the assumptions of a linear disease model and a Berkson error structure are true.

In a study of the health effects of air pollution that is currently under way at the University of Southern California (USC) (37), we are using similar approaches to combine area monitoring data on community pollution levels with personal activity data on all subjects and with personal dosimetry and microenvironmental sampling on a subset of subjects. The personal dosimetry, microenvironmental sampling, and activity data are used to develop a model for deviations between the area mean and personal exposures, which can then be applied to all subjects. The analysis of exposure-response relationships involves a combination of individual and ecologic regression analyses, as described below.

Examples

ATOMIC BOMB SURVIVORS An illustration of the techniques under discussion here is provided by recent work on the dosimetry error problem in the cohort study of the Japanese atomic bomb survivors (41). The DS86 dosimetry system (47), which produces dose estimates for approximately 100,000 survivors in Hiroshima and Nagasaki, takes as its input each survivor's location and shielding factors, such as the type of structure they were in and their orientation at the time of the bombing. These input data essentially define a "mathematical box," which contains the survivor, and the distribution of radiation within this box is analyzed on the basis of physical principles. The dose estimates that are assigned are the averages of dose $E(x|\text{input data})$ within each box.
Because the primary interest in the analysis of cancer risk is on survival models that are linear or linear-quadratic in $x$, measurement error correction for these models is based on the calculation of $E(x|z)$ and $E(x^2|z)$ for use in the survival analysis. To approximate $E(x|z)$ in the cohort, Pierce et al (41) assumed a Weibull distribution of true dose $\Pr(x)$ and a lognormal measurement error model $\Pr(z|x)$ with logarithmic standard deviation $\sigma$. They based their choice of true dose distribution principally upon the shape of the distribution of observed dose $\Pr(z)$ for the survivors, which is well described by a Weibull shape. The two parameters of the Weibull for $\Pr(x)$ were estimated by a simple deconvolution using the $z$ data, based on a specific choice of $\sigma$ in the measurement error model. The resulting Weibull density for $x$ has a shape that drops off very quickly with increasing dose, which is in conformity both with the geometry of the radiation fields produced by the blasts and with the heavy mortality near the hypocenters. The multiplicative nature of the lognormal error model assumes that errors in doses increase in absolute magnitude with $x$.

Pierce et al investigated the effects of measurement error on estimated cancer risk for a range of values of $\sigma$ from 0.3 to 0.4, corresponding to a coefficient of variation $\sqrt{V(z|x)}/x$ of roughly 30-40%. In their preferred 35% error model, adjustment for random dosimetry error increased estimates of dose-response slopes for cancer by approximately 7-13%.

Three sources of information underlie this specific choice of measurement error variance. The first is the error analysis provided as a part of the dosimetry system itself, which can be regarded as an estimate of the variance of the dose caused by uncertainties in the physical model; in general, this component of variability is less than the 35% error model specifies. A second component is inaccuracies in the input data, which were only collected by interview five to ten years after the bombing. Estimates of these errors come from analyses of subsets of survivors who were interviewed more than once concerning their location at the time of the bombings (29). The third source of information concerning the variability of estimated dose comes from the analysis of biological outcome data itself. Data on radiation-induced stable chromosome aberrations have been collected on more than 1000 survivors. Typically, 100 cells per survivor are examined, and the fraction of cells showing stable chromosome aberrations is counted. As noted above, the effect of measurement error will be to introduce over-dispersion in such binomial outcomes. The amount of dispersion observed in the chromosome data was used by Sposto et al (52) to provide an estimate of the variance of estimated dose and an analysis of induced association between reports of early effects of radiation, specifically severe epilation, and stable chromosome aberrations.

They noted a significant twofold difference in the apparent dose-response for the induction of chromosome aberrations between those survivors who reported severe epilation and those who did not. As noted above, such correlation between multiple endpoints can be produced by dosimetry error alone, even if the actual dose-response for the two groups does not differ. Sposto et al provided a second estimate of the dosimetry measurement error variance, based on the assumption that the observed difference in dose response is caused by measurement error alone. It appeared that at least a 40% lognormal error model would be required to explain both the observed dispersion in the chromosome aberration data and the difference in slope between groups, if there were in fact no heterogeneity in radiosensitivity.
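The calculation of $E(x|z)$ and $E(x^2|z)$ under a Weibull distribution of true dose and a lognormal error model can be sketched by numerical integration on a grid. This is only an illustration of the general idea: the Weibull shape and scale, the value of $\sigma$, and the observed dose below are hypothetical and are not the values used in the atomic bomb survivor analyses, and the function name `posterior_moments` is an assumption.

```python
import numpy as np

def posterior_moments(z, shape, scale, sigma, n_grid=40001):
    """E(x|z) and E(x^2|z) for a Weibull prior on x and lognormal error in z.

    Assumes log z | x ~ N(log x, sigma^2). Integration uses a uniform grid on
    (z/1000, 10 z), which is adequate for moderate sigma (about 0.3 to 0.4);
    a wider grid would be needed for larger sigma.
    """
    x = np.linspace(z / 1000.0, 10.0 * z, n_grid)
    # Weibull density of true dose
    prior = (shape / scale) * (x / scale) ** (shape - 1) * np.exp(-(x / scale) ** shape)
    # Lognormal density of the observed dose given true dose
    lik = np.exp(-0.5 * ((np.log(z) - np.log(x)) / sigma) ** 2) / (
        z * sigma * np.sqrt(2.0 * np.pi))
    w = prior * lik
    w /= w.sum()                     # uniform spacing cancels in the normalization
    return (w * x).sum(), (w * x * x).sum()

# Hypothetical inputs: a steeply declining dose distribution and 35% error.
z_obs = 4.0
m1, m2 = posterior_moments(z_obs, shape=0.5, scale=0.5, sigma=0.35)
print(f"z = {z_obs}, E(x|z) = {m1:.3f}, E(x^2|z) = {m2:.3f}")
```

Because the assumed density of true dose falls off with increasing dose, a high observed dose is more likely to be an overestimate than an underestimate, so $E(x|z)$ falls below $z$ in the upper dose range.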

DIET Armstrong et al (2) discussed adjustment for measurement errors in multiple exposure variables by using Equation 4 and illustrated their approach on a study of colon cancer in relation to calories, protein, and fat. Estimates of $\Omega$ and $\Sigma$ came from a multivariate analysis of variance of data from a validation substudy (17). In their analysis, multivariate adjustment for measurement errors had a substantially larger effect on the estimated coefficients than did adjustment for each effect separately in a univariate model. Their analysis also allowed for the possibility that controls might differ by a mean vector $\delta$ because of recall bias, which could also be estimated from a validation substudy of cases and controls. In the univariate analyses, adjustment for recall bias reduced the coefficients for all exposure variables; in the multivariate analyses, however, this adjustment increased the coefficient for fat and calories slightly, mainly as a result of the much larger effect on the negative coefficient for protein.

We illustrate the structural equations approach by using data from the Honolulu Heart Program (HHP). An average of seven food records are available for a subsample of 329 men from the larger HHP cohort. A single 24-hour dietary recall is available on all men. The current analysis uses only the data from the 329 men in the validation sample (D. Reed 1992, personal communication). A structural model of the form of Equations 5 and 6 for relating body mass index (BMI) to total caloric intake (Cal) and calories from fat (Fat) can be constructed as follows:

$$\text{BMI} = \alpha + \beta_1 x_1 + \beta_2 x_2 + \varepsilon \tag{10}$$

$$\begin{aligned}
z_1 = \text{Fat24hr} &= \gamma_1 + \lambda_1 x_1 + \eta_1 \\
z_2 = \text{Fat7day} &= \gamma_2 + \lambda_2 x_1 + \eta_2 \\
z_3 = \text{Cal24hr} &= \gamma_3 + \lambda_3 x_2 + \eta_3 \\
z_4 = \text{Cal7day} &= \gamma_4 + \lambda_4 x_2 + \eta_4.
\end{aligned} \tag{11}$$

With $\lambda_j = 1$ for all $j$ and all covariances zero, the fit of the model was poor ($\chi^2 = 303$, df = 5), but when $\mathrm{Cov}(\eta_1, \eta_3)$ and $\mathrm{Cov}(\eta_2, \eta_4)$ were also estimated, the fit was much improved ($\chi^2 = 2.9$, df = 3; p > .05). The resulting ML estimates were $\beta_1 = 0.518$ (SE 0.171) and $\beta_2 = .121$ (SE 0.036), where the units of the slopes are kg/1000 kcal. In contrast, regression of BMI on $z_1$ and $z_3$ gave $\beta_1 = 0.267$ (0.074) and $\beta_2 = -.0120$ (0.037), which suggests that a substantial attenuation of slopes occurs when using the 24-hour recall measures. The two-stage estimation procedure gave $\beta_1 = 0.889$ and $\beta_2 = -0.279$. The ML analysis also gave estimates of $V(x_k)$ and $V(\eta_j)$ from which the correlations between $z$ and $x$ can be computed. Over 75% of the variance in a single 24-hour recall of fat was estimated to be error, whereas for the average of seven daily records that error was reduced to about 50%.

PRACTICAL IMPLICATIONS

Design Issues

OPTIMAL SAMPLE SIZE ALLOCATION Freedman et al (15) have discussed the impact of measurement error on sample size needs for cohort studies. For example, if the correlation between $x$ and $z$ is 0.65, they found that the sample size would have to be six to eight times larger than if there were no measurement errors.

Traditionally, inference on exposure-response relationships has been based on the naive regression on measured exposures, supplemented with a qualitative discussion of measurement error issues based on separate validation or reproducibility studies. The modern approach advocated here would shift the primary emphasis to error-corrected exposure-response relationships. Because this relies on estimates of measurement error distributions, there is a corresponding shift in emphasis onto validation and reliability studies, even at the expense of resources available for the main study. This naturally raises the problem of optimizing the design of a single study that tries to accomplish both aims within a limited budget.

Greenland (23) and Spiegelman & Gray (51) considered the problem of minimizing the variance of an adjusted slope estimate, subject to a constraint on the total cost, the former in the case of a binary exposure variable, the latter for a normally distributed exposure. Both concluded that the optimal design depends in a complex fashion on a number of parameters, but found plausible scenarios under which the most efficient design was a smaller "fully validated" study. Explicit formulae for determining the optimal design can be found in their respective papers. Simulation approaches are being applied in the USC air pollution study to decide upon the most efficient trade-off between area monitoring and personal dosimetry (37).

Rosner & Willett (49) considered the efficient design of studies aimed at estimating correlation coefficients corrected for measurement error estimates derived from reproducibility studies. This involves a similar trade-off between the number of subjects and the number of replicate measurements per subject. They concluded that to minimize the standard error of the corrected correlation, it was generally sufficient to have no more than five replicates per subject, and if the true correlation were large, two replicates would suffice.

ECOLOGIC VERSUS ANALYTIC STUDIES Analytic studies have traditionally been viewed as providing a firmer basis for causal inference than ecologic studies, largely because of concerns about the "ecologic fallacy." Recently, several authors (16, 25, 42, 44) have reexamined this conventional wisdom in the light of measurement error considerations. There are numerous examples in the epidemiologic literature of discrepancies between the inferences based on the two types of designs. In the area of diet and cancer, for example, Prentice & Sheppard (45) have contrasted the strong international correlations between cancer rates and fat consumption rates with the weak relations reported in most case-control and cohort studies. The former is potentially confounded by numerous risk factors that are either unrecognized or not easily measured at the aggregate level but, because a Berkson error structure applies, may be less affected by measurement error. The analytic studies are more easily controlled for confounding, but are limited by the restricted range of diets within countries and racial groups and by the difficulties in measuring individuals' usual diets. Similar issues have been discussed in the context of air pollution effects (37). Because air pollution levels are geographically determined, it is very difficult to assess exposure-response relations at an individual level, and most studies have relied on between-communities regressions of average health outcomes (e.g. symptom prevalence, mean lung function changes) on average pollution levels.

A multicenter analytic (cohort or case-control) study would appear to incorporate the best features of both designs. By including multiple centers, the problem of restricted variability in exposure can be overcome. By controlling confounders at an individual level, the major concern with the ecologic fallacy can be overcome, at least to the extent that the relevant confounders can be identified and measured. By measuring exposures at an individual level, one can often obtain a better estimate of the relevant population mean exposures for ecologic comparisons than would be possible by using routine data sources. Finally, by analyzing exposure-response relations at the ecologic level, one can overcome the attenuation bias from measurement error, provided a Berkson error structure holds.

To realize all these advantages of the hybrid design, an appropriate analysis must be done to combine the individual and ecologic comparisons. The general approach can be illustrated by using the Harvard Six Cities study of air pollution (56). Letting subscript $m$ denote the centers and $n$ the individuals, we might write the general model as

$$y_{mn} = \alpha + \beta x_m + \gamma' v_{mn} + \delta_m + \varepsilon_{mn}$$

where $x_m$ denotes the mean exposure in center $m$ (from area monitoring), $v_{mn}$ denotes the confounding variables assessed at the individual level, and $\delta_m$ and $\varepsilon_{mn}$ are random error terms for unmeasured center and individual effects. To fit the model, a two-stage method was used. In the first stage, an individual-level regression was done, omitting the $x_m$ but including a set of indicator variables for center to estimate the $\delta_m$. In the second stage, these estimated center residuals $\hat\delta_m$ were then regressed on the $x_m$, by using a weighted least-squares analysis that incorporated their estimated variances. This general approach could be extended to incorporate individual-level exposure data with corrections for measurement error. In particular, one might wish to test the hypothesis that the error-corrected regressions on the individual and area exposure data have the same slope. Such approaches are being developed for the USC air pollution study to incorporate the personal dosimetry and microenvironmental sampling data (37) and for the European collaborative study of diet and cancer (42).
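The two-stage fit described above for the multicenter model can be sketched on simulated data. This is a simplified illustration, not the analysis used in the Six Cities study: the numbers of centers and subjects and all coefficients are hypothetical, there is a single confounder, and the second-stage weights use only the first-stage sampling variances (a fuller treatment would also allow for the between-center variance of $\delta_m$).

```python
import numpy as np

# Simulated multicenter data; every numerical value is hypothetical.
rng = np.random.default_rng(1)
M, n_per = 30, 200                              # centers, subjects per center
alpha, beta, gamma = 1.0, 0.5, 0.3              # true coefficients
x_m = rng.uniform(0.0, 10.0, M)                 # center mean exposure (area monitoring)
delta = rng.normal(0.0, 0.2, M)                 # unmeasured center effects
center = np.repeat(np.arange(M), n_per)
v = rng.normal(0.0, 1.0, M * n_per)             # individual-level confounder
y = (alpha + beta * x_m[center] + gamma * v + delta[center]
     + rng.normal(0.0, 1.0, M * n_per))

# Stage 1: individual-level regression on center indicators and the confounder,
# omitting x_m. The indicator coefficients estimate alpha + beta x_m + delta_m.
D = (center[:, None] == np.arange(M)).astype(float)
X1 = np.column_stack([D, v])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
a_hat, gamma_hat = coef[:M], coef[M]
resid = y - X1 @ coef
s2 = resid @ resid / (len(y) - X1.shape[1])
var_a = s2 * np.diag(np.linalg.inv(X1.T @ X1))[:M]   # variances of the center effects

# Stage 2: weighted least squares of the center effects on x_m,
# with weights equal to the inverse of their estimated variances.
w = 1.0 / var_a
X2 = np.column_stack([np.ones(M), x_m])
XtW = X2.T * w                                  # scales column m of X2.T by w[m]
alpha_hat, beta_hat = np.linalg.solve(XtW @ X2, XtW @ a_hat)

print(f"gamma_hat {gamma_hat:.3f}, beta_hat {beta_hat:.3f}")
```

The confounder is controlled at the individual level in the first stage, while the exposure slope is identified only from the between-center comparison in the second stage.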

Implications for Regulatory Policy

Although epidemiologists have recognized the impact of measurement error on exposure-response relationships, this has not usually carried over into regulatory policy. Government agencies routinely use upper confidence limits on slope coefficients to be health conservative, but the uncorrected slope estimate is always used. This may not, therefore, be conservative. Before concluding that most risk assessments that have ignored measurement errors are inadequate, however, two qualifications must be emphasized. First, many exposure-response relationships are derived from studies in which a Berkson error structure applies (e.g. occupational mortality studies using job-exposure matrices) and thus may not be seriously distorted by measurement error. Second, the attenuation bias applies to the estimation of relations with true exposure, but the agency may not be able to regulate the true exposure. At issue is whether one wishes to predict the change in risk that would result from a change in true exposure or from a change in measured exposure. Because the reliability of measurements changes from one setting to another, and because most regulatory policies aim to control average exposure levels rather than individually measured exposures, standards should in most instances be based on error-corrected exposure-response relationships. However, an exception might be when the standard is based on individual exposures that are subject to the same errors as in the epidemiologic data base.

RESEARCH NEEDS

Adjustment for measurement errors has become an active research area only recently. The techniques are still in their infancy and have seldom been applied in routine epidemiologic studies. Nevertheless, the area appears to be very promising, and hopefully the availability of these techniques will encourage epidemiologists to devote greater efforts to quantifying their measurement errors and correcting for them. A high priority for future research is the development of practical procedures and their incorporation into widely available statistical software (46).

On the theoretical level, techniques appear to be much better developed at the individual level than at the ecologic level. Given the considerations discussed above that favor greater use of ecologic studies and exposure models, the development of the approaches for combining individual and aggregate level analyses and incorporating instrumental variables should be a priority. A particularly challenging problem in this context is the case of correlated errors between individuals within the same group.

Finally, only a few papers have addressed the important design issues of optimizing the various trade-offs, such as between validation and main study sample sizes, between numbers of individuals and numbers of replicate measurements in reproducibility studies, and between numbers of individuals and numbers of centers in multicenter studies. Practical methods for sample size and power determination in studies where measurement error correction methods are planned would also be very helpful.

Literature Cited

1. Armstrong, B. G. 1990. The effects of measurement errors on relative risk regressions. Am. J. Epidemiol. 132:1176-84
2. Armstrong, D. B., Whittemore, A. S., Howe, G. R. 1989. Analysis of case-control data with covariate measurement error: application to diet and colon cancer. Stat. Med. 8:1151-66
3. Berkson, J. 1950. Are there two regressions? J. Am. Stat. Assoc. 45:164-180
4. Brenner, H., Savitz, D. A., Jockel, K.-H., Greenland, S. 1992. Effects of nondifferential exposure misclassification in ecologic studies. Am. J. Epidemiol. 135:85-95
5. Carroll, R. J., Wand, M. P. 1991. Semiparametric estimation in logistic measurement error models. J. R. Stat. Soc. Ser. B 53(3):573-85
6. Clayton, D. G. 1991. Models for the analysis of cohort and case-control studies with inaccurately measured exposures. In Statistical Models for Longitudinal Studies of Health, ed. J. H. Dwyer, P. Lippert, M. Feinleib, H. Hoffmeister, pp. 301-31. Oxford: Oxford Univ. Press
7. Cochran, W. C. 1968. Errors of measurement in statistics. Technometrics 10:637-66
8. Copeland, K. T., Checkoway, H., McMichael, A. J., Holbrook, R. H. 1977. Bias due to misclassification in the estimation of relative risk. Am. J. Epidemiol. 105:488-95
9. Dempster, A. P., Laird, N. M., Rubin, D. B. 1977. Maximum likelihood from incomplete data via the EM algorithm. J. R. Stat. Soc. Ser. B 39:1-38
10. DerSimonian, R. 1986. Algorithm 221: maximum likelihood estimation of a mixing distribution. Appl. Stat. 35:302-9
11. Dosemici, M., Wacholder, S., Lubin, J. H. 1990. Does nondifferential misclassification always bias a true effect toward the null value? Am. J. Epidemiol. 132:746-48
12. Dwyer, J. H. 1983. Statistical Models. New York: Oxford Univ. Press
13. Dwyer, J. H., Li, L., Curtin, R., Feinleib, M. 1992. Correction for nonsampling measurement error in risk factors for CHD in women from Framingham. Circulation 85:879
14. Elton, R. A., Duffy, S. W. 1983. Correcting for the effect of misclassification bias in a case-control study using data from two different questionnaires. Biometrics 39:659-65
15. Freedman, L. S., Schatzkin, A., Wax, Y. 1990. The impact of dietary measurement error on planning sample size required in a cohort study. Am. J. Epidemiol. 118:1185-95
16. Freudenheim, J. L., Marshall, J. R. 1988. The problem of profound mismeasurement and the power of epidemiological studies of diet and cancer. Nutr. Cancer 11:243-50
17. Friedenreich, C. M., Howe, G. R., Miller, A. B. 1991. An investigation of recall bias in the reporting of past food intake among breast cancer cases and controls. Ann. Epidemiol. 1:439-53
18. Fuller, W. A. 1987. Measurement Error Models. New York: Wiley
19. Fung, K. Y., Howe, G. R. 1984. Methodological issues in case-control studies III: the effect of joint misclassification of risk factors and confounding factors upon estimation and power. Int. J. Epidemiol. 13:366-70
20. Gladen, B., Rogan, W. J. 1979. Misclassification and the design of environmental studies. Am. J. Epidemiol.

1 1 0:607-1 6 2 1 . Greenland, S . 1 980. The effect o f mis­

classification in the presence of covari­ ates. Am. J. Epidemiol. 1 1 2:564-69

92

THOMAS, STRAM & DWYER

22. Greenland, S. 1982. The effect of mis­

classification in matched-pair case­ control studies. Am. J. Epidemiol. 1 1 6:

402-6 23. Greenland, S.

1 988. Statistical uncer­ tainly due to misclassification: im­ plications for validation substudies. J.

CUn. Epidemiol. 4 1 : 1 1 67-74 24. Greenland, S. 1 98 8 . Variance estima­

tion for epidemiologic effect estimates under misclassification. Stat. Med. 7 :745-5 7

Annu. Rev. Public. Health. 1993.14:69-93. Downloaded from www.annualreviews.org by NORTH CAROLINA STATE UNIVERSITY on 09/07/11. For personal use only.

25. Greenland, S. 1 992. Divergent biases in

ecological and individual-level studies .

Stat. Med. 1 1 : 1 209-23 26. Greenland, S . , Kleinbaum. D. G. 1 983.

Correcting for misclassification in two­ way tables and matched-pair studies.

Int. J. Epidemiol. 1 2:93-97 27. Hatch, M . , Thomas, D. C.

1 992.

Measurement of exposure, dose, covari­ ates and outcome in environmental epi­ demiology. Environ . Health Perspect. In press 28. Hui, S. S. L. , Walter, S. D. 1 980. Es­ timating the error rates of diagnostic tests. Biometrics 36: 1 67-7 1 29. Jablon, S. 1 97 1 . Atomic Bomb Radia­

tion Dose Estimates at ABCC (TR 2371 ) . Hiroshima: At Bomb Casualty Comm.

30. Jiang, F. 1 992. Prediction of Magnetic Fields from Wiring Configurations in a Case-Control Study of Childhood Leuke­ mia. Los Angeles: USC Dept. of Prev .

Med.

3 1 . Joreskog, K . G . , Sorbom, D. 1988. US­ REL 7: A Guide to the Program and Applications. Chicago: SPSS 32. Laird, N. 1 978. Nonparametric max­

imum likelihood estimation of a mixing distribution. J. Am. Stat. Assoc. 73:8051l 33. London, S. J . , Thomas, D. c . , Bow­ man, J. D . , Sobel, E . , Cheng, I . -C . , Peters, I. M . 1 99 1 . Exposure to residen­ tial electric and magnetic fields and risk of childhood leukemia. Am. J. Epidemi-

01. 1 34:923-37

34. Louis, T. A. 1 982. Finding the observed

information matrix when using the EM algorithm. J. R. Stat. Soc. Ser. B 44:

226-33 3 5 . Marshall,

I. R . , Graham, S. 1 984. Use of dual responses to increase validity of case-control studies. J. Chron . Dis. 37:

1 25-36 36. Muthen, B . 1 987. USCOMP: Analysis of Linear Structural Equations with a Comprehensive Measurement Model.

Mooresville, Ind: Scientific Software 37. Navidi, W . C . , Stram, D . O . , Thomas, D. C. 1 992. Statistical Methods for Epidemiologic

Studies

of the Health

36). Los Angeles: USC Dep. of Prev. Med. 38. Neriishi, K . , Stram, D. O . , Vaeth, M . , Mizuno, S . , Akiba, S . 1 99 1 . The ob­ served relationship between the occur­ rence of acute radiation sickness and subsequent leukemia mortality in the Hiroshima-Nagasaki data. Radiat. Res. Effects of Air Pollution (TR

1 25:206- 1 3 3 9 . Pepe, M . S . ,

Fleming, T. 1 99 1 . A nonparametric method for dealing with mismeasured covariate data. J. Am.

Stat. Assoc. 86: 108-13 40. Pepe, M . S., Self, S . G., Prentice, R. L. 1 9 8 9 . Further results on covariate mea­

surement errors in cohort studies with time to response data. Stat. Med. 8 : 1 1 67-78 4 1 . Pierce, D. A . , Stram, D . O . , Vaeth, M . 1990. Allowing for random errors in radiation exposure estimates for the atomic bomb survivor data. Radiat. Res. 1 23 :275-84 42. Plummer, M . , Clayton, D. 1 99 1 . As­ sessing measurement errors of dietary survey methods in nutritional epidemiol­ ogy. Presented at Stat. and Epidemiol.

Aspects of Cancer Res . , Nuffeld Col­ lege, Oxford 43. Prentice, R. L . 1 982. Covariate mea­ surement errors and parameter estima­ tion in a failure time regression model.

Biometrika 69:33 1-42 44. Prentice, R. L . , Sheppard, L.

1 990.

Causes Control 1 : 8 1-97 45 . Prentice, R . L . , Sheppard , L.

1 99 1 .

Dietary fat and cancer: consistency of the epidemiologic data, and disease pre­ vention that may follow from a practical reduction in fat consumption. Cancer

Dietary fat and cancer: rejoinder and dis­ cussion of research strategies. Cancer

Causes Control 2:53-58 46. Prentice , R. L . , Thomas, D. C. 1 992.

Methodologic research needs in environ­ mental epidemiology: data analysis. En­ viron. Health Perspect. In press 47. Roesch, W. C. 1978. Final Report of US-Japan Joint Reassessment of Radia­ tion Dose Estimates for the Atomic Bomb Survivor Data. Hiroshima: Radi­

at. Effects Res. Found.

48 . Rosner, B . , Spiegelman, D . , Willett, W. C. 1 990. Correction of logistic

regression relative risk estimates and confidence intervals for measurement error: the case of multiple covariates measured with error. Am. J. Epidemiol.

1 32:734-45 49. Rosner, B . , Willett, W. C.

1 98 8 . In­ terval estimates for correlation coeffi­ cients corrected for within-person varia­ tion: implications for study design and hypothesis testing. Am. J. Epidemiol. 1 27:377-86

Annu. Rev. Public. Health. 1993.14:69-93. Downloaded from www.annualreviews.org by NORTH CAROLINA STATE UNIVERSITY on 09/07/11. For personal use only.

EXPOSURE MEASUREMENT ERROR 50. Rosner, B., Willett, W . S., Spiegelman, D. 1 989. Correction of logistic regres­ sion relative risk estimates and confi­ dence intervals for systematic within­ person measurement error. Stat. Med. 8 : 1 03 1-40 5 1 . Spiegelman, D . , Gray, R. 1 99 1 . Cost efficient study designs for binary re­ sponse data with Gaussian covariate measurement error. Biometrics 47:85 1 69 52. Sposto, R . , Stram, D. O . , Awa, A. A. 1 99 1 . An estimate of the magnitude of random errors in the DS86 dosimetry from data on chromosome aberrations and severe epilation. Radiat. Res. 1 28: 1 57-69 53. Stevens, W . , Thomas, D . c . , Lyon, J. L . , et al . 1990. Leukemia in Utah and

93

radioactive fallout from the Nevada Test Site: a case-control study. N. Engl. J. Med. 264:585-9 1 54. Thomas, D. C . , Gauderrnan, J . , Kerber, R. 1993. A nonparametric Monte Carlo approach to adjustment for covariate measurement errors in regression. Bio­ metrics . In press 55. Wacholder, S . , Dosemeci, M. , Lubin, J. H. 1 99 1 . Blind assignment of expo­ sure does not always prevent differential misclassification. Am. J. Epidemiol. 1 34:433-37 56. Ware , J. H . , Ferris , B. G . , Dockery , D. W., Spengler, J. D . , Strom, D . O . , Speizer, F. E. 1986. Effects of ambient sulfur oxides and suspended particles on respiratory health of preadolescent chil­ dren. Am. Rev. Respir. Dis. 1 33 :834-42