Vol. 30, No. 8

579

Research Series

Graphical Methods for Detecting Bias in Meta-analysis
Robert L. Ferrer, MD, MPH

The trustworthiness of meta-analysis, a set of techniques used to quantitatively combine results from different studies, has recently been questioned. Problems with meta-analysis stem from bias in selecting studies to include in a meta-analysis and from combining study results when it is inappropriate to do so. Simple graphical techniques address these problems but are infrequently applied. Funnel plots display the relationship of effect size versus sample size and help determine whether there is likely to have been selection bias in including studies in the meta-analysis. The L’Abbé plot displays the outcomes in both the treatment and control groups of included studies and helps to decide whether the studies are too heterogeneous to appropriately combine into a single measure of effect. (Fam Med 1998;30(8):579-83.)

From the Department of Family Practice, University of Texas Health Science Center at San Antonio.

Our faith in the answers provided by scientific inquiry rests on our confidence that its methods are sound. If we lose confidence in a particular method, we may begin to doubt a whole series of previously established “truths.” Such skepticism1 has arisen about meta-analysis, the “study of studies” designed to combine the results of a number of different reports into a single, more precise estimate of an effect. This article briefly outlines why skepticism has developed about meta-analysis and presents two graphical methods for evaluating the validity of a meta-analysis.

First developed by social scientists as a way to integrate and summarize evidence from multiple studies, meta-analysis forms the statistical core of the science of systematic review. The central idea behind systematic reviews is that no single study can provide a definitive answer to a medical question. Rather, progressive understanding develops through a process of synthesizing and integrating the results from multiple studies to see the patterns that clarify a line of scientific inquiry. As a way of summarizing research, the systematic review is considered superior to the traditional expert narrative review,2 which is vulnerable to the biases of its authors. Enthusiasm for meta-analysis is based on the fact that it provides a quantitative synthesis of all the relevant and methodologically acceptable evidence.3 This enthusiasm is reflected in the exponential growth of published meta-analyses, from 16 in all of the 1970s to more than 500 in 1996.4 The Cochrane Collaboration, a worldwide effort to produce systematic reviews to guide evidence-based practice, further exemplifies a strong faith in the validity of meta-analysis.

Not everyone is so positive about meta-analysis, however. Critics have argued that attempting to summarize the results of different studies with a single measure leads to a substantial risk of error,5-7 and empirical evidence has recently emerged to support their doubts. Several studies have demonstrated discordant results in 10%–34% of comparisons between meta-analyses and large randomized clinical trials (RCTs) on the same topic.8-10 Though there is no a priori basis for believing that RCTs are always correct when a discrepancy exists, the 50-year history of


building knowledge with RCTs has weighed in their favor, and the validity of meta-analyses has become suspect.

Bias in Meta-analysis
The most important biases in meta-analysis arise from two sources: 1) the choice of studies included in the meta-analysis11 and 2) how the results of those studies are combined to produce a summary effect estimate.7 Just as bias in selecting patients may flaw an epidemiological study, bias in selecting studies may flaw a meta-analysis. Selection bias in meta-analysis derives from several potential problems with literature searches,11 including publication bias, citation bias, multiple publication, and English-language bias (see Table 1 for definitions). Each of these biases increases the likelihood that a meta-analysis will include studies that demonstrate statistically significant differences and fail to include those that do not.

The other major bias occurs when individual studies are combined despite significant heterogeneity in their results. In such cases, summary measures are misleading and can mask important subgroup differences in outcome. For instance, variation in treatment effect by patient age would be hidden by an analysis that pooled the results of studies done in young and old populations. Similarly, the existence of a threshold dose would be obscured by an analysis that combined low-dose and high-dose treatment studies. Unfortunately, most meta-analyses are not presented in a way that permits readers to assess whether such biases are present, even though simple graphical methods help to do so. These graphical methods include funnel plots and L’Abbé plots.

The Funnel Plot
The funnel plot is a simple visual tool to examine whether a meta-analysis is based on a biased sample of studies.12 The funnel plot is a scatterplot of effect size versus sample size, with each data point representing one study.
Because large studies estimate effect size more precisely than small studies, they tend to lie in a narrow band at the top of the scatterplot, while the smaller studies, with more variation in results, fan out over a larger area at the bottom, creating the visual impression of an inverted funnel.

Table 1
Sources of Bias in Meta-analysis

Type of Bias              Definition
Publication bias20        Positive studies* are submitted for publication more frequently and are more easily published than negative studies.**
Multiple publication21    Duplicate publication of study results leads to over-sampling of data from the same research.
English-language bias22   Negative studies are less likely to be published in English-language journals than positive studies. Thus, English-language literature searches fail to retrieve negative studies.
Citation bias23           Positive studies are cited more frequently and thus are more easily identified than negative studies.

* Positive studies: studies that demonstrate statistically significant differences
** Negative studies: studies that do not demonstrate statistically significant differences

Figure 1
Symmetrical Funnel Plot

When a meta-analysis includes an unbiased sample of studies, including studies that find both positive and negative results, the funnel is symmetrical. Figure 1 shows a schematic representation of a symmetrical funnel plot. When the sample of studies is biased, however, the funnel will be asymmetrical. This asymmetry is most commonly the result of smaller studies being biased toward positive results and larger effect sizes, due to

Figure 2
Asymmetrical Funnel Plot: Meta-analysis of Calcium and Preeclampsia

Figure 3
Funnel Plot: Study Result Versus Year of Publication


the biases listed in Table 1. Figure 2 shows a funnel plot created from a meta-analysis13 that found that calcium supplementation during pregnancy was associated with a lower risk of preeclampsia. Marked asymmetry in the funnel plot suggests biased inclusion of studies that showed beneficial effects from calcium. In fact, 15 months after the publication of that meta-analysis, a large randomized trial14 found no benefit from calcium supplementation in preventing preeclampsia.

In many cases, visually inspecting a funnel plot suffices to draw conclusions about possible bias. But if uncertainty persists about its symmetry, statistical techniques are available to test the hypothesis that a funnel plot is symmetrical.12

Funnel plots may also be used to examine whether a line of inquiry is converging on a more precise effect estimate over time.15 In this case, the effect estimate is plotted against year of publication. The schematic example in Figure 3 shows that the most recent studies have less variability in their results than older studies, suggesting that current research methods are providing more accurate results than older methods.
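The symmetry test cited above12 is, at its core, a simple weighted regression. The sketch below (Python with NumPy; the study data are invented for illustration, not drawn from any cited meta-analysis) regresses each study's standardized effect (effect divided by its standard error) on its precision (the reciprocal of the standard error): in an unbiased sample of studies the intercept sits near zero, while small-study bias pulls it away from zero.

```python
import numpy as np

def egger_test(effects, ses):
    """Egger-style regression for funnel-plot asymmetry.

    Regresses each study's standardized effect (effect / SE) on its
    precision (1 / SE). In a symmetrical funnel the intercept is near
    zero; small-study bias pulls it away from zero.
    """
    ses = np.asarray(ses, dtype=float)
    prec = 1.0 / ses                              # precision: large study = large value
    z = np.asarray(effects, dtype=float) * prec   # standardized effects
    X = np.column_stack([np.ones_like(prec), prec])
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)  # ordinary least squares
    intercept, slope = coef
    return intercept, slope

# Five hypothetical studies; the standard error stands in for study size
# (small SE = large study). All numbers are invented for illustration.
ses = np.array([0.10, 0.15, 0.25, 0.35, 0.50])

# Unbiased sample: every study estimates the same true effect (0.3),
# so the regression intercept is zero.
b_sym, _ = egger_test(np.full_like(ses, 0.3), ses)

# Biased sample: smaller studies (larger SE) report inflated effects,
# as the selection biases in Table 1 tend to produce; the intercept
# shifts to the size of that small-study distortion (0.5 here).
b_asym, _ = egger_test(0.3 + 0.5 * ses, ses)
```

In practice the intercept's departure from zero is judged with a t test against its standard error; the cited paper12 gives the details.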

The L’Abbé Plot
The L’Abbé plot16 (pronounced lah-bay) is another type of scatterplot in which each data point represents one study. The event rate in the control group is plotted against the event rate in the treatment group (the event rate can be either the proportion or the rate at which the endpoint occurred in each group). If the treatment group has better outcomes than the control group, the data points fall below the line of identity, which has a slope of 1. If the control group has better outcomes than the treatment group, the data points fall above the line of identity (Figure 4). The L’Abbé plot best addresses this question about a meta-analysis: how does the treatment effect relate


Figure 4
L’Abbé Plot

Figure 5
L’Abbé Plot: Consistent Treatment Effect

to baseline risk? If the relationship is similar from study to study, the data points will form a consistent band on the scatterplot, as in the schematic L’Abbé plot in Figure 5, and it is appropriate to calculate a summary effect measure. However, if the plot demonstrates that the treatment effect varies with baseline risk, then a summary effect measure is misleading. For example, if the studies17 in Figure 6 were condensed into a summary effect measure, that summary would obscure the observation that there appears to be a threshold baseline risk of coronary heart disease below which treatment confers no benefit. Neutral or harmful effects from treatment are evident at control-group event rates below 50 per 1,000 person-years, while beneficial effects emerge when baseline risk exceeds 50 per 1,000 person-years. Such inconsistency in the treatment effect’s relationship to the baseline event rate should raise doubts about the credibility of a single summary effect measure.
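The mechanics of the plot can be sketched in a few lines. The example below (Python with NumPy; the trial counts are invented to mimic the threshold pattern described for Figure 6, not taken from the cited trials17) computes the control-group and treatment-group event rates that form the two axes and classifies each study by which side of the line of identity it falls on.

```python
import numpy as np

# Hypothetical trials: columns are control events, control N,
# treatment events, treatment N. Counts are invented to mimic the
# threshold pattern described for Figure 6: treatment helps only
# when the baseline (control-group) risk is high.
trials = np.array([
    (20, 1000, 22, 1000),    # low baseline risk, no benefit
    (35, 1000, 36, 1000),    # low baseline risk, no benefit
    (45, 1000, 47, 1000),    # low baseline risk, slight harm
    (80, 1000, 60, 1000),    # high baseline risk, benefit
    (120, 1000, 90, 1000),   # high baseline risk, benefit
    (150, 1000, 105, 1000),  # high baseline risk, benefit
], dtype=float)

control_rate = trials[:, 0] / trials[:, 1]    # one axis of the L'Abbé plot
treatment_rate = trials[:, 2] / trials[:, 3]  # the other axis

# A study falls below the line of identity when the treatment group's
# event rate is lower than the control group's, i.e., treatment helped.
benefit = treatment_rate < control_rate

# Splitting at a baseline risk of 50 per 1,000 exposes the pattern
# that a single pooled effect measure would hide.
low_risk = control_rate < 0.05
```

Plotting `treatment_rate` against `control_rate` with a 45-degree reference line reproduces the figure; the `benefit`-by-`low_risk` split is exactly the inconsistency that argues against pooling.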

Conclusions
Meta-analysis is a potentially valuable technique for synthesizing research data from multiple studies. Its methods are still being refined; published meta-analyses in medicine number slightly more than 2,000, compared with more than half a million RCTs.4 Thus, it should not be surprising that the results of RCTs are perceived as more trustworthy. Meta-analysis will evolve as new standards develop,18 and techniques such as meta-regression reach beyond summary effect measures to examine subgroup differences and interaction effects.4,19 As this evolution occurs, diagnostic aids such as funnel plots and L’Abbé plots will permit clinicians and researchers to better evaluate the results of current meta-analyses. Authors of meta-analyses should provide these graphs as a standard part of their data presentation, and readers should expect to see them.
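As a rough illustration of the meta-regression idea (a minimal fixed-effect sketch, not the specific methods of the cited papers4,19), the example below fits an inverse-variance weighted regression of each study's effect on a study-level covariate. The invented data make the treatment effect rise with mean patient age, exactly the kind of subgroup pattern a single summary effect measure would conceal.

```python
import numpy as np

def meta_regression(effects, ses, covariate):
    """Fixed-effect meta-regression sketch: weighted least squares of
    study effects on a study-level covariate, using inverse-variance
    weights so that precise studies count for more."""
    y = np.asarray(effects, dtype=float)
    w = 1.0 / np.asarray(ses, dtype=float) ** 2   # inverse-variance weights
    X = np.column_stack([np.ones_like(w), np.asarray(covariate, dtype=float)])
    W = np.diag(w)
    # solve the weighted normal equations (X' W X) beta = X' W y
    beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
    return beta  # [intercept, slope]

# Invented studies in which the treatment effect grows linearly with
# mean patient age: the subgroup pattern a single summary would hide.
age = np.array([40.0, 50.0, 60.0, 70.0])
ses = np.array([0.10, 0.10, 0.10, 0.10])
effects = 0.01 * age            # noise-free for clarity
intercept, slope = meta_regression(effects, ses, age)
```

A non-zero slope is the meta-regression signal that the effect depends on the covariate, the same warning a divergent L'Abbé plot gives graphically.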


Figure 6
L’Abbé Plot of Coronary Heart Disease Trials

Correspondence: Address correspondence to Dr Ferrer, University of Texas Health Science Center at San Antonio, Department of Family Practice, 7703 Floyd Curl Drive, San Antonio, TX 78284-7795. 210-358-3998. Fax: 210-220-3763. E-mail: [email protected].

REFERENCES
1. Bailar JC. The promise and problems of meta-analysis. N Engl J Med 1997;337:559-60.
2. Cook DJ, Mulrow CD, Haynes RB. Systematic reviews: synthesis of best evidence for clinical decisions. Ann Intern Med 1997;126:376-80.
3. Egger M, Davey-Smith G. Meta-analysis: potentials and promise. BMJ 1997;315:1371-4.
4. Lau J, Ioannidis JPA, Schmid CH. Summing up evidence: one answer is not always enough. Lancet 1998;351:123-7.
5. Thompson SG, Pocock SJ. Can meta-analyses be trusted? Lancet 1991;338:1127-30.
6. Shapiro S. Meta-analysis/schmeta-analysis. Am J Epidemiol 1994;140:771-8.
7. Eysenck HJ. Meta-analysis and its problems. BMJ 1994;309:789-92.
8. Villar J, Carroli G, Belizan JM. Predictive ability of meta-analyses of randomized controlled trials. Lancet 1995;345:772-6.
9. Cappelleri JC, Ioannidis JPA, Schmid CH, et al. Large trials versus meta-analysis of smaller trials: how do their results compare? JAMA 1996;276:1332-8.
10. LeLorier J, Gregoire G, Benhaddad A, Lapierre J, Derderian F. Discrepancies between meta-analyses and subsequent large randomized, controlled trials. N Engl J Med 1997;337:536-42.

11. Egger M, Davey-Smith G. Bias in location and selection of studies. BMJ 1998;316:61-6.
12. Egger M, Davey-Smith G, Schneider M, Minder C. Bias in meta-analysis detected by a simple graphical test. BMJ 1997;315:629-34.
13. Bucher HC, Guyatt GH, Cook RJ, et al. Effect of calcium supplementation on pregnancy-induced hypertension and preeclampsia. JAMA 1996;275:1113-7.
14. Levine RJ, Hauth JC, Curet LB, et al. Trial of calcium to prevent preeclampsia. N Engl J Med 1997;337:69-76.
15. Light R, Pillemer D. Summing up. Cambridge, Mass: Harvard University Press, 1984.
16. L’Abbe KA, Detsky AS, O’Rourke K. Meta-analysis in clinical research. Ann Intern Med 1987;107:224-33.
17. Davey-Smith G, Song F, Sheldon TA. Cholesterol lowering and mortality: the importance of considering initial level of risk. BMJ 1993;306:1367-73.
18. Pogue J, Yusuf S. Overcoming the limitations of current meta-analysis of randomized controlled trials. Lancet 1998;351:47-52.
19. Davey-Smith G, Egger M, Phillips AN. Meta-analysis: beyond the grand mean? BMJ 1997;315:1610-4.
20. Dickersin K, Min YI, Meinert CL. Factors influencing publication of research results: follow-up of applications submitted to two institutional review boards. JAMA 1992;267:374-8.
21. Huston P, Moher D. Redundancy, disaggregation, and the integrity of medical research. Lancet 1996;347:1024-6.
22. Egger M, Zellweger T, Antes G. Language bias in randomized controlled trials published in English and German. Lancet 1997;350:326-9.
23. Ravnskov U. Cholesterol-lowering trials in coronary heart disease: frequency of citation and outcome. BMJ 1992;305:15-9.