Diagnostic performance of body mass index to identify obesity as

Feb 2, 2010 - obesity as defined by body adiposity: a systematic review and meta-analysis. DO Okorodudu1, MF Jumean2, VM Montori3, A Romero-Corral2, ...
International Journal of Obesity (2010) 34, 791–799 © 2010 Macmillan Publishers Limited All rights reserved 0307-0565/10 $32.00 www.nature.com/ijo

REVIEW

Diagnostic performance of body mass index to identify obesity as defined by body adiposity: a systematic review and meta-analysis

DO Okorodudu1, MF Jumean2, VM Montori3, A Romero-Corral2, VK Somers2, PJ Erwin4 and F Lopez-Jimenez2

1University of Missouri School of Medicine, Columbia, MO, USA; 2Division of Cardiovascular Diseases, Department of Internal Medicine, Mayo Clinic College of Medicine, Mayo Foundation, Rochester, MN, USA; 3Division of Endocrinology, Department of Internal Medicine, Mayo Clinic College of Medicine, Mayo Foundation, Rochester, MN, USA and 4Mayo Clinic Libraries, Mayo Clinic College of Medicine, Mayo Foundation, Rochester, MN, USA

Objective: We performed a systematic review and meta-analysis of studies that assessed the performance of body mass index (BMI) to detect body adiposity.

Design: Data sources were MEDLINE, EMBASE, Cochrane Database of Systematic Reviews, Cochrane CENTRAL, Web of Science, and SCOPUS. To be included, studies must have assessed the performance of BMI to measure body adiposity, provided standard values of diagnostic performance, and used a body composition technique as the reference standard for body fat percent (BF%) measurement. We obtained pooled summary statistics for sensitivity, specificity, positive and negative likelihood ratios (LRs), and diagnostic odds ratio (DOR). The inconsistency statistic (I²) assessed potential heterogeneity.

Results: The search strategy yielded 3341 potentially relevant abstracts, and 25 articles met our predefined inclusion criteria. These studies evaluated 32 different samples totaling 31 968 patients. Commonly used BMI cutoffs to diagnose obesity showed a pooled sensitivity to detect high adiposity of 0.50 (95% confidence interval (CI): 0.43–0.57) and a pooled specificity of 0.90 (CI: 0.86–0.94). Positive LR was 5.88 (CI: 4.24–8.15), I² = 97.8%; the negative LR was 0.43 (CI: 0.37–0.50), I² = 98.5%; and the DOR was 17.91 (CI: 12.56–25.53), I² = 91.7%. Analysis of studies that used BMI cutoffs ≥30 had a pooled sensitivity of 0.42 (CI: 0.31–0.43) and a pooled specificity of 0.97 (CI: 0.96–0.97). Cutoff values and regional origin of the studies can only partially explain the heterogeneity seen in pooled DOR estimates.

Conclusion: Commonly used BMI cutoff values to diagnose obesity have high specificity, but low sensitivity to identify adiposity, as they fail to identify half of the people with excess BF%.

International Journal of Obesity (2010) 34, 791–799; doi:10.1038/ijo.2010.5; published online 2 February 2010

Keywords: adiposity; body composition; body mass index; BMI; fat mass

Correspondence: Dr F Lopez-Jimenez, Division of Cardiovascular Diseases, Mayo Clinic College of Medicine, 200 First Street SW, Gonda 5-368, Rochester, MN 55905, USA. E-mail: [email protected] Received 18 August 2009; revised 21 December 2009; accepted 26 December 2009; published online 2 February 2010

Introduction

Obesity has become one of the most important threats to human health worldwide. According to the data derived from the Third National Health and Nutrition Examination Survey, the prevalence of obesity in the United States of America is 31.1% in men and 33.2% in women.1 Regardless of the multiple efforts made to address this public health issue, the prevalence of obesity continues to rise.1 Abundant scientific evidence supports the associations between obesity and various diseases including diabetes mellitus, hypertension, coronary artery disease, cancer, and sleep apnea.2 The consequences of obesity also extend beyond physical ailment into the psychosocial and economic aspects of life.3

The most commonly used anthropometric method to diagnose obesity is the body mass index (BMI), which is calculated as an individual’s weight in kilograms divided by the height in meters squared. This index was first described in the 19th century by a Belgian mathematician who noticed that in people he considered to be of ‘normal frame’, the weight was proportional to the height squared.4
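The calculation described above can be sketched in a few lines of Python. This is only an illustration: the function names `bmi` and `is_obese_by_bmi` and the example person are assumptions made here, and the default cutoff of 30 kg m–2 is the commonly used obesity threshold examined in this review.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Return body mass index in kg/m^2: weight divided by height squared."""
    if weight_kg <= 0 or height_m <= 0:
        raise ValueError("weight_kg and height_m must be positive")
    return weight_kg / height_m ** 2


def is_obese_by_bmi(value: float, cutoff: float = 30.0) -> bool:
    """Flag obesity at a chosen BMI cutoff (hypothetical helper)."""
    return value >= cutoff


# Hypothetical adult: 80 kg, 1.75 m
example = bmi(80.0, 1.75)
print(round(example, 1), is_obese_by_bmi(example))
```

Because the function only sees total weight, two people with the same result can differ widely in body fat, which is the limitation this review quantifies.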

Diagnostic performance of body mass index DO Okorodudu et al

BMI has been used extensively in epidemiological studies and incorporated into clinical practice because of its simplicity. A major shortcoming of BMI is that the numerator of the index (weight) fails to distinguish between lean and fat mass.5–7 Conversely, techniques that accurately measure body fat, such as dual energy X-ray absorptiometry, hydrostatic weighing, air-displacement plethysmography, isotope dilution, and bioelectrical impedance analysis, are rarely used in clinical practice.

Over the last few decades, several studies have analyzed the performance of BMI to detect body adiposity when compared with techniques known to accurately measure body composition. Indices of diagnostic performance used in such studies include sensitivity, defined as the probability that a person who actually has the condition of interest will have a positive test result; specificity, defined as the probability that a person who does not have the condition of interest will have a negative test result; and likelihood ratio (LR), which expresses how much more likely a given test result is in individuals with the condition than in individuals without the condition. The results of these studies have been diverse, some showing a good diagnostic performance and others showing a poor sensitivity of BMI to detect high levels of adiposity.

Other studies have suggested that the failure of many epidemiological studies to show a higher risk for adverse events in overweight people (BMI 25–29 kg m–2) when compared with normal weight individuals can be explained by the limited ability of BMI to differentiate body fat from lean mass in different populations.8,9 Furthermore, excess body fat percent (BF%) has been shown to be associated with metabolic dysregulation regardless of body weight.10 Thus, it is imperative to know the accuracy of BMI to identify body adiposity to justify its use in clinical practice to either diagnose or rule out excessive body adiposity at the individual patient level. We performed a systematic review and meta-analysis to calculate the pooled sensitivity and specificity of BMI to identify excessive body adiposity.
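The indices of diagnostic performance defined in this introduction all follow from a 2×2 table of test result against reference standard. The sketch below shows one way to compute them; the class name `TwoByTwo`, the function names, and the example counts are hypothetical and are not taken from any study in this review.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: int  # high BMI, excess body fat by reference standard
    fp: int  # high BMI, no excess body fat
    fn: int  # BMI below cutoff, excess body fat
    tn: int  # BMI below cutoff, no excess body fat


def sensitivity(t: TwoByTwo) -> float:
    """Probability of a positive test among those with the condition."""
    return t.tp / (t.tp + t.fn)


def specificity(t: TwoByTwo) -> float:
    """Probability of a negative test among those without the condition."""
    return t.tn / (t.tn + t.fp)


def lr_positive(t: TwoByTwo) -> float:
    """Positive likelihood ratio: sensitivity / (1 - specificity)."""
    return sensitivity(t) / (1 - specificity(t))


def lr_negative(t: TwoByTwo) -> float:
    """Negative likelihood ratio: (1 - sensitivity) / specificity."""
    return (1 - sensitivity(t)) / specificity(t)


# Hypothetical sample of 200 people
table = TwoByTwo(tp=50, fp=10, fn=50, tn=90)
print(sensitivity(table), specificity(table))
print(round(lr_positive(table), 2), round(lr_negative(table), 2))
```

The sketch assumes that no margin of the table is zero; a real analysis would need to handle empty cells.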

Materials and methods

Selection criteria and search strategy

The predefined inclusion criteria were that the study must have (1) assessed the performance of BMI to identify excess body fat; (2) provided standard values of diagnostic performance (for example sensitivity, specificity, positive predictive value, negative predictive value); and (3) used a body composition technique (for example dual energy X-ray absorptiometry, air-displacement plethysmography, hydrostatic weighing) as the gold standard. We searched the databases MEDLINE (1950 to June 2008), EMBASE (1988 to June 2008), Cochrane Database of Systematic Reviews (from inception), Cochrane CENTRAL (from inception), Web of Science (1993 to June 2008), and SCOPUS (1996 to June 2008). The search was conducted at the Mayo Clinic Plummer Library in Rochester, MN, by a librarian with expertise in systematic reviews, and was designed to find all studies that assessed the ability of BMI to detect excess adiposity as determined by BF%. In conducting the search, three domains were specified as absolute criteria: (1) BMI or equivalent (for example BMI, Quetelet Index); (2) diagnostic performance or equivalent (for example sensitivity, specificity, predictive values); and (3) body fat or equivalent (for example dual energy X-ray absorptiometry, bioelectrical impedance, air-displacement plethysmography, body composition). The results from the individual domain searches were combined using the ‘and’ operator or its equivalent.

On the basis of the information provided in the title and abstract alone, we eliminated irrelevant articles yielded from our primary search (Figure 1). The remaining studies were then read in their entirety by a single investigator and those that did not meet the inclusion criteria were excluded. Furthermore, our search was supplemented with cross-references from the selected articles as well as through correspondence with researchers. A particular effort was made to contact the authors of articles whose information was equivocal or too incomplete for inclusion in this systematic review and meta-analysis, in search of additional data. We also contacted investigators known to do research on BMI diagnostic performance.

Quality assessment/data abstraction

At random, 10 of the studies that were not excluded from our primary search based solely on title and abstract were independently reviewed for inclusion by two investigators (DO and ARC) and the agreement coefficient was determined. The quality of studies eligible for review was assessed on a 6- to 16-point scale considering factors that determine the validity of studies specifically assessing body composition and factors related to the validity of diagnostic test performance. The criteria used to evaluate the quality of the study included (1) standardization and accuracy of height measurement, (2) standardization and accuracy of weight measurement, (3) gold standard used to assess body fat, (4) time between BMI and BF% measurement, (5) blinding of BMI measurement from BF% measurement, and (6) instructions given to subjects regarding diet and exercise before measurements of body composition. Studies were classified as of excellent quality (15–16 points), good quality (12–14 points), fair quality (9–11 points), or low quality (6–8 points). Data were abstracted from articles by a single investigator, who gathered information related to the population studied, the gold standard used to measure BF%, the BF% cutoff values used to define overweight or obesity, the BMI cutoff values for overweight or obesity, and values of diagnostic performance of BMI to detect high BF%.
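The agreement coefficient between the two reviewers can be illustrated with a short sketch. It assumes the coefficient is Cohen's kappa for two raters (the Results refer to the kappa statistic without naming a variant); the function name `cohens_kappa` and the include/exclude decisions are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters on the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the raters agree
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Hypothetical decisions on 10 candidate studies
reviewer_1 = ["include"] * 5 + ["exclude"] * 5
reviewer_2 = ["include"] * 4 + ["exclude"] * 6
print(round(cohens_kappa(reviewer_1, reviewer_2), 2))
```

Kappa discounts the agreement that two reviewers would reach by chance alone, which is why it is preferred over raw percentage agreement for inclusion decisions.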


Figure 1 Flowchart illustration of study selection. Out of 3341 potentially relevant abstracts, 20 articles were included in the meta-analysis, along with five articles included through cross-reference.

• 3341 potentially relevant abstracts
• 3225 articles did not meet inclusion criteria based on information in titles and abstracts
• 116 full articles further reviewed for inclusion
• Reasons for exclusion: reported only correlation (37); assessed prediction equations (19); only non-healthy subjects studied (2); evaluated accuracy of self-reported BMI (1); assessed diagnostic performance of BMI to diagnose non-obese individuals (1)
• 56 articles met inclusion criteria
• 9 articles not written in English; 27 articles not focused on adults
• 5 articles included via cross-reference
• 25 articles included in systematic review

Data analysis

The primary outcome for analysis was the performance of BMI to identify excess body fat compared with the gold standard measuring BF%. Sensitivity, specificity, and LRs were either collected or calculated using information provided in the original publications. The heterogeneity of diagnostic test parameters was evaluated initially by graphic examination of forest plots for each parameter. Statistical assessment was then performed using the inconsistency statistic (I²). The I² statistic is defined as the percentage of variation across studies as a result of heterogeneity beyond that from chance.11 A value of 0% indicates no observed heterogeneity, whereas values >50% represent the possibility of substantial heterogeneity. Pooled summary statistics for sensitivities, specificities, LRs, and diagnostic odds ratios (DORs) of the individual studies were then reported. The DOR, computed as the positive likelihood ratio (LR+) over the negative likelihood ratio (LR−), is defined as the odds of having a positive test result in patients with disease compared with the odds of a positive test result in patients without disease.12 Owing to a priori assumptions about the likelihood for heterogeneity between primary studies, the random-effects model of DerSimonian and Laird was used for pooled analysis.13

Predefined subgroup analyses were performed with the following potential causes of between-study heterogeneity: (1) BMI cutoff values to define obesity, (2) BF% cutoff values to define obesity, (3) gold standard used to assess BF%, (4) regional origin of the studies, and (5) quality assessment score. Studies were grouped into one of three subgroups based on the BMI cutoff value used to define obesity: ≤24.9 kg m–2, from 25 to <30 kg m–2, or ≥30 kg m–2. Studies were regrouped based on their definition of obesity according to BF% into (1) BF% <30 in females and <25 in males, (2) BF% equal to 30 in females and 25 in males, or (3) BF% >30 in females irrespective of BF% in males. Studies included in the meta-analysis used different methods as the gold standard to assess BF composition. Owing to their comparable accuracy, studies that used dual energy X-ray absorptiometry, hydrostatic weighing, air-displacement plethysmography, and isotope dilution measurement of total body water were grouped together, and the pooled estimates were reported and compared with those of studies that used lower accuracy measures as their gold standard (bioelectrical impedance and skin fold). Owing to the reported ethnic and geographic differences in body composition,14–16 studies were grouped based on their regional origin into North America, South-East Asia, or Europe. Finally, studies were grouped into two subgroups based on their quality assessment score described above.

For studies that did not report the prevalence of obesity according to the gold standard used, we ascribed the national prevalence of obesity in the United States derived from the Third National Health and Nutrition Examination Survey,1 and then conducted a sensitivity analysis using a lower and a higher prevalence and assessed the effect on overall pooled parameters. Analyses were performed using version 1.4 of the statistical software Meta-DiSc.13
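The pooling steps described in this section can be sketched as follows. This is a minimal illustration of DerSimonian and Laird random-effects pooling applied to log DORs, with I² computed from Cochran's Q; it is not a reproduction of Meta-DiSc's internals. The 0.5 continuity correction on every cell, the function names, and the three example studies are assumptions made for this sketch.

```python
import math


def log_dor(tp, fp, fn, tn, correction=0.5):
    """Log diagnostic odds ratio and its variance for one 2x2 table.

    Adding `correction` to every cell is an assumption made here to
    avoid division by zero; other conventions exist.
    """
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error, tau^2 and I^2 (%)."""
    weights = [1 / v for v in variances]
    total_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    se = math.sqrt(1 / sum(re_weights))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, se, tau2, i2


def pooled_dor(tables):
    """Pool DORs from (tp, fp, fn, tn) tuples; returns (DOR, low, high, I2)."""
    effects, variances = zip(*(log_dor(*t) for t in tables))
    pooled, se, _, i2 = dersimonian_laird(effects, variances)
    return (math.exp(pooled), math.exp(pooled - 1.96 * se),
            math.exp(pooled + 1.96 * se), i2)


# Hypothetical studies, not taken from this review
studies = [(50, 10, 50, 90), (30, 5, 70, 195), (80, 40, 20, 160)]
dor, low, high, i2 = pooled_dor(studies)
print(f"DOR {dor:.2f} (95% CI {low:.2f}-{high:.2f}), I2 = {i2:.1f}%")
```

Pooling is done on the log scale because the log odds ratio is closer to normally distributed than the ratio itself; the result is exponentiated only for reporting.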

Results

The search strategy yielded 3341 potentially relevant abstracts (Figure 1). Subsequently, 25 articles that met all our inclusion criteria were included for systematic review and meta-analysis.17–41 Interobserver agreement regarding the selection of articles, measured with the kappa statistic, was 0.90. The 25 studies evaluated 32 different samples and a total of 31 968 adults. The studies were published between 1990 and October 2008. Study and population characteristics are summarized in Table 1.

BMI shows a pooled sensitivity to identify excess body adiposity of 0.50 (95% confidence interval (CI): 0.43–0.57) and a pooled specificity of 0.90 (CI: 0.86–0.94) (Figure 2). The positive LR was 5.88 (CI: 4.24–8.15), I² = 97.8%; the negative LR was 0.43 (CI: 0.37–0.50), I² = 98.5%; and the DOR was 17.91 (CI: 12.56–25.53), I² = 91.7% (Figures 3 and 4). Graphic examination of forest plots of the different parameters (sensitivity, specificity, LRs, and DOR) revealed considerable heterogeneity among the studies (Figures 3 and 4). As anticipated, this was also shown by the high inconsistency index values for pooled estimates. Potential causes for this considerable between-study heterogeneity were explored through subgroup analyses. The results of these analyses are presented in Table 2. The BF% cutoff values and the regional origin of the studies can partially explain the heterogeneity seen in the pooled DOR estimates (Table 2).

Discussion

This meta-analysis assessed the diagnostic performance of BMI in 31 968 individuals from 32 research studies from 12 different countries. The results show that the performance of BMI to identify excessive adiposity has a good specificity, but poor sensitivity. Pooled results from the 32 studies showed a sensitivity of around 50%, suggesting that many individuals not labeled as obese might indeed have excess adiposity.

These results have several implications. The low sensitivity using the current BMI cutoff values indicates that we are underdiagnosing excess adiposity in many individuals. As the first step of dealing with a risk factor is an accurate identification of the pathophysiological problem, not diagnosing obesity in individuals with excess adiposity represents a missed opportunity for initiating a lifestyle change in people at risk. Recent studies have shown that the amount of adipose tissue in subjects with normal BMI provides incremental prognostic value, particularly in women.10

The results of this meta-analysis suggest that the current definition of obesity at the individual level needs to be reassessed. Although obesity means excess body fat, the current definition of obesity is based on body weight regardless of its composition. Many years after its original description, the predictive value of BMI was shown and validated in multiple epidemiological studies showing that a high BMI value was associated with increased mortality. Further studies showed high correlation coefficients between body fat and BMI. These factors plus the simplicity of obtaining BMI resulted in the widespread use and acceptance of this index to diagnose obesity and to identify subjects at risk for obesity-related comorbidities. However, the results of this study suggest that BMI has its own limitations to diagnose excess adiposity at the individual-person level, particularly when BMI values are below 30.

The inability of BMI to distinguish fat from lean mass can lead to the inappropriate diagnosis of obesity. This shortcoming has been shown in many studies including that of Ode et al.38 in which the specificity of BMI to diagnose excess adiposity in varsity male athletes was only 27%, whereas the sensitivity was excellent at 100%. This study’s results showed a pooled specificity of 90% in diagnosing excess adiposity, thereby indicating that 10% of the individuals without excess adiposity were misclassified as obese. Ongoing studies assessing the prognostic value of different methods of body composition will help to elucidate whether simple techniques such as air-displacement plethysmography that are capable of better distinguishing fat from lean mass may replace BMI in the clinical evaluation of obesity.

Furthermore, the results of this meta-analysis may also explain why BMI cannot discriminate cardiovascular risk very well in people with intermediate BMI values. Multiple studies have shown a log-linear association between BMI and ischemic heart disease and mortality, in which risk increments at intermediate BMI values are very small but increase significantly when BMI is higher than 30 or 35.42 Studies assessing correlation between BMI and body fat have shown that people with intermediate BMI values represent a heterogeneous group regarding body fat content, some with preserved lean mass and little fat mass, whereas others have high body fat and limited lean mass or so-called ‘normal weight obese’.43 These latter individuals have been shown to have more metabolic dysregulation than those with normal weight and low fat content.10,43 In addition, measures of fat distribution have been shown to discriminate cardiovascular risk very well in individuals with intermediate BMI values, suggesting that BMI does not account for all the adiposity-related risk in those individuals. A similar log-linear association has been observed between total cholesterol and cardiovascular mortality. Very high total cholesterol values provide a disproportionally high risk when compared with the risk related to minor cholesterol

Table 1 Characteristics and results of individual studies

| Author | Year of article/abstract | Quality assessment score | Location | N | Mean age | Age s.d./s.e.m. | % Male | % Female | Gold standard | Diagnostic criteria according to gold standard | Total specificity % |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Israel R et al. | 1990 | Fair | US | 218 | 36.60 | 8.90 | 0.00 | 100 | HW | 45% BF | 82.00 |
| Smalley K et al. | 1990 | Good | US | 363 | 34.60 | 12.00 | 41.32 | 58.68 | HW | 20% BF in males, 25% BF in females | 95.10 |
| Hortobagyi T et al. | 1994 | Fair | US | 1645 | 39.11 | 10.28 | 77.81 | 22.19 | HW | 25% BF in males, 30% BF in females | 93.26 |
| Wellens RI | 1996 | Fair | US | 504 | 31.73 | 7.74 | 100 | 0.00 | HW | 25% BF | 69.80 |
| Wellens RI | 1996 | Fair | US | 511 | 30.35 | 8.13 | 0.00 | 100 | HW | 33% BF | 84.40 |
| Curtin F et al. | 1997 | Good | Switzerland | 226 | 38.10 | 16.90 | 31.86 | 68.14 | DXA | 20% BF in males, 25% BF in females | 100 |
| Taylor R et al. | 1998 | Fair | New Zealand | 96 | N/A | N/A | 0.00 | 100 | DXA | 43% BF | 94.00 |
| Piers LS et al. | 2000 | Good | Australia | 117 | 36.00 | 18.00 | 43.59 | 56.41 | ID | 25% BF in males, 30% BF in females | 86.30 |
| Sardinha LB et al. | 2000 | Excellent | Portugal | 383 | 60.50 | 7.10 | 0.00 | 100 | DXA | 35% BF | 100 |
| Dudeja V et al. | 2001 | Fair | India | 123 | 29.91 | 12.20 | 69.92 | 30.08 | SF | 25% BF in males, 30% BF in females | 96.67 |
| Frankenfield D et al. | 2001 | Good | US | 141 | 39.38 | 1.70 | 37.59 | 62.41 | BIA | 25% BF in males, 30% BF in females | 100 |
| Ko GT | 2001 | Fair | China | 5153 | 51.50 | 16.30 | 27.54 | 72.46 | BIA | 25% BF in males, 35% BF in females | 83.40 |
| Blew RM et al. | 2002 | Good | US | 317 | 54.20 | 4.50 | 0.00 | 100 | DXA | 35% BF | 85.70 |
| Yao M et al. | 2002 | Good | China | 71 | 42.94 | 0.65 | 46.48 | 53.52 | ID | 24% BF in males, 35% BF in females | 90.00 |
| Yamagishi H et al. | 2002 | Fair | Japan | 605 | 19.60 | 0.50 | 0 | 100 | HW | 29.80% | 96.67 |
| DeLorenzo A et al. | 2003 | Fair | Italy | 890 | 44.35 | 14.99 | 33.03 | 66.97 | DXA | 25% BF in males, 35% BF in females | 98.49 |
| Goh V et al. | 2004 | Fair | Singapore | 1069 | 48.68 | 0.35 | 27.88 | 72.12 | DXA | 25% BF in males, 35% BF in females | 95.68 |
| Evans EM et al. | 2006 | Good | US | 296 | 65.86 | 7.46 | 0.00 | 100 | DXA | 38% BF | 86.00 |
| Jackson A et al. | 2006 | N/A | US | 419 | 22.87 | 4.33 | 33.65 | 66.35 | DXA | 25% BF in males, 33% BF in females | 95.29 |
| Kagawa M et al. | 2006 | Good | Japan | 139 | 20.40 | 1.30 | 0.00 | 100 | DXA | 30% BF | 98.90 |
| Pongchaiyakul et al. | 2006 | Good | Thailand | 847 | 50.00 | 16.00 | 40.14 | 59.86 | DXA | 25% BF in males, 35% BF in females | 98.98 |
| Yang F et al. | 2006 | Good | China | 887 | N/A | N/A | 100 | 0.00 | DXA | 25% BF | 84.70 |
| Yang F et al. | 2006 | Good | China | 222 | N/A | N/A | 100 | 0.00 | DXA | 25% BF | 75.90 |
| Yang F et al. | 2006 | Good | China | 694 | N/A | N/A | 0.00 | 100 | DXA | 30% BF | 56.90 |
| Yang F et al. | 2006 | Good | China | 185 | N/A | N/A | 0.00 | 100 | DXA | 30% BF | 65.70 |
| Chen Y-M et al. | 2006 | Good | China | 1122 | 53.00 | 4.00 | 0.00 | 100 | DXA | 37% BF | 94.00 |
| Ode J et al. | 2007 | Good | US | 198 | N/A | N/A | 61.11 | 38.89 | ADP | 20% BF in males, 33% BF in females | 42.27 |
| Ode J et al. | 2007 | Good | US | 213 | N/A | N/A | 36.62 | 63.38 | ADP | 20% BF in males, 33% BF in females | 80.30 |
| Ode J et al. | 2007 | Good | US | 28 | N/A | N/A | 100 | 0.00 | ADP | 25% BF | 10.00 |
| De Freitas et al. | 2007 | Good | Brazil | 206 | 42.70 | 15.4 | 100 | 0 | BIA | 38% BF in 20–30 years old, 39% BF in 40–59 years old, 41% BF in 60–79 years old | 86.60 |
| De Freitas et al. | 2007 | Good | Brazil | 479 | 46.00 | 15.2 | 0 | 100 | BIA | 26% BF in 20–30 years old, 27% BF in 40–59 years old, 29% BF in 60–79 years old | 82.60 |
| Romero-Corral et al. | 2008 | Good | US | 13601 | 43.48 | 0.19 | 49 | 51 | BIA | 25% BF in males, 35% BF in females | 96.88 |

The three remaining columns of Table 1 are listed below in the order printed in the source; that order does not follow the row order above.

Diagnostic criteria according to BMI:
30 kg m–2
23.5 kg m–2; 24.1 kg m–2; 19.6 kg m–2; 21.2 kg m–2; 28 kg m–2; 25 kg m–2; 25 kg m–2
30 kg m–2
27.5 kg m–2
26.3 kg m–2
30 kg m–2
25 kg m–2; 30 kg m–2
25 kg m–2; 30 kg m–2; 30 kg m–2; 30 kg m–2; 30 kg m–2
23.8 kg m–2 in males, 24.2 kg m–2 in females; 24 kg m–2; 25 kg m–2
25 kg m–2; 30 kg m–2
45 kg m–2; 27.8 kg m–2 in males, 27.3 kg m–2 in females; 28 kg m–2 in males, 27 kg m–2 in females; 25 kg m–2; 23 kg m–2; 27.8 kg m–2 in males, 27.3 kg m–2 in females; 27.3 kg m–2; 25 kg m–2

Prevalence according to BF% gold standard and total sensitivity % (the two columns are interleaved as printed):
83.00 47.70
25.00 N/A 89.30
N/A N/A N/A N/A 54.35 9.61 29.76
N/A 48.18
N/A
65.00
N/A 31.19
11.04 47.46 46.94
18.40 62.70
87.20 65.90
N/A N/A
43.20
90.30
90.50
94.00
68.62
85.30 85.20 82.80 86.30 28.00 100
8.90 18.56
12.44 66.70 50.27
29.80 40.32
90.31
34.08 48.65 26.90
67.43 78.72
26.80
48.61 78.60 81.50 13.30
55.00 50.60
N/A N/A
N/A N/A 46.63
41.34

Abbreviations: ADP, air-displacement plethysmography; BF, body fat; BIA, bioelectrical impedance analysis; BMI, body mass index; DXA, dual energy X-ray absorptiometry; HW, hydrostatic weighing; ID, isotope dilution; N/A, not available; SF, skin fold. 25 articles yielding 32 studies are summarized in this table. Sensitivity results ranged from 8.9% in a female Japanese population (Kagawa et al., 2006) to 100% in a US population of college athletes (Ode et al., 2007). Specificity ranged from 42% (Ode et al., 2007) to a value of 100% (Curtin et al., 1997; Sardinha et al., 2000; Frankenfield et al., 2001; Ode et al., 2007).
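Pooling diagnostic accuracy requires 2×2 counts, whereas Table 1 reports sample size, prevalence, sensitivity, and specificity. One plausible way to back-calculate approximate counts from those four figures is sketched below. The function name `reconstruct_counts`, the rounding strategy, and the example values are assumptions made for illustration and are not the authors' documented procedure.

```python
def reconstruct_counts(n, prevalence, sens, spec):
    """Approximate (tp, fp, fn, tn) from summary figures.

    n          : sample size
    prevalence : proportion with excess body fat by the reference standard
    sens, spec : sensitivity and specificity as proportions (0 to 1)
    """
    with_condition = round(n * prevalence)
    without_condition = n - with_condition
    tp = round(with_condition * sens)
    fn = with_condition - tp
    tn = round(without_condition * spec)
    fp = without_condition - tn
    return tp, fp, fn, tn


# Hypothetical study: 1000 adults, 40% prevalence, sensitivity 0.50, specificity 0.90
print(reconstruct_counts(1000, 0.40, 0.50, 0.90))
```

Rounding means the reconstructed counts reproduce the reported percentages only approximately, and where prevalence is not reported a value has to be assumed, as described in the data analysis section.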


Figure 2 Pooled sensitivity and specificity of BMI. BMI has a sensitivity of 50% and a specificity of 90%. Considerable heterogeneity is noticed by graphic examination of the forest plots, particularly with regard to pooled sensitivity.

Figure 3 Positive and negative LRs of BMI. BMI has an LR+ of 5.88 and an LR− of 0.43.


Figure 4 Diagnostic odds ratio of BMI. DOR, defined as LR+/LR−, is 17.91.

Table 2 Diagnostic performance of BMI across different subgroups

| Subgroup | Definition | N | LR+ (CI) | I² | LR− (CI) | I² | DOR (CI) | I² |
|---|---|---|---|---|---|---|---|---|
| BMI cutoff values | <25 kg m–2 | 7 | 3.99 (2.62–6.08) | 97.6% | 0.19 (0.13–0.27) | 82.7% | 21.69 (10.78–43.63) | 92.3% |
| BMI cutoff values | ≥25, but <30 kg m–2 | 15 | 5.57 (3.55–8.75) | 95.3% | 0.47 (0.38–0.59) | 97.0% | 15.28 (10.27–22.72) | 74.8% |
| BMI cutoff values | ≥30 kg m–2 | 10 | 9.68 (3.30–28.36) | 98.5% | 0.62 (0.54–0.72) | 98% | 17.88 (7.48–42.76) | 95.2% |
| BF cutoff values | <25%, <30% | 4 | 7.77 (4.72–12.79) | 65.4% | 0.46 (0.25–0.87) | 98.7% | 22.94 (11.44–46.0) | 65.3% |
| BF cutoff values | = 25%, 30% | 12 | 3.80 (2.30–6.27) | 97.8% | 0.48 (0.40–0.58) | 94.6% | 10.96 (8.32–14.43) | 55.9% |
| BF cutoff values | ≥30% in females | 16 | 7.46 (4.19–13.26) | 98.2% | 0.41 (0.31–0.55) | 99.2% | 21.69 (12.11–38.84) | 93.2% |
| Gold standard | DXA, HW, ID, ADP | 25 | 5.06 (3.61–7.09) | 96.4% | 0.48 (0.41–0.56) | 97.2% | 13.98 (10.38–18.83) | 75.1% |
| Gold standard | BIA, SF | 7 | 9.69 (5.17–18.16) | 97.8% | 0.32 (0.23–0.46) | 99.3% | 37.93 (15.75–91.32) | 97.1% |
| Regional origin | North America | 13 | 5.77 (2.79–11.94) | 99% | 0.44 (0.38–0.50) | 95.8% | 18.18 (10.71–30.86) | 92.9% |
| Regional origin | South-East Asia | 11 | 4.65 (3.17–6.82) | 96.3% | 0.41 (0.24–0.69) | 99.6% | 12.98 (6.33–26.62) | 94.2% |
| Regional origin | Europe | 4 | 20.27 (11.04–37.19) | 0% | 0.66 (0.51–0.86) | 96.3% | 48.86 (23.96–99.63) | 0% |
| Quality score | Low, fair | 29 | 5.11 (3.88–6.73) | 96.4% | 0.39 (0.3–0.5) | 98.9% | 16.87 (11.8–24.12) | 88.1% |
| Quality score | Good, excellent | 3 | 19.46 (2.93–129.25) | 98.5% | 0.63 (0.51–0.79) | 98.9% | 31.44 (4.14–238.8) | 98.6% |

Abbreviations: ADP, air-displacement plethysmography; BIA, bioelectrical impedance analysis; CI, confidence interval; DOR, diagnostic odds ratio; DXA, dual energy X-ray absorptiometry; HW, hydrostatic weighing; ID, isotope dilution; LR, likelihood ratio; N/A, not available; SF, skin fold.

elevations.42 This phenomenon arises because very high values of total cholesterol generally reflect high very low-density lipoprotein and low-density lipoprotein content, whereas intermediate values of total cholesterol do not discriminate well between lipoproteins associated with higher versus lower cardiovascular risk.


Our results also suggest that if obesity is to be defined based on the amount of body fat, then the optimal cutoff level for BMI with the maximum diagnostic performance will fall between 25 and 30. As with any diagnostic test reported as a continuous variable, there is a tradeoff between sensitivity and specificity depending on where the cutoff value falls, with higher specificity at high cutoff values and higher sensitivity when lower cutoff values are used.

Our study has several limitations. As with any meta-analysis, there is a risk for publication bias in which positive results or results with ‘expected’ findings are more likely to be published. We made every possible effort to minimize this type of bias by contacting investigators in the field of BMI or people who were known to be working on body fat measurement. If editors were more likely to publish manuscripts showing the ‘expected’ results of a good diagnostic performance for BMI, then our results may be overestimating the real diagnostic performance of BMI. Our study also showed significant heterogeneity. Inconsistency indices yielded substantial values for pooled LRs as well as for pooled DOR. Sources of heterogeneity included BMI cutoff values used to define obesity, BF% cutoff values to define obesity, gold standard used to assess body fat, and regional origin of the studies. However, it is important to note that although regional origin did contribute to heterogeneity, subgroup analysis shows that there is still a significant amount of inconsistency between results even within specific regions. It is possible that publication bias contributed to the heterogeneity, should editors be prone to accept studies with extreme performance, that is, those showing either outstanding or very poor diagnostic performance.

Another major limitation was the use of different gold standards for the definition of excess adiposity. It is clear that some techniques to measure body composition are more reliable than others; therefore, our pooled results already reflect some of the inherent measurement error of techniques that are known to be suboptimal to measure BF%. However, our subgroup analysis did not show major differences in the pooled estimates when we limited the analysis to the most valid techniques to measure body fat. For this reason, we do not think this limitation will invalidate our results. The inclusion of studies performed in different geographic areas is a strength because it increases generalizability, but it also becomes a limitation because body composition techniques have not been well validated in non-Caucasian populations.

Although our study illustrates some of the limitations in using BMI for the diagnosis of excess adiposity, it is important to stress that the use of BMI is of significant value. Our results confirm that when BMI is ≥30 kg m–2, it has a near perfect specificity and an excellent predictive value to detect excess adiposity in both sexes. In addition, BMI or even plain body weight is most likely the best way to evaluate changes in body adiposity over time because changes in body weight most likely represent an increase in the volume of adipose tissue, with the exception of body builders or patients with conditions that increase the third space volume such as renal or liver failure.

In conclusion, this study shows that the use of BMI to identify excess body adiposity at the individual patient level has good specificity, but poor sensitivity, with approximately half of individuals who have excessive BF% being labeled as non-obese. As excess BF% has been associated with metabolic dysregulation regardless of body weight, BMI should not be considered as the only measure of obesity in patient care settings, particularly in those with BMI <30 kg m–2.
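The predictive value of a BMI ≥30 kg m–2 mentioned in this discussion depends on how common excess adiposity is in the population tested. The sketch below applies Bayes' rule to the pooled sensitivity (0.42) and specificity (0.97) reported in the abstract for BMI cutoffs ≥30. The prevalence of 0.40 and the function name `predictive_values` are assumptions chosen purely for illustration.

```python
def predictive_values(sens, spec, prevalence):
    """Return (positive predictive value, negative predictive value)."""
    true_pos = sens * prevalence
    false_pos = (1 - spec) * (1 - prevalence)
    false_neg = (1 - sens) * prevalence
    true_neg = spec * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Pooled estimates for BMI >= 30 from this review; prevalence is assumed
ppv, npv = predictive_values(sens=0.42, spec=0.97, prevalence=0.40)
print(f"PPV {ppv:.2f}, NPV {npv:.2f}")
```

Under these assumptions a BMI at or above the cutoff is strong evidence of excess adiposity, whereas a BMI below it leaves a substantial chance that excess adiposity is present, which matches the conclusion drawn above.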

Conflict of interest The authors declare no conflict of interest.

Acknowledgements Dr Somers is supported by NIH grants HL-65176, HL-70302, HL-73211, and M01RR00585. Dr Lopez-Jimenez was the recipient of a Clinical Scientist Development Award from the American Heart Association at the time of performing this study. Dr Somers, Dr Lopez-Jimenez, and Dr Romero-Corral are recipients of an unrestricted grant from Select Research to assess the clinical value of assessing regional body volumes.
