Diagnostic Accuracy of Sonoelastography in ...

Neuroradiology/Head and Neck Imaging • Original Research Ghajarzadeh et al. Sonoelastography of Thyroid Nodules

Downloaded from www.ajronline.org by University Hospitals of Cleveland on 04/02/14 from IP address 216.8.121.1. Copyright ARRS. For personal use only; all rights reserved

Neuroradiology/Head and Neck Imaging Original Research

Diagnostic Accuracy of Sonoelastography in Detecting Malignant Thyroid Nodules: A Systematic Review and Meta-Analysis

Mahsa Ghajarzadeh1, Faezeh Sodagari2,3, Madjid Shakiba2,3

Ghajarzadeh M, Sodagari F, Shakiba M

Keywords: elastography, thyroid nodules, ultrasound DOI:10.2214/AJR.12.9785 Received August 11, 2012; accepted after revision August 7, 2013. 1 Brain and Spinal Injury Research Center, Tehran University of Medical Sciences, Tehran, Iran.  2 Medical Imaging Center, Imam Khomeini Hospital, Keshavarz Blvd, Tehran 149733141, Iran. Address correspondence to F. Sodagari ([email protected]). 3 Advanced Diagnostic and Interventional Radiology Research Center, Tehran University of Medical Sciences, Tehran, Iran. 

WEB This is a web exclusive article. AJR 2014; 202:W379–W389 0361–803X/14/2024–W379 © American Roentgen Ray Society

OBJECTIVE. The aim of this systematic review was to determine the diagnostic accuracy of sonoelastography in detecting malignant thyroid nodules. MATERIALS AND METHODS. A systematic search in MEDLINE and bibliographic databases was performed for the terms “thyroid nodule” and “sonoelastography.” The inclusion criteria were the report of a 4- or 5-point scoring scale for elasticity score by qualitative sonoelastography as the index test and fine-needle aspiration (FNA) cytology or histopathology for thyroid nodules as the reference standard. Studies in which only the strain ratio was reported and studies of patients with underlying medical conditions were excluded. The methodologic quality of the studies was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool. A meta-analysis of diagnostic accuracy measures for sonoelastography was performed using Meta-DiSc freeware software (version 1.4). RESULTS. A total of 12 studies assessing 1180 thyroid nodules (817 benign and 363 malignant) were included. The most commonly used threshold for characterizing malignancy—that is, elasticity scores between 2 and 3—showed a sensitivity of 86.0% (95% CI, 81.9–89.4%) and specificity of 66.7% (95% CI, 63.4–69.9%) with positive and negative likelihood ratios and a diagnostic odds ratio of 3.82 (95% CI, 2.38–6.13), 0.16 (95% CI, 0.08–0.32), and 27.51 (95% CI, 9.21–82.18), respectively. The highest sensitivity of the test was achieved by a threshold elasticity score of between 1 and 2 with a sensitivity of 98.3% (95% CI, 96.2–99.5%). CONCLUSION. Sonoelastography can be considered as a reliable screening tool for characterizing thyroid nodules. An elasticity score of 1 is indicative of benign pathology in almost all cases and can be used to exclude many patients from further invasive assessments.

Thyroid nodules are more prevalent in iodine-deficient areas, and the prevalence of thyroid nodules has dramatically increased in these areas [1]. Most thyroid nodules are benign and fewer than 5% are malignant [2–11]. Although ultrasound is a useful method for nodule diagnosis, it is not very accurate in distinguishing between benign and malignant thyroid nodules [12]. Sonographic patterns such as hypoechogenicity, blurred or spiculated margins, spot microcalcification, and intranodular vascularity are characteristics of malignant nodules, but they yield a wide range of sensitivities (55–95%) and specificities (52–81%) for diagnosis as malignant or benign [13–15]. Cytologic examination of thyroid nodules by fine-needle aspiration (FNA) is another diagnostic method for differentiating benign from malignant nodules, but it is invasive and prone to sampling and analytic errors [16]. Its sensitivity is reported to be 60–98% and its specificity, 54–90% [17–19].

Sonoelastography is a newly developed ultrasound method that is based on the degree of tissue distortion in response to an external force [20, 21]. This method can estimate tissue stiffness and provides a qualitative assessment of the target tissue [22]. Several original reports have assessed the accuracy of this method for thyroid nodule differentiation. We performed this meta-analysis to assess the performance of sonoelastography for distinguishing between benign and malignant thyroid nodules compared with FNA cytologic or histopathologic examination.

Materials and Methods

This study is a systematic review and meta-analysis of studies assessing the diagnostic test accuracy of sonoelastography in differentiating benign from malignant thyroid nodules.

AJR:202, April 2014 W379


Identification of Studies

A systematic search of MEDLINE (U.S. National Library of Medicine), the Cochrane Library (The Cochrane Collaboration), American College of Physicians Journal Club database, Health Technology Assessment Database (The Cochrane Collaboration), and National Health System (NHS) Economic Evaluation Database (The Cochrane Collaboration) was performed from the inception of each database to September 21, 2011, using a search strategy designed to search through Ovid (Wolters Kluwer). The search strategy was developed using Medical Subject Heading (MeSH) terms and the keywords “thyroid nodule” and “sonoelastography.” We also searched through reference lists of the included studies to identify additional studies. All retrieved citations were exported to EndNote (version X3, Thomson Reuters) and were checked for duplicates.

Criteria for Considering Studies for This Review

Types of studies—All studies that assessed the diagnostic accuracy of sonoelastography in differentiating benign from malignant thyroid nodules, with either prospective or retrospective data collection, were eligible for inclusion. The sonoelastography examinations could be performed alone or in combination with other diagnostic modalities. No language limitation was applied. Review articles, editorials, and correspondence to editors that did not report original data were excluded. Articles addressing technical developments or studies conducted on phantom models or ex vivo samples were also excluded.

Participants—Studies had to include patients with thyroid nodules detected by other imaging methods or physical examination. Studies pertaining to thyroid nodules in special populations with underlying medical conditions were excluded. There were no age or sex restrictions for study participants.

Index test—Thyroid sonoelastography with qualitative assessment of the images (i.e., reporting elasticity scores) was considered as the index test. Only studies that used a 4- or 5-point scale based on the method proposed by Itoh et al. [22] were considered eligible. In the 5-point scale classification, the images obtained by sonoelastography are classified on the basis of color patterns: An elasticity score of 1 indicates elasticity in the whole nodule, and the nodule is light green with small amounts of red; 2, elasticity in most of the nodule, and the nodule is green with small amounts of red and blue; 3, elasticity in a minor part of the nodule, and the nodule is predominantly blue with small amounts of red and green mostly in the periphery; 4, no elasticity in the nodule, and the whole nodule is blue; and 5, no elasticity in the nodule or in the area surrounding the nodule, and the whole nodule and the area around its circumference are blue. The 4-point scale is similar to the 5-point scale but reports the data only for the nodules, not the surrounding tissue. Thus, the 4-point scale is only a subclassification of the 5-point scale rather than a different scale.

Studies in which the classification method used could be merged to form the 4- or 5-point scale described were considered eligible. For each original study with a classification different from the method of Itoh et al. [22], the description of the color pattern for each classification score was evaluated by all the reviewers to assess whether the classification could be completely merged into the 4- or 5-point scale. The color pattern of each score had to be distinctive and had to match the description of one of the scores in the method described by Itoh and colleagues. If all the color patterns were compatible with the desired scale, the reviewers recategorized the nodules into the corresponding scores of the scale described by Itoh et al. regardless of the original scoring scale used in the primary study. If the categories could not fit in the scale proposed by Itoh et al., the primary study was considered ineligible for inclusion. The threshold for assessing malignancy in the individual studies was not considered as an inclusion criterion.

Comparator test—No comparator test was considered for eligibility in this study.

Target condition—The index test had to be used to differentiate malignant from benign thyroid nodules.

Reference standard—The following reference standards were used to define the target condition: FNA for cytologic assessment and histopathologic assessment of the tissue after surgical excision of the thyroid nodules or after thyroidectomy.

Data Collection and Analysis

Selection of studies—The literature search was performed by one author. Then two independent reviewers made selections based on titles and abstracts. The full-text articles of the abstracts that were not excluded at this stage were then obtained and reviewed for inclusion by two reviewers. Disagreements at each stage were resolved by consensus.

Data extraction and management—The reviewers extracted data from the full-text articles. Two reviewers assessed each article independently. Data on patient characteristics (sample size, sex, mean age, number of nodules), technical aspects of sonoelastography (number of performers and assessors, compression method, classification method), and elasticity scores for benign and malignant lesions were extracted using a data-extraction form. For extraction of diagnostic accuracy data, the number of nodules with each elasticity score for each pathologic result (benign or malignant) was extracted from the text or the tables. If the primary studies did not report the exact number of the nodules in each category (i.e., elasticity score), the expected frequency of nodules in each cell was calculated using the total number of benign and malignant nodules and reported sensitivity and specificity in each study.

Assessment of methodologic quality—The quality of each study was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS) tool [23]. The QUADAS tool is composed of 11 questions that can be answered as yes, no, or unclear [23]. We also used nine additional quality assessment questions proposed by the Cochrane Diagnostic Test Accuracy Working Group [23]. We did not calculate a summary score estimating the overall quality of each study. The assessment of methodologic quality was performed by two independent reviewers who resolved disagreements by discussing the case to reach a consensus.

Statistical analysis and data synthesis—Pooled estimates for sensitivity, specificity, positive likelihood ratio (LR), negative LR, and diagnostic odds ratio (OR) with the corresponding 95% CIs were used to examine the accuracy of different elasticity score thresholds for differentiating benign from malignant thyroid nodules. The pooled estimates were derived using the fixed-effects model (Mantel-Haenszel method) if significant heterogeneity was not present. In case of heterogeneity, the random-effects model (DerSimonian-Laird method) was applied. Summary receiver operating characteristic (SROC) curves were constructed using the Moses-Shapiro-Littenberg (inverse variance) method. The area under the curve (AUC) was calculated with the corresponding standard error (SE). A p value < 0.05 was considered as significant. The meta-analysis was performed and the SROC curves were constructed using freeware software (Meta-DiSc; version 1.4; Zamora J, Abraira V, Muriel A, Khan KS, Coomarasamy A). The descriptive statistics calculations were performed using SPSS software (version 17, SPSS). The graphs summarizing methodologic quality and risk of bias were created using Review Manager (RevMan, version 5.2, Cochrane IMS).

Investigation of heterogeneity—The Cochran Q test was used to detect heterogeneity among studies; p values < 0.1 indicated the presence of heterogeneity. Inconsistency (I²) was calculated to describe the percentage of variability due to heterogeneity rather than sampling errors. I² > 50% was considered significant for heterogeneity.
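The data-extraction and accuracy calculations described above can be illustrated with a short Python sketch. It is not the Meta-DiSc implementation. The function and field names are hypothetical, and the convention that scores above the threshold count as test-positive is an assumption (higher scores correspond to stiffer nodules). Zero cells, which would make some ratios undefined, are not handled here.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    tp: float  # malignant nodules scored above the threshold
    fn: float  # malignant nodules scored at or below the threshold
    fp: float  # benign nodules scored above the threshold
    tn: float  # benign nodules scored at or below the threshold


def table_from_scores(benign_counts, malignant_counts, threshold):
    """Dichotomize per-score nodule counts at an elasticity score threshold.

    Both arguments map elasticity score -> number of nodules.
    Assumption: a score greater than `threshold` is a positive test.
    """
    tp = sum(n for s, n in malignant_counts.items() if s > threshold)
    fn = sum(n for s, n in malignant_counts.items() if s <= threshold)
    fp = sum(n for s, n in benign_counts.items() if s > threshold)
    tn = sum(n for s, n in benign_counts.items() if s <= threshold)
    return TwoByTwo(tp, fn, fp, tn)


def table_from_summary(n_malignant, n_benign, sensitivity, specificity):
    """Expected cell frequencies when only sensitivity and specificity are reported."""
    tp = sensitivity * n_malignant
    tn = specificity * n_benign
    return TwoByTwo(tp, n_malignant - tp, n_benign - tn, tn)


def accuracy_measures(t):
    """Single-study accuracy measures from a 2 x 2 table (no continuity correction)."""
    sens = t.tp / (t.tp + t.fn)
    spec = t.tn / (t.tn + t.fp)
    lr_pos = sens / (1 - spec)
    lr_neg = (1 - sens) / spec
    return {
        "sensitivity": sens,
        "specificity": spec,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "dor": lr_pos / lr_neg,  # diagnostic odds ratio
    }
```

As an invented example, benign counts {1: 20, 2: 40, 3: 15, 4: 5} and malignant counts {1: 0, 2: 2, 3: 8, 4: 10} dichotomized between scores 2 and 3 give 18 true-positive, 2 false-negative, 20 false-positive, and 60 true-negative nodules, that is, a sensitivity of 0.90 and a specificity of 0.75.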


Results

Results of the Search

The search strategy retrieved 93 articles, 61 of which were excluded by reviewing the titles and abstracts. The flow of studies is presented in Figure 1. The full text of the remaining 32 articles was reviewed, and 12 articles were finally included in this study [1, 12, 16, 24–32]. The reasons for exclusion of the remaining 20 articles at this stage were as follows: Three articles were a systematic review [33], editorial [34], or review article [35]. For one study [36], the full-text article could not be accessed and the required data could not be extracted from the abstract of the article. Therefore, the study was excluded. One study was carried out on ex vivo samples [37] and one on patients with acromegaly [38]. In another study, samples were only malignant nodules [39]. Eleven articles reported strain ratio rather than elasticity score [40–50], and two studies [51, 52] used a method for classification of thyroid nodules that differed from the method proposed by Itoh et al. [22].

Characteristics of the Included Studies

Table 1 presents a summary of the characteristics of the included studies. The included articles, published between 2007 and 2011, assessed a total of 1180 thyroid nodules (817 benign and 363 malignant) in more than 850 patients (age range, 8–93 years). The studies were conducted in European countries (n = 7), China (n = 4), and Japan (n = 1).

Fig. 1—Flow diagram shows selection of studies for meta-analysis.

Records identified through MEDLINE (U.S. National Library of Medicine) search (n = 93)
Additional records identified through searches of other databases (n = 3)
Records after duplicates had been removed (n = 93)
Records screened by reviewing the titles and abstracts (n = 93)
Records excluded (n = 61)
Full-text articles assessed for eligibility (n = 32)
Articles excluded after review of the full text (n = 20) for the following reasons:
• Elasticity scores were not reported (n = 11)
• Article was a systematic review, editorial, or review article (n = 3)
• Inclusion criteria were not met (n = 3)
• Incomplete report of data (n = 1)
• Different classification methods were used to classify thyroid nodules (n = 2)
Studies included in meta-analysis (n = 12)

TABLE 1: Characteristics of Studies Included in Meta-Analysis

First Author [Reference No.] | Year Study Was Published | No. of Nodules | No. of Patients | No. of Male Patients | Mean Patient Age (y) | Mean Nodule Size (mm) | Inclusion Criteria for Nodule Size (mm) | Elasticity Score Scale (Threshold for Malignancy) | Reference Standard
Asteria [16] | 2008 | 86 | 66 | 12 | NR | 21.30 | > 10 | 1–4 (2–3) | Both(a)
Bhatia [24] | 2011 | 85 | NR | NR | NR | 21.00 | NR | 1–4 (NR) | Cytology
Cakir [25] | 2011 | 396 | 292 | 50 | 46.08 | NR | NR | 1–5 (3–4) | Both(a)
Friedrich-Rust [26] | 2010 | 53 | 50 | 13 | NR | NR | NR | 1–4 (2–3) | Cytology
Gietka-Czernel [27] | 2010 | 53 | NR | NR | NR | NR | NR | 1–5 (3–4) | Both(a)
Kagoya [28] | 2010 | 47 | 44 | 13 | 61.80 | 19.40 | < 40 | 1–4 (2–3) | Both(a)
Rago [1] | 2007 | 92 | 92 | 29 | NR | NR | NR | 1–5 (3–4) | Histopathology
Rubaltelli [29] | 2009 | 51 | 40 | 15 | 55.00 | NR | NR | 1–4 (2–3) | Both(a)
Tranquart [30] | 2008 | 108 | 96 | 11 | 58.00 | NR | NR | 1–4 (2–3) | Cytology
Wang [31] | 2010 | 51 | 51 | 13 | 48.60 | 8.96 | < 10 | 1–5 (3–4) | Histopathology
Xie [32] | 2011 | 60 | 47 | NR | NR | NR | NR | 1–4 (NR) | Histopathology
Xing [12] | 2011 | 98 | 86 | 15 | 47.00 | 13.30 | < 40 | 1–4 (2–3) | Histopathology

Note—NR = not reported.
(a) Cytology and histopathology.


Fig. 2—Graph shows reviewers’ judgments about each methodologic quality item presented as percentages for all 12 studies included in meta-analysis.

Fig. 3—Graphic shows reviewers’ judgments about each risk-of-bias item for each study included in meta-analysis. Quality is represented by colors using green (+) as yes (high quality), yellow (?) as unclear, and red (–) as no (low quality).

TABLE 2: Summary of Risk of Bias for Nine Additional Items Assessing the Methodologic Quality of the Studies Included in the Meta-Analysis

First Author [Reference No.] | Thresholds Established? | Technology Unchanged? | Positive Results Defined? | Appropriate Training? | Treatment Withheld? | Observer Variation Reported? | Instrument Variation Reported? | Objectives Prespecified? | Free of Commercial Funding?
Asteria [16] | Yes | Yes | Yes | Yes | Yes | Unclear | Yes | Yes | Unclear
Bhatia [24] | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Unclear | Yes
Cakir [25] | Yes | Yes | Yes | Unclear | Yes | Yes | Yes | Yes | Yes
Friedrich-Rust [26] | Yes | Yes | Yes | Unclear | Yes | Yes | Yes | Yes | Yes
Gietka-Czernel [27] | No | Yes | Yes | Unclear | Yes | No | Yes | Yes | Unclear
Kagoya [28] | Yes | Yes | Yes | Unclear | Yes | Unclear | Yes | Yes | Yes
Rago [1] | Yes | Yes | Yes | Unclear | Yes | Unclear | Yes | Yes | Unclear
Rubaltelli [29] | Yes | Yes | Yes | Unclear | Yes | Unclear | Yes | Yes | Unclear
Tranquart [30] | Yes | Yes | Yes | Unclear | Yes | Unclear | Yes | Yes | Unclear
Wang [31] | No | Yes | Yes | Unclear | Yes | Unclear | Yes | Yes | Unclear
Xie [32] | Yes | Yes | Yes | Unclear | Unclear | Yes | Yes | Yes | Unclear
Xing [12] | No | Yes | Yes | Yes | Yes | No | Yes | Yes | Unclear

In three studies, only FNA results were considered as the reference standard; in four studies, histopathologic assessment of all nodules was available; and in five studies, a combination of these two methods was used (Table 1). There was heterogeneity in the assessment methods and classification of elasticity scores in individual studies. In eight studies [12, 16, 24, 26, 28–30, 32], a 4-point scale was used, and in four studies [1, 25, 27, 31], a 5-point scale was used. The threshold value for assessing malignancy was considered to be an elasticity score between 2 and 3 in six studies [12, 16, 26, 28–30] and between 3 and 4 in four other studies [1, 25, 27, 31]. This information was not provided for the two remaining studies [24, 32].

Methodologic Quality of the Included Studies

Figure 2 illustrates the methodologic quality of the included studies based on the 11 items of QUADAS. A summary of the risk of bias and of compliance of individual studies to these items is shown in Figure 3. Data for the additional nine items of methodologic quality are provided in Table 2. All studies used an acceptable reference standard independent of the index test. Two studies [12, 31] did not use a representative sample of patients. Only two studies [1, 31] reported the data on blinding of the results of the reference standard. In eight studies [1, 12, 16, 24–26, 31, 32], the reference standard was interpreted blinded to the results of sonoelastography. For the other four studies [27–30], data were not provided. In one study [27], the withdrawals from the study were not clearly described. For three studies [12, 16, 24], performers of sonoelastography had received appropriate training, whereas in the other studies [1, 25–32], these data were not provided. Four studies [24–26, 28] were reported to be free of commercial funding, but in the other eight studies [1, 12, 16, 27, 29–32], there were no data about the source of funding.

Diagnostic Accuracy in Different Threshold Values

Threshold elasticity score between 1 and 2—The data obtained from nine studies [1, 16, 24–27, 29, 31, 32] with 927 thyroid nodules were used for calculating the diagnostic accuracy for the threshold between an elasticity score of 1 and 2. Pooled estimates for sensitivity and specificity were 98.3% (95% CI, 96.2–99.5) and 19.6% (95% CI, 16.6–23.0), respectively (Figs. 4A and 4B). This threshold had positive and negative LRs of 1.26 (95% CI, 1.09–1.44) and 0.14 (95% CI, 0.07–0.27), respectively, with a diagnostic odds ratio (OR) of 8.73 (95% CI, 4.45–17.11) (Figs. 4C–4E). The AUC for the SROC was 0.8820 (SE, 0.1329) (Fig. 4F).

Threshold elasticity score between 2 and 3—The results of all 12 included studies were used for calculating the diagnostic accuracy of sonoelastography with the threshold for detecting malignancy between elasticity scores 2 and 3. Sensitivity and specificity were 86.0% (95% CI, 81.9–89.4) and 66.7% (95% CI, 63.4–69.9), respectively (Figs. 5A and 5B). The positive LR, negative LR, and diagnostic OR were 3.82 (95% CI, 2.38–6.13), 0.16 (95% CI, 0.08–0.32), and 27.51 (95% CI, 9.21–82.18), respectively (Figs. 5C–5E). The AUC for this threshold was 0.8769 (SE, 0.0570) (Fig. 5F).

Threshold elasticity score between 3 and 4—The data extracted from nine studies [1, 16, 24–27, 29, 31, 32] were used for calculating the diagnostic accuracy for the threshold between elasticity scores 3 and 4. Sensitivity, specificity, positive LR, negative LR, and diagnostic OR for this threshold were 66.8% (95% CI, 61.1–72.1), 83.9% (95% CI, 80.7–86.7), 8.09 (95% CI, 3.43–19.08), 0.38 (95% CI, 0.26–0.55), and 27.17 (95% CI, 8.42–87.74), respectively (Figs. 6A–6E). This threshold had an AUC of 0.8878 (SE, 0.0945) (Fig. 6F).

Selecting the best threshold elasticity score for detecting malignancy—The summary estimates of diagnostic accuracy for different sonoelastography threshold values in detecting malignancy are presented in Table 3. Considering the measures of diagnostic accuracy for each threshold, the thresholds between elasticity scores 2 and 3 and between elasticity scores 3 and 4 have nearly the same overall diagnostic value in detecting malignancy.

Heterogeneity in diagnostic measures between studies—All statistical measures of diagnostic accuracy in the original studies showed significant heterogeneity (I² > 50%) except for sensitivity (I² = 6.8%), negative LR (I² = 0.0%), and diagnostic OR (I² = 0.0%) for the threshold between elasticity scores 1 and 2. Data on heterogeneity are presented in the corresponding forest plots (Figs. 4–6).
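The inconsistency values quoted here follow from Cochran Q and its degrees of freedom. The sketch below uses the standard definition, I² = 100% × (Q − df) / Q, truncated at zero. The function name is hypothetical, and the code is an illustration of the formula rather than Meta-DiSc's own routine.

```python
def i_squared(q, df):
    """Inconsistency I^2 (in percent) from Cochran Q and its degrees of freedom.

    Negative values, which occur when Q is smaller than df, are truncated to zero.
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0
```

Using the statistics printed on the forest plots for the threshold between elasticity scores 1 and 2, Q = 8.58 with df = 8 (sensitivity) gives about 6.8%, Q = 93.10 with df = 8 (specificity) gives about 91.4%, and Q = 5.51 with df = 8 (negative LR) is truncated to 0%.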

Discussion

Thyroid nodules commonly occur, especially in iodine-deficient geographic areas. In fact, up to 5% of the general population has palpable nodules, and thyroid nodules are present in up to 50% of the general population at autopsy or sonography [1, 16]. However, thyroid nodules rarely undergo malignant transformation [1]. Although the diagnosis of thyroid nodules is usually a medical challenge, the main diagnostic method used is cytologic analysis of a sample obtained from invasive FNA. Besides its invasive nature, FNA has limitations. The cytologic results of up to 30% of FNA samples from thyroid nodules are not conclusive, with 10–15% of FNA samples yielding nondiagnostic results and 10–20% yielding indeterminate results. These diagnostic failures are due to inadequate cell sampling, hemorrhagic cysts, nodules smaller than 1 cm or larger than 4 cm, or multinodular goiter [16]. In fact, the specificity of FNA cytology is high, in contrast to its sensitivity [1]. In some FNA samples with adequate cells, the distinction between benign and malignant cells is difficult or impossible, especially differentiating a benign follicular adenoma from a follicular carcinoma or a follicular papillary carcinoma [1]. Depending on the cytologist’s experience, up to 30% of malignant thyroid nodules can be missed, especially if the nodules are small or dorsally located or if the patient has multinodular goiter or a fibrotic thyroid [1, 33].

Conventional sonography can accurately detect thyroid nodules, but it is not very accurate in determining the nature of the nodules. Various B-mode sonography findings have been used for detecting malignant thyroid nodules, including hypoechogenicity, indeterminate (blurred or spiculated) margins, spot microcalcifications, nodule measurements (i.e., anteroposterior diameter–lateral diameter ratio > 1), lack of halo, and local invasion; increased vascularity on Doppler sonography has also been used [33]. Considering these indexes individually or in combination has not yielded enough accuracy for the diagnosis of malignant thyroid nodules [33].

TABLE 3: Summary of Pooled Estimates of Diagnostic Performance of Different Elasticity Score Thresholds for Detecting Malignant Thyroid Nodules

Elasticity Score Threshold for Malignancy | No. of Studies | Sensitivity, % (95% CI) | Specificity, % (95% CI) | Positive LR (95% CI) | Negative LR (95% CI) | Diagnostic OR (95% CI) | AUC (SE)
1–2 | 9 | 98.3 (96.2–99.5) | 19.6 (16.6–23.0) | 1.26 (1.09–1.44) | 0.14 (0.07–0.27) | 8.73 (4.45–17.11) | 0.8820 (0.1329)
2–3 | 12 | 86.0 (81.9–89.4) | 66.7 (63.4–69.9) | 3.82 (2.38–6.13) | 0.16 (0.08–0.32) | 27.51 (9.21–82.18) | 0.8769 (0.0570)
3–4 | 9 | 66.8 (61.1–72.1) | 83.9 (80.7–86.7) | 8.09 (3.43–19.08) | 0.38 (0.26–0.55) | 27.17 (8.42–87.74) | 0.8878 (0.0945)

Note—LR = likelihood ratio, OR = odds ratio, AUC = area under the curve, SE = standard error.
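To see what the pooled likelihood ratios imply for an individual nodule, they can be combined with a pretest probability of malignancy through the odds form of Bayes' theorem. The sketch below is an illustration only. The function name is hypothetical, and the pretest probabilities used in the note after the code (the proportion of malignant nodules in the pooled sample, 363 of 1180, and the 5% figure quoted in the introduction) are assumptions chosen for the example, not results of this meta-analysis.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability using an LR.

    post-test odds = pretest odds x likelihood ratio
    """
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    pre_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)
```

With the pooled negative LR of 0.14 for the threshold between elasticity scores 1 and 2, an assumed pretest probability of 5% falls to below 1% after an elasticity score of 1, and the pooled-sample proportion of about 31% falls to roughly 6%.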

Fig. 4—Diagnostic measures for sonoelastography with threshold for malignancy between elasticity scores 1 and 2. A–E, Forest plots illustrate pooled estimates (diamonds) for sensitivity (A), specificity (B), positive likelihood ratio (LR) (C), negative LR (D), and diagnostic odds ratio (E) and corresponding 95% CIs for pooled estimates. Data for each individual study (squares) are presented as point estimates and 95% CI as well as weight of each original study in meta-analysis. df = degrees of freedom. F, Summary receiver operating characteristic (SROC) plot for assessing accuracy with corresponding curves indicative of upper and lower bounds of 95% CI. AUC = area under curve, SE = standard error, Q* = summary measure of accuracy derived from the SROC curve.

Per-study estimates (95% CI) shown in panels A–E:

Study | Sensitivity | Specificity | Positive LR | Negative LR | Diagnostic OR
Asteria [16] | 1.00 (0.80–1.00) | 0.16 (0.08–0.27) | 1.16 (1.02–1.32) | 0.17 (0.01–2.74) | 6.88 (0.39–122.73)
Bhatia [24] | 0.92 (0.73–0.99) | 0.28 (0.17–0.41) | 1.27 (1.04–1.55) | 0.30 (0.07–1.20) | 4.25 (0.90–20.06)
Cakir [25] | 0.98 (0.93–1.00) | 0.10 (0.07–0.14) | 1.09 (1.04–1.14) | 0.23 (0.07–0.74) | 4.78 (1.42–16.07)
Friedrich-Rust [26] | 1.00 (0.59–1.00) | 0.07 (0.01–0.18) | 1.01 (0.83–1.23) | 0.84 (0.05–14.76) | 1.21 (0.06–25.81)
Gietka-Czernel [27] | 1.00 (0.85–1.00) | 0.19 (0.07–0.37) | 1.23 (1.02–1.48) | 0.11 (0.01–1.81) | 11.47 (0.61–215.20)
Rago [1] | 1.00 (0.89–1.00) | 0.67 (0.54–0.79) | 2.98 (2.08–4.25) | 0.02 (0.00–0.37) | 127.54 (7.43–2190.30)
Rubaltelli [29] | 1.00 (0.72–1.00) | 0.20 (0.09–0.36) | 1.21 (0.99–1.47) | 0.20 (0.01–3.24) | 6.02 (0.32–112.70)
Wang [31] | 1.00 (0.89–1.00) | 0.26 (0.09–0.51) | 1.36 (1.03–1.79) | 0.06 (0.00–0.94) | 24.66 (1.28–476.03)
Xie [32] | 1.00 (0.87–1.00) | 0.10 (0.05–0.32) | 1.17 (1.00–1.37) | 0.11 (0.01–1.91) | 10.61 (0.56–201.19)

Pooled estimates and heterogeneity statistics:
A, Pooled sensitivity (%) = 98.3 (96.2–99.5); chi-square = 8.58; df = 8 (p = 0.3788); inconsistency (I²) = 6.8%.
B, Pooled specificity (%) = 19.6 (16.6–23.0); chi-square = 93.10; df = 8 (p = 0.0000); inconsistency (I²) = 91.4%.
C, Random-effects model: pooled positive LR = 1.26 (1.09–1.44); Cochran Q = 53.18; df = 8 (p = 0.0000); inconsistency (I²) = 85.0%; tau-square = 0.0322.
D, Fixed-effects model: pooled negative LR = 0.14 (0.07–0.27); Cochran Q = 5.51; df = 8 (p = 0.7018); inconsistency (I²) = 0.0%.
E, Fixed-effects model: pooled diagnostic odds ratio = 8.73 (4.45–17.11); Cochran Q = 7.40; df = 8 (p = 0.4937); inconsistency (I²) = 0.0%.
F, Symmetric SROC (sensitivity vs 1 – specificity): AUC = 0.8820; SE (AUC) = 0.1329; Q* = 0.8125; SE (Q*) = 0.1349.

One nodule characteristic that can help determine its nature is firmness or softness (elasticity). It has been shown that the stiffness of malignant thyroid neoplasms is 10-fold greater than that of normal tissues [16]. Thus, malignant thyroid nodules are firmer than benign ones. Classically, tissue firmness or softness, which is the degree of tissue deformation in response to a certain external force, is examined by palpation, which is very subjective and depends on the examiner’s experience and the size and location of the nodule [31]. In recent years, sonoelastography has been used to assess the firmness or softness of tissue in areas such as breast masses, prostate tissue, lymph nodes, and thyroid nodules, and some original studies have been published on its efficacy in diagnosing thyroid nodules. This assessment has been done in different ways, but the most common method is reporting the firmness as a score on an ordinal scale by qualitative assessment of the elasticity and color patterns of the nodules.

For this systematic review and meta-analysis, we assessed the efficacy of sonoelastography for determining the nature of thyroid nodules. Pooled diagnostic indexes were assessed for different thresholds according to different elasticity scores. Adjusting the threshold between the most elastic nodules and the other ones (between elasticity scores of 1 and 2) yielded an excel-

Fig. 5, panels A–E (threshold for malignancy between elasticity scores 2 and 3): per-study estimates (95% CI) from the forest plots.

Study | Sensitivity | Specificity | Positive LR | Negative LR | Diagnostic OR
Asteria [16] | 0.94 (0.71–1.00) | 0.81 (0.70–0.90) | 5.00 (3.02–8.27) | 0.07 (0.01–0.49) | 68.92 (8.37–567.63)
Bhatia [24] | 0.75 (0.53–0.90) | 0.56 (0.42–0.68) | 1.69 (1.18–2.44) | 0.45 (0.22–0.93) | 3.78 (1.32–10.83)
Cakir [25] | 0.76 (0.68–0.83) | 0.41 (0.35–0.47) | 1.28 (1.12–1.47) | 0.59 (0.42–0.83) | 2.18 (1.36–3.50)
Friedrich-Rust [26] | 0.86 (0.42–1.00) | 0.87 (0.74–0.95) | 6.57 (2.94–14.70) | 0.16 (0.03–1.01) | 40.00 (4.07–392.75)
Gietka-Czernel [27] | 0.95 (0.77–1.00) | 0.71 (0.52–0.86) | 3.29 (1.88–5.74) | 0.06 (0.01–0.44) | 51.33 (5.97–441.03)
Kagoya [28] | 0.73 (0.39–0.94) | 0.64 (0.46–0.79) | 2.01 (1.14–3.55) | 0.43 (0.16–1.16) | 4.72 (1.06–20.96)
Rago [1] | 1.00 (0.54–1.00) | 0.80 (0.68–0.89) | 4.88 (2.97–8.03) | 0.02 (0.00–0.31) | 249.48 (14.26–4364.11)
Rubaltelli [29] | 0.82 (0.48–0.98) | 0.88 (0.73–0.96) | 6.55 (2.75–15.56) | 0.21 (0.06–0.73) | 31.50 (5.23–189.80)
Tranquart [30] | 1.00 (0.54–1.00) | 0.93 (0.86–0.97) | 12.75 (6.21–26.18) | 0.08 (0.01–1.11) | 165.53 (8.48–3229.54)
Wang [31] | 1.00 (0.89–1.00) | 0.63 (0.38–0.84) | 2.63 (1.49–4.46) | 0.02 (0.00–0.39) | 108.33 (5.75–2041.24)
Xie [32] | 0.96 (0.81–1.00) | 0.85 (0.68–0.95) | 6.63 (2.83–14.40) | 0.04 (0.01–0.30) | 145.60 (15.93–1330.44)
Xing [12] | 0.89 (0.76–0.96) | 0.81 (0.68–0.91) | 4.71 (2.67–8.31) | 0.14 (0.06–0.32) | 34.40 (10.82–109.37)

Pooled estimates and heterogeneity statistics:
A, Pooled sensitivity (%) = 86.0 (81.9–89.4); chi-square = 40.09; df = 11 (p = 0.0000); inconsistency (I²) = 72.6%.
B, Pooled specificity (%) = 66.7 (63.4–69.9); chi-square = 163.63; df = 11 (p = 0.0000); inconsistency (I²) = 93.3%.
C, Random-effects model: pooled positive LR = 3.82 (2.38–6.13); Cochran Q = 163.14; df = 11 (p = 0.0000); inconsistency (I²) = 91.9%; tau-square = 0.6082.
D, Random-effects model: pooled negative LR = 0.16 (0.08–0.32); Cochran Q = 45.13; df = 11 (p = 0.0000); inconsistency (I²) = 75.6%; tau-square = 0.9170.

Random-effects model Pooled diagnostic odds ratio = 27.51 (9.21–82.18) Cochran Q = 67.26; df = 11 (p = 0.0000) 1 100.0 Inconsistency (I2) = 83.6% Diagnostic Odds Ratio Tau-square = 2.7312

D 1 0.9 0.8 Sensitivity

Downloaded from www.ajronline.org by University Hospitals of Cleveland on 04/02/14 from IP address 216.8.121.1. Copyright ARRS. For personal use only; all rights reserved

Sonoelastography of Thyroid Nodules

0.7 0.6 0.5 0.4 0.3

Symmetric SROC AUC = 0.8769 SE (AUC) = 0.0568 Q* = 0.8073 SE (Q*) = 0.0570

0.2 0.1 0

0

E

0.2

0.4 0.6 1 – Specificity

0.8

1

F

Fig. 5—Diagnostic measures for sonoelastography with threshold for malignancy between elasticity scores 2 and 3. A–E, Forest plots illustrate pooled estimates (diamonds) for sensitivity (A), specificity (B), positive likelihood ratio (LR) (C), negative LR (D), and diagnostic odds ratio (E) and corresponding 95% CIs for pooled estimates. Data for each individual study (squares) are presented as point estimates and 95% CI as well as weight of each original study in meta-analysis. df = degrees of freedom. F, Summary receiver operating characteristic (SROC) plot for assessing accuracy with corresponding curves indicative of upper and lower bounds of 95% CI. AUC = area under curve, SE = standard error, Q* = summary measure of accuracy derived from the SROC curve.

lent pooled sensitivity (98.3%), but a very low specificity (19.6%). Thus, this threshold has a high negative predictive value (NPV), which means that an elasticity score equal to 1 could be indicative of a benign nodule that is not a candidate for FNA or further invasive or expensive procedures and requires only observation and follow-up. The impor-

tance of this finding is better understood when we consider the low prevalence of malignancy in thyroid nodules. It has been reported that up to 95% of thyroid nodules are benign and only the remaining 5% are malignant [2–11]. Therefore, when we consider a group of 1000 thyroid nodules, according to the mentioned data, 950 will be benign.
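The inconsistency (I²) values reported alongside the pooled estimates in Fig. 5 can be related to the accompanying chi-square or Cochran Q statistics. The sketch below assumes the usual definition, I² = max(0, (Q − df) / Q), and checks it against three of the reported pairs; the function name `i_squared` is hypothetical and this is an illustration, not the Meta-DiSc implementation.

```python
def i_squared(q: float, df: int) -> float:
    """Inconsistency index I^2 (in percent) from Cochran Q and its degrees of freedom.

    Assumes the common definition I^2 = max(0, (Q - df) / Q).
    """
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# (Q or chi-square, df) pairs as reported with the pooled estimates in Fig. 5
FIG5_HETEROGENEITY = {
    "sensitivity": (40.09, 11),
    "negative LR": (45.13, 11),
    "diagnostic odds ratio": (67.26, 11),
}

for measure, (q, df) in FIG5_HETEROGENEITY.items():
    print(f"{measure}: Q = {q}, df = {df}, I2 = {i_squared(q, df):.1f}%")
```

An I² well above 50%, as seen for every measure here, is commonly read as substantial between-study heterogeneity, which is why the likelihood ratios and diagnostic odds ratio were pooled with a random-effects model.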

Fig. 6. Diagnostic measures for sonoelastography with threshold for malignancy between elasticity scores 3 and 4. A–E, Forest plots illustrate pooled estimates (diamonds) for sensitivity (A), specificity (B), positive likelihood ratio (LR) (C), negative LR (D), and diagnostic odds ratio (E) and corresponding 95% CIs for pooled estimates. Data for each individual study (squares) are presented as point estimates and 95% CI as well as weight of each original study in meta-analysis. df = degrees of freedom. F, Summary receiver operating characteristic (SROC) plot for assessing accuracy with corresponding curves indicative of upper and lower bounds of 95% CI. AUC = area under curve, SE = standard error, Q* = summary measure of accuracy derived from the SROC curve.

Data shown in the forest plots (A–E), point estimate (95% CI):

Study                 Sensitivity       Specificity       Positive LR            Negative LR       Diagnostic OR
Asteria [16]          0.59 (0.33–0.82)  0.94 (0.86–0.98)  10.15 (3.62–28.45)     0.44 (0.25–0.77)  23.21 (5.74–93.88)
Bhatia [24]           0.42 (0.22–0.63)  0.82 (0.70–0.91)  2.31 (1.13–4.72)       0.71 (0.50–1.02)  3.25 (1.15–9.20)
Cakir [25]            0.60 (0.51–0.68)  0.71 (0.65–0.76)  2.07 (1.64–2.62)       0.56 (0.45–0.70)  3.68 (2.37–5.72)
Friedrich-Rust [26]   0.57 (0.18–0.90)  0.96 (0.85–0.99)  13.14 (2.93–58.88)     0.45 (0.19–1.06)  29.33 (3.73–230.45)
Gietka-Czernel [27]   0.86 (0.65–0.97)  0.97 (0.83–1.00)  26.77 (3.87–185.42)    0.14 (0.05–0.40)  190.00 (18.39–1962.59)
Rago [1]              0.97 (0.83–1.00)  1.00 (0.94–1.00)  118.19 (7.47–1870.49)  0.05 (0.01–0.23)  2501.00 (98.94–63219.20)
Rubaltelli [29]       0.55 (0.23–0.83)  0.90 (0.76–0.97)  5.45 (1.86–15.98)      0.51 (0.26–0.97)  10.80 (2.24–52.09)
Wang [31]             0.91 (0.75–0.98)  0.89 (0.67–0.99)  8.61 (2.31–32.09)      0.10 (0.04–0.31)  82.17 (12.45–542.10)
Xie [32]              0.56 (0.35–0.75)  1.00 (0.89–1.00)  37.64 (2.36–601.57)    0.45 (0.30–0.69)  83.08 (4.62–1495.06)

Pooled sensitivity = 66.8% (61.1–72.1%); chi-square = 44.01, df = 8 (p < 0.0001); I² = 81.8%.
Pooled specificity = 83.9% (80.7–86.7%); chi-square = 80.95, df = 8 (p < 0.0001); I² = 90.1%.
Pooled positive LR (random-effects model) = 8.09 (3.43–19.08); Cochran Q = 49.77, df = 8 (p < 0.0001); I² = 83.9%; tau-square = 1.1860.
Pooled negative LR (random-effects model) = 0.38 (0.26–0.55); Cochran Q = 33.40, df = 8 (p = 0.0001); I² = 76.0%; tau-square = 0.2172.
Pooled diagnostic odds ratio (random-effects model) = 27.17 (8.42–87.74); Cochran Q = 46.71, df = 8 (p < 0.0001); I² = 82.9%; tau-square = 2.3087.
Symmetric SROC (F; sensitivity vs 1 – specificity): AUC = 0.8878, SE (AUC) = 0.0917, Q* = 0.8184, SE (Q*) = 0.0945.

Of those 950 benign nodules, approximately 186 will have an elasticity score of 1, and according to the pooled sensitivity and specificity, the NPV of an elasticity score of 1 is about 0.995. On the other hand, if we do not consider FNA for thyroid nodules with an elasticity score of 1, only one cancer will be missed (among the 50 malignant nodules). Given the slow growth of thyroid nodules and their noninvasive nature, malignant nodules with an elasticity score of 1 could probably be detected as suspicious at follow-up without a catastrophic delay in diagnosis, and up to 186 benign nodules (≈ 19.6% of benign nodules and 18.6% of all nodules) will be spared an unnecessary invasive FNA, resulting in reduced psychologic and economic burdens. This trade-off can justify the exemption of nodules with an elasticity score of 1 from FNA. Because the NPV is not 100%, some malignant nodules will be missed at the first assessment; however, appropriate follow-up studies can probably help in the early diagnosis of slow-growing nodules without a considerable delay.

For a better understanding of our results, we can apply this approach to the other elasticity score thresholds; we suppose a population of 1000 thyroid nodules and derive the diagnostic indexes of sonoelastography at these thresholds from the results of this study. At the threshold between elasticity scores 2 and 3, the numbers of true-positive, false-negative, true-negative, and false-positive results will be 43, 7, 634, and 316 nodules, respectively. The corresponding figures for the threshold between elasticity scores 3 and 4 will be 33, 17, 797, and 153 nodules, respectively. According to these data, the NPV and positive predictive value (PPV) will be 0.989 and 0.119 for the threshold between elasticity scores 2 and 3 and 0.979 and 0.177 for the other threshold. As we can see, the PPV does not exceed 0.2 at any threshold (despite the better specificity of the higher thresholds); the PPVs are low because of the relatively low prevalence of malignant nodules in comparison with benign ones. On the other hand, the NPVs of all thresholds are high, and the best threshold seems to be the first one (i.e., between elasticity scores 1 and 2) because it can save many patients from undergoing unnecessary invasive FNA without a considerable number of malignant nodules being missed (only one among 50 malignant nodules will be missed). The numbers of missed malignant nodules using the other thresholds are considerable: 7 and 17. Consequently, sonoelastography can be considered a good adjunctive screening tool in the assessment of thyroid nodules. Follow-up sonoelastography and physical examination could also be helpful if the patient does not undergo FNA.

With the threshold between elasticity scores 2 and 3, the sensitivity is lower but still good (86.0%), and the specificity improves considerably (66.7%). At the threshold between elasticity scores 3 and 4, the sensitivity is fair (66.8%), but the specificity is good (83.9%). Although these two thresholds show good or fair sensitivity and specificity, which indicates good efficacy of sonoelastography for differentiating malignant from benign thyroid nodules, neither can be used for decision making because each has considerable false-negative and false-positive rates. Considering the malignant nature of the disease, sensitivity seems to be more important than specificity. Therefore, the threshold between elasticity scores 2 and 3 would be more reliable than that between elasticity scores 3 and 4.
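The hypothetical 1000-nodule calculation above can be reproduced with a short script. This is a minimal sketch under the assumptions stated in the text (5% prevalence of malignancy and the pooled sensitivities and specificities of this meta-analysis); the function names and the rounding of expected counts to whole nodules are illustrative choices, not part of the original analysis.

```python
def expected_counts(sensitivity, specificity, prevalence=0.05, n=1000):
    """Expected (TP, FN, TN, FP) in a hypothetical cohort, rounded to whole nodules."""
    malignant = round(n * prevalence)
    benign = n - malignant
    tp = round(sensitivity * malignant)
    tn = round(specificity * benign)
    return tp, malignant - tp, tn, benign - tn


def predictive_values(tp, fn, tn, fp):
    """Return (PPV, NPV) computed from 2x2 counts."""
    ppv = tp / (tp + fp)
    npv = tn / (tn + fn)
    return ppv, npv


# Pooled (sensitivity, specificity) for each threshold, as reported in the text.
# "1|2" means the threshold lies between elasticity scores 1 and 2, and so on.
POOLED = {
    "1|2": (0.983, 0.196),
    "2|3": (0.860, 0.667),
    "3|4": (0.668, 0.839),
}

for threshold, (sens, spec) in POOLED.items():
    tp, fn, tn, fp = expected_counts(sens, spec)
    ppv, npv = predictive_values(tp, fn, tn, fp)
    print(f"threshold {threshold}: TP={tp} FN={fn} TN={tn} FP={fp} "
          f"PPV={ppv:.3f} NPV={npv:.3f}")
```

The counts and NPVs agree with those quoted in the text; the PPV for the threshold between scores 2 and 3 may differ in the third decimal place depending on where rounding is applied. Changing `prevalence` shows how strongly the PPV depends on the proportion of malignant nodules in the population being examined.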
In some disciplines, the combination of B-mode sonographic findings with sonoelastography has been evaluated in differentiating malignant from benign masses (such as breast lesions). To our knowledge, this combination has not been implemented for thyroid nodules. One useful suggestion for improving the accuracy of sonoelastography is to combine it with B-mode sonography and to assess how sonoelastography improves the diagnostic efficacy of sonography.

One important issue in the current study is that not all centers used histopathology as the reference standard for confirming nodule malignancy. Instead, some used cytologic analysis of FNA samples for this purpose. As investigators have mentioned before, FNA cytology is not as accurate as histopathology and can yield indeterminate results or miss malignancy in some cases. Indeterminate or false-positive results are not important issues because these cases will undergo further evaluation and the results will be confirmed by histopathology, but the false-negative cases are a source of bias in these studies.

Moreover, the nodule type can affect the sonoelastography assessment and results. For example, cystic nodules, because of their physical properties, could affect the sonoelastography assessments, which would be important for the homogeneity of the assessments and results. In addition, recruiting different populations of patients can affect the homogeneity. For example, some studies assessed patients who were candidates for surgery and histopathology. These studies show higher probabilities of malignancy among their populations compared with studies of other patient populations. This difference in study populations is seen in the current meta-analysis and can be one source of variation among the studies.

The quality of the included studies is acceptable. Two issues compromising methodologic quality are the time elapsed between the index test and the reference standard and the blinding to the reference standard results. Blinding to the reference standard does not seem to be an important issue for the studies in our meta-analysis because in most cases sonoelastography was performed before cytologic or histopathologic evaluation and because sonoelastography results were evaluated as the test was being performed (except in a few studies in which another examiner assessed the sonoelastography results offline after the test).
Other important issues in sonoelastography are operator experience and reliability. Because sonoelastography is an operator-dependent procedure, the results could have been affected by the experience of the operator. In most studies, the training duration and experience of the operator, which can affect the results, were not clear, and the intrarater reliability was not reported. These factors might have affected the results. The classic qualitative sonoelastography examination consists of manual compression of the tissue by the operator; thus, the quality of the examination depends on the compression performance. A good study needs an optimum compression technique. Hence, the method is operator-dependent and the reproducibility of the technique is not established [53].

In recent years, some quantitative approaches have been introduced for sonoelastography that are reproducible and operator-independent [48]. The strain ratio measurement is a quantitative method of assessment in which the elasticity of a lesion is estimated relative to the surrounding tissue. By using strain index values and a discriminatory threshold value, reviewers can categorize nodules as malignant or benign [25]. Another technique is shear-wave elastography. This method uses pushing beams, focused ultrasound pulses, or focused ultrasonic beams to stimulate tissues. The beams are transmitted at increasing depths to generate a shear wave that is propagated transversely in the whole imaging area. According to Young’s modulus formula, the shear-wave propagation velocity correlates with the stiffness of the tissue, and this relationship is the basis of elastography. These data are converted into a color-coded image, and an elastic index (a quantitative measurement) is extracted and expressed in kilopascals [48, 53]. Sebag et al. [48] reported that the mean elastic index in malignant nodules was significantly higher than that in benign nodules (mean ± SD, 150 ± 95 kPa vs 36 ± 30 kPa, respectively) and that the AUC of the receiver operating characteristic curve was 0.936 (95% CI, 0.869–1.000). In addition, at the threshold value of 65 kPa, the sensitivity, specificity, PPV, and NPV of the method for the detection of malignant nodules were 85%, 94%, 80%, and 96%, respectively [48].

The only systematic review assessing the qualitative sonoelastography criteria for thyroid nodule evaluation is a study published in 2010 by Bojunga et al. [33].
They evaluated eight studies, five of which are also included in our study [1, 16, 26, 29, 30]. The other three studies [42, 47, 51] were excluded from our study for reasons such as the use of a different method of evaluation (i.e., carotid pulsation) or a different method of thyroid nodule classification. Bojunga et al. reported their results based on threshold elasticity scores between 2 and 3. They found a pooled sensitivity of 92% and pooled specificity of 90%, which are higher than in our study. In addition to the fact that we assessed different studies in our meta-analysis than those in the Bojunga et al. study, we selected the studies for our meta-analysis using a different method of assessment, which could have contributed to our different results. This latter point (i.e., different method of assessment) could affect the homogeneity of the studies included in the Bojunga et al. meta-analysis.

A shortcoming of our study is the lack of sensitivity analysis and meta-regression. In addition, we did not assess publication bias as a source of bias in the results. By performing meta-regression analyses, differences in baseline characteristics and clinical factors can be addressed and the independent role of each predictor in the overall diagnostic accuracy of sonoelastography can be determined. Any difference in the patients’ characteristics, methodologic quality of the included studies, heterogeneity in the reference standard methods (FNA cytology or histopathology), and other important characteristics can be adjusted for in the meta-analysis.

One important limitation of sonoelastography is that this option is not readily available on all ultrasound machines; thus, sonoelastography is not available in all centers. In fact, sonoelastography is considered an add-on capability of the machine, which makes it expensive.

Designing further studies in this field seems beneficial. We suggest combining sonoelastography with other conventional or Doppler ultrasound indexes to assess whether the addition of sonoelastography could improve the diagnostic efficacy of ultrasound in this setting. In addition, regarding the physical basis of sonoelastography, defining other physical parameters and including them in the assessments could be of value for improving the diagnostic efficacy of sonoelastography.

Conclusion
Sonoelastography is an accurate and noninvasive adjunctive method for the assessment of thyroid nodules.
In nodules with an elasticity score of 1, malignancy can be excluded in almost all cases, and exclusion of these cases could decrease the need for invasive FNA in many patients.

References
1. Rago T, Santini F, Scutari M, Pinchera A, Vitti P. Elastography: new developments in ultrasound for predicting malignancy in thyroid nodules. J Clin Endocrinol Metab 2007; 92:2917–2922
2. Tunbridge WM, Evered DC, Hall R, et al. The spectrum of thyroid disease in a community: the Whickham survey. Clin Endocrinol (Oxf) 1977; 7:481–493
3. Mazzaferri EL. Management of a solitary thyroid nodule. N Engl J Med 1993; 328:553–559
4. Ezzat S, Sarti DA, Cain DR, Braunstein GD. Thyroid incidentalomas: prevalence by palpation and ultrasonography. Arch Intern Med 1994; 154:1838–1840
5. Tan GH, Gharib H. Thyroid incidentalomas: management approaches to nonpalpable nodules discovered incidentally on thyroid imaging. Ann Intern Med 1997; 126:226–231
6. Rago T, Chiovato L, Aghini-Lombardi F, Grasso L, Pinchera A, Vitti P. Non-palpable thyroid nodules in a borderline iodine-sufficient area: detection by ultrasonography and follow-up. J Endocrinol Invest 2001; 24:770–776
7. Vander JB, Gaston EA, Dawber TR. The significance of nontoxic thyroid nodules: final report of a 15-year study of the incidence of thyroid malignancy. Ann Intern Med 1968; 69:537–540
8. Aghini-Lombardi F, Antonangeli L, Martino E, et al. The spectrum of thyroid disorders in an iodine-deficient community: the Pescopagano survey. J Clin Endocrinol Metab 1999; 84:561–566
9. Jemal A, Tiwari RC, Murray T, et al. Cancer statistics, 2004. CA Cancer J Clin 2004; 54:8–29
10. Jemal A, Murray T, Ward E, et al. Cancer statistics, 2005. CA Cancer J Clin 2005; 55:10–30
11. Siegel R, Naishadham D, Jemal A. Cancer statistics, 2012. CA Cancer J Clin 2012; 62:10–29
12. Xing P, Wu L, Zhang C, Li S, Liu C, Wu C. Differentiation of benign from malignant thyroid lesions: calculation of the strain ratio on thyroid sonoelastography. J Ultrasound Med 2011; 30:663–669
13. Cappelli C, Castellano M, Pirola I, et al. Thyroid nodule shape suggests malignancy. Eur J Endocrinol 2006; 155:27–31
14. Papini E, Guglielmi R, Bianchini A, et al. Risk of malignancy in nonpalpable thyroid nodules: predictive value of ultrasound and color-Doppler features. J Clin Endocrinol Metab 2002; 87:1941–1946
15. Tamsel S, Demirpolat G, Erdogan M, et al. Power Doppler US patterns of vascularity and spectral Doppler US parameters in predicting malignancy in thyroid nodules. Clin Radiol 2007; 62:245–251
16. Asteria C, Giovanardi A, Pizzocaro A, et al. US-elastography in the differential diagnosis of benign and malignant thyroid nodules. Thyroid 2008; 18:523–531
17. Peng Y, Wang HH. A meta-analysis of comparing fine-needle aspiration and frozen section for evaluating thyroid nodules. Diagn Cytopathol 2008; 36:916–920
18. Oertel YC, Miyahara-Felipe L, Mendoza MG, Yu K. Value of repeated fine needle aspirations of the thyroid: an analysis of over ten thousand FNAs. Thyroid 2007; 17:1061–1066
19. La Rosa GL, Belfiore A, Giuffrida D, et al. Evaluation of the fine needle aspiration biopsy in the preoperative selection of cold thyroid nodules. Cancer 1991; 67:2137–2141
20. Garra BS. Imaging and estimation of tissue elasticity by ultrasound. Ultrasound Q 2007; 23:255–268
21. Garra BS. Elastography: current status, future prospects, and making it work for you. Ultrasound Q 2011; 27:177–186
22. Itoh A, Ueno E, Tohno E, et al. Breast disease: clinical application of US elastography for diagnosis. Radiology 2006; 239:341–350
23. Reitsma JBRA, Whiting P, Vlassov VV, Leeflang MMG, Deeks JJ. Chapter 9: assessing methodological quality. In: Deeks JJ BP, Gatsonis C, ed. Cochrane handbook for systematic reviews of diagnostic test accuracy, version 100. Oxford, UK: The Cochrane Collaboration, 2009:5–21
24. Bhatia KSS, Rasalkar DP, Lee YP, et al. Cystic change in thyroid nodules: a confounding factor for real-time qualitative thyroid ultrasound elastography. Clin Radiol 2011; 66:799–807
25. Cakir B, Aydin C, Korukluoglu B, et al. Diagnostic value of elastosonographically determined strain index in the differential diagnosis of benign and malignant thyroid nodules. Endocrine 2011; 39:89–98
26. Friedrich-Rust M, Sperber A, Holzer K, et al. Real-time elastography and contrast-enhanced ultrasound for the assessment of thyroid nodules. Exp Clin Endocrinol Diabetes 2010; 118:602–609
27. Gietka-Czernel M, Kochman M, Bujalska K, Stachlewska-Nasfeter E, Zgliczyński W. Real-time ultrasound elastography: a new tool for diagnosing thyroid nodules. Endokrynol Pol 2010; 61:652–657
28. Kagoya R, Monobe H, Tojima H. Utility of elastography for differential diagnosis of benign and malignant thyroid nodules. Otolaryngol Head Neck Surg 2010; 143:230–234
29. Rubaltelli L, Corradin S, Dorigo A, et al. Differential diagnosis of benign and malignant thyroid nodules at elastosonography. Ultraschall Med 2009; 30:175–179
30. Tranquart F, Bleuzen A, Pierre-Renoult P, Chabrolle C, Sam Giao M, Lecomte P. Elastosonography of thyroid lesions. (in French) J Radiol 2008; 89:35–39
31. Wang Y, Dan HJ, Dan HY, Li T, Hu B. Differential diagnosis of small single solid thyroid nodules using real-time ultrasound elastography. J Int Med Res 2010; 38:466–472
32. Xie P, Xiao Y, Liu F. Real-time ultrasound elastography in the diagnosis and differential diagnosis of subacute thyroiditis. J Clin Ultrasound 2011; 39:435–440
33. Bojunga J, Herrmann E, Meyer G, Weber S, Zeuzem S, Friedrich-Rust M. Real-time elastography for the differentiation of benign and malignant thyroid nodules: a meta-analysis. Thyroid 2010; 20:1145–1150
34. Hegedüs L. Can elastography stretch our understanding of thyroid histomorphology? J Clin Endocrinol Metab 2010; 95:5213–5215
35. Welkoborsky HJ. Ultrasound usage in the head and neck surgeon’s office. Curr Opin Otolaryngol Head Neck Surg 2009; 17:116–121
36. Zubarev AV, Bashilov VP, Gazhonova VE, Kartavykh AA, Churkina SO, Selivanov ES. Sonoelastography in differential diagnosis of thyroid nodes. Khirurgiia (Mosk) 2011; 25–28
37. Lyshchik A, Higashi T, Asato R, et al. Elastic moduli of thyroid tissues under compression. Ultrason Imaging 2005; 27:101–110
38. Scacchi M, Andrioli M, Carzaniga C, et al. Elastosonographic evaluation of thyroid nodules in acromegaly. Eur J Endocrinol 2009; 161:607–613
39. Park SH, Kim SJ, Kim EK, Kim MJ, Son EJ, Kwak JY. Interobserver agreement in assessing the sonographic and elastographic features of malignant thyroid nodules. AJR 2009; 193:(web) W416–W423
40. Bae U, Dighe M, Dubinsky T, Minoshima S, Shamdasani V, Kim Y. Ultrasound thyroid elastography using carotid artery pulsation: preliminary study. J Ultrasound Med 2007; 26:797–805
41. Basarab A, Liebgott H, Morestin F, et al. A method for vector displacement estimation with ultrasound imaging and its application for thyroid nodular disease. Med Image Anal 2008; 12:259–274
42. Dighe M, Bae U, Richardson ML, Dubinsky TJ, Minoshima S, Kim Y. Differential diagnosis of thyroid nodules with US elastography using carotid artery pulsation. Radiology 2008; 248:662–669
43. Dighe M, Kim J, Luo S, Kim Y. Utility of the ultrasound elastographic systolic thyroid stiffness index in reducing fine-needle aspirations. J Ultrasound Med 2010; 29:565–574
44. Ding J, Cheng H, Ning C, Huang J, Zhang Y. Quantitative measurement for thyroid cancer characterization based on elastography. J Ultrasound Med 2011; 30:1259–1266
45. Luo S, Kim E-H, Dighe M, Kim Y. Screening of thyroid nodules by ultrasound elastography using diastolic strain variation. Conf Proc IEEE Eng Med Biol Soc 2009; 2009:4420–4423
46. Luo S, Kim E-H, Dighe M, Kim Y. Thyroid nodule classification using ultrasound elastography via linear discriminant analysis. Ultrasonics 2011; 51:425–431
47. Lyshchik A, Higashi T, Asato R, et al. Thyroid gland tumor diagnosis at US elastography. Radiology 2005; 237:202–211
48. Sebag F, Vaillant-Lombard J, Berbis J, et al. Shear wave elastography: a new ultrasound imaging mode for the differential diagnosis of benign and malignant thyroid nodules. J Clin Endocrinol Metab 2010; 95:5281–5288
49. Vorländer C, Wolff J, Saalabian S, Lienenlüke RH, Wahl RA. Real-time ultrasound elastography: a noninvasive diagnostic procedure for evaluating dominant thyroid nodules. Langenbecks Arch Surg 2010; 395:865–871
50. Wilson T, Chen Q, Zagzebski JA, Varghese T, VanMiddlesworth L. Initial clinical experience imaging scatterer size and strain in thyroid nodules. J Ultrasound Med 2006; 25:1021–1029
51. Hong Y, Liu X, Li Z, Zhang X, Chen M, Luo Z. Real-time ultrasound elastography in the differential diagnosis of benign and malignant thyroid nodules. J Ultrasound Med 2009; 28:861–867
52. Rago T, Scutari M, Santini F, et al. Real-time elastosonography: useful tool for refining the presurgical diagnosis in thyroid nodules with indeterminate or nondiagnostic cytology. J Clin Endocrinol Metab 2010; 95:5274–5280
53. Bhatia KS, Tong CS, Cho CC, Yuen EH, Lee YY, Ahuja AT. Shear wave elastography of thyroid nodules in routine clinical practice: preliminary observations and utility for detecting malignancy. Eur Radiol 2012; 22:2397–2406