Eur Radiol (2014) 24:1327–1338 DOI 10.1007/s00330-014-3139-4

MAGNETIC RESONANCE

MR diffusion imaging for preoperative staging of myometrial invasion in patients with endometrial cancer: a systematic review and meta-analysis

Anita Andreano, Gilda Rechichi, Paola Rebora, Sandro Sironi, Maria Grazia Valsecchi, Stefania Galimberti

Received: 31 October 2013 / Revised: 27 January 2014 / Accepted: 19 February 2014 / Published online: 26 March 2014
© European Society of Radiology 2014

Abstract
Objectives To compare the diagnostic accuracy of dynamic contrast-enhanced (DCE) and diffusion-weighted (DW) MR imaging in detecting deep myometrial invasion in endometrial cancer, using surgical-pathological staging as reference standard.
Methods After searching a wide range of electronic databases and screening titles/abstracts, we obtained full papers for potentially eligible studies and evaluated them according to predefined inclusion criteria. Quality assessment was conducted by adapting the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) checklist. From each study, we extracted information on the diagnostic performance of DW and DCE sequences. After exploring heterogeneity, we adopted a bivariate generalized linear mixed model to compare the effect of the two MR sequences jointly on sensitivity and specificity.
Results Nine studies (442 patients) were considered. Significant evidence of heterogeneity was found only for specificity, in both DW and DCE imaging (I² = 70.8 % and 70.6 %). Pooled sensitivity was 0.86 for both DW and DCE, and specificity did not differ significantly (p = 0.16) between the two sequences (DW = 0.86 and DCE = 0.82). No difference was found between 3-T and 1.5-T MR. There was no evidence of publication bias.
Conclusions MR diagnostic accuracy in presurgical detection of deep myometrial infiltration in endometrial cancer is high. DCE and DW imaging do not differ in sensitivity and specificity.
Key Points
• Myometrial invasion is the most important morphological prognostic feature of endometrial cancer
• MR diagnostic accuracy in presurgical detection of deep myometrial infiltration is high
• MR examination including T2 and DCE imaging is considered the reference standard
• DW imaging has been increasingly employed with heterogeneous results
• This meta-analysis shows that DCE and DW do not differ in diagnostic accuracy
Keywords Endometrial neoplasms · Magnetic resonance imaging · Diffusion magnetic resonance imaging · Meta-analysis · Review

Electronic supplementary material The online version of this article (doi:10.1007/s00330-014-3139-4) contains supplementary material, which is available to authorized users.

A. Andreano · P. Rebora · S. Sironi · M. G. Valsecchi (*) · S. Galimberti
Center of Biostatistics for Clinical Epidemiology, Department of Health Sciences, University of Milano-Bicocca, Via Cadore 48, Monza, MB 20900, Italy
e-mail: [email protected]

G. Rechichi · S. Sironi
Department of Radiology, S. Gerardo Hospital, Via Pergolesi 33, Monza, MB 20900, Italy

Abbreviations and acronyms
ADC apparent diffusion coefficient
AIC Akaike information criterion
CI confidence interval
CT computed tomography
DCE dynamic contrast-enhanced
DOR diagnostic odds ratio
DW diffusion-weighted
ESS effective sample size
FIGO International Federation of Gynecology and Obstetrics
GLMM generalized linear mixed model
LR+ and LR− positive and negative likelihood ratio
MR magnetic resonance
QUADAS-2 Quality Assessment of Diagnostic Accuracy Studies-2
ROC receiver operating characteristic
SNR signal to noise ratio
US ultrasound

Introduction

Endometrial cancer is the sixth most frequent cancer in women worldwide, with about 290,000 new cases and 74,000 deaths in 2008 [1]. It is the most common gynaecological malignancy in western countries, with 49,560 estimated new cases in 2013 in the USA and 98,900 in 2012 in Europe [2, 3]. Well-known negative prognostic factors are high tumour grade, deep myometrial invasion (≥50 % of myometrial thickness), lymphovascular space invasion, non-endometrioid histology, and cervical stroma involvement [4]. Among these, the most important single morphological prognostic feature is the depth of myometrial invasion with a one-half cut-off, which divides the current stage I of the International Federation of Gynecology and Obstetrics (FIGO) staging system into Ia and Ib [5, 6]. Deep myometrial invasion is associated with both pelvic lymph node involvement and extension into the parametrium [7]. To assess the depth of myometrial invasion, a number of imaging procedures have been applied, including ultrasound (US), computed tomography (CT) and magnetic resonance (MR). MR is considered superior to US and CT for the evaluation of myometrial invasion [8]. The essential sequences for determining the extent of myometrial invasion are high spatial resolution T2 images acquired in multiple planes [9]. However, the detection of deep myometrial invasion at non-contrast MR has been associated with some pitfalls, including the presence of large leiomyomas, adenomyosis, and extension of the tumour into the cornua [10]. Dynamic contrast-enhanced (DCE) MR imaging helps to determine myometrial invasion in cases that are difficult to interpret on T2 imaging [9]. At present, an MR examination including T2 and DCE imaging is considered the reference standard for preoperative detection of deep myometrial invasion in women with endometrial cancer [9, 11, 12].
Nevertheless, some studies reported sensitivities and specificities as low as 72 % and 44 % and a modest interobserver agreement for DCE MR [13, 14]. Diffusion-weighted (DW) imaging is increasingly used as an add-on to T2 and DCE MR imaging [15, 16], although it is not yet included in the current imaging guidelines, updated as of 2011 [9, 12]. DW sequences are able to measure the random motion of water in tissues, thus helping to characterize, stage and predict aggressiveness and response to therapy of several malignancies [17, 18]. Endometrial carcinoma exhibits restricted diffusion compared to normal myometrial tissue and therefore appears hyperintense at high b values (500–1,000 s/mm²). The recently raised concerns about gadolinium contrast administration in patients with impaired renal function have given a supplemental reason to investigate the role of DW imaging in evaluating the depth of myometrial invasion [19]. In addition, not all tumours are hypovascular relative to the myometrium; in those cases DW imaging, which is essentially independent of differences in vascularity, can be useful to determine the depth of myometrial invasion. At the end of the 1990s, two meta-analyses investigated preoperative radiological staging of myometrial invasion, one comparing different imaging techniques and the other assessing the role of DCE-MR imaging [8, 11]. As DW imaging has been increasingly employed in the last decade in studies with relatively small sample sizes, there is a need for an updated systematic review including this new-generation technique. The purpose of this systematic review is to compare the diagnostic accuracy of DCE MR with that of DW (±T2) MR imaging in the preoperative detection of deep myometrial invasion in patients with endometrial cancer, using surgical-pathological staging as the reference standard.

Materials and methods

Data sources and searches

One of the authors (A.A.) identified studies by searching a wide range of electronic databases, also covering ongoing studies and the so-called grey literature [20, 21], up to 6 May 2013: CINAHL (via EBSCOhost, 1937 to date of search), Cochrane (all content to date of search), Embase (1980 to date of search), Global Health and Global Health Library Regional databases (1973 to date of search, excluding Pubmed), Pubmed (MEDLINE, 1950 to date of search), OpenGrey (all content to date of search), and Web of Science, including Conference Proceedings Citation Index (via Web of Knowledge, 1900 to date of search, excluding Pubmed). As recommended [22, 23], we did not use methodological filters in database searches in order to be as sensitive as possible. The search included terms capturing the concepts of endometrial cancer (target condition) and magnetic resonance with DW sequences and DCE acquisition (index and comparator tests) [24]. No language limits were applied. As a sample, the Medline search is presented in E1. Reference lists of full-text articles and reviews identified through electronic databases were scanned for additional studies.

Study selection and data collection process

One reviewer (A.A.) screened the titles and abstracts identified by the searches to exclude any obviously irrelevant article. Full papers were obtained for potentially eligible studies, and two reviewers (A.A., G.R.) independently applied the following inclusion criteria to the full text.

• Types of studies: both retrospective and prospective cohort studies.
• Participants: adult women with biopsy-proven primary adenocarcinoma of the endometrium, undergoing preoperative staging prior to surgery. Patients with any stage of the disease were included.
• Target condition: presurgical detection of deep myometrial invasion in primary endometrial adenocarcinoma.
• Index test: MR with DW sequences, alone or in combination with T2-weighted acquisitions, regardless of the applied protocol of acquisition.
• Comparator test: MR with DCE sequences, regardless of the applied protocol of acquisition.
• Reference standard: pathological assessment of the presence of deep myometrial invasion on the uterus removed at surgery.
• Type of comparison: we included only studies with a direct comparison of the index and comparator test, either fully paired or randomized.
• Minimum data required: presence of results sufficient to construct the 2×2 table of diagnostic performance for each MR test.

Diagnostic accuracy results, and additional useful information on patients and procedures, were retrieved from the selected primary studies independently by the same two reviewers (A.A., G.R.). The data collection form is available as Supplementary Material (E2). Disagreements generated in the process of study selection and data collection were resolved by consensus [25].

Risk of bias in individual studies

Quality assessment was conducted by adapting the tool provided by the Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) to this particular review [26]. The QUADAS-2 form is composed of four domains: (1) patient selection, (2) index test, (3) reference standard and (4) flow and timing. We supplied an additional domain (2b) to evaluate the comparator test, with the same items used for the index test. For each domain, the risk of bias and concerns about applicability (the latter not applying to the flow and timing domain) were analysed and rated as low, high or unclear risk. The results of the quality assessment were used for descriptive purposes to provide an evaluation of the overall quality of the included studies and to investigate potential sources of heterogeneity [26]. Two reviewers (A.A., G.R.) independently evaluated the methodological quality, using a standard form containing the quality assessment criteria (E3) and a flow diagram, and resolved disagreements by discussion.

Statistical analysis

We extracted or derived information on the diagnostic performance of both DW and DCE sequences from each primary study. All studies considered as positive, either at pathology or at MR, patients with deep myometrial invasion, i.e. with at least 50 % of the myometrium infiltrated, according to the FIGO classification [27]. For five studies, results of two independent readers were available. We describe all findings in full, but for inference we considered only the results of one (randomly selected) reader from each study. We graphically explored heterogeneity by drawing forest plots of the sensitivity and specificity of each primary study and by plotting them in the receiver operating characteristic (ROC) space, the latter to explore whether any heterogeneity could be attributed to an implicit threshold effect. We then formally assessed the presence of heterogeneity by means of a test on the Q statistic and calculated the I² index [28, 29]. In order to compare the effect of the two different MR sequences jointly on sensitivity and specificity, we considered a bivariate generalized linear mixed model (GLMM) with Gaussian random effects. We adopted the logit link, as it had the smallest Akaike information criterion (AIC) when compared to the probit and the complementary log–log links [30]. Summary results were also reported in terms of positive and negative likelihood ratios (LR+ and LR−) and of diagnostic odds ratios (DOR), together with their 95 % confidence intervals (CI). P values based on the likelihood ratio test were provided (α = 0.05, two-sided). We conducted sensitivity analyses by excluding retrospective investigations, studies with unclear or high risk of bias in the patient selection or in the flow and timing QUADAS-2 domains, and outliers. We also evaluated the robustness of the model with respect to different selections of the readers.
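The per-study quantities named above (sensitivity, specificity, LR+, LR−, DOR and I²) can be illustrated with a minimal Python sketch. It is not the authors' code: their analyses were run in SAS, and their pooled estimates come from the bivariate GLMM, which this sketch does not fit. The function names, the Wald interval on the log(DOR) scale and the example counts are assumptions chosen for illustration only.

```python
import math


def diagnostic_summary(tp, fp, fn, tn, z=1.96):
    """Accuracy measures from a single 2x2 table.

    Assumes all four cell counts are non-zero; zero cells would need a
    continuity correction, which is not applied here.
    """
    sens = tp / (tp + fn)
    spec = tn / (tn + fp)
    lr_pos = sens / (1.0 - spec)      # positive likelihood ratio
    lr_neg = (1.0 - sens) / spec      # negative likelihood ratio
    dor = (tp * tn) / (fp * fn)       # diagnostic odds ratio = LR+ / LR-
    # Wald interval on the log scale (an illustrative choice of CI method)
    se_log_dor = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    dor_ci = (math.exp(math.log(dor) - z * se_log_dor),
              math.exp(math.log(dor) + z * se_log_dor))
    return {"sensitivity": sens, "specificity": spec,
            "lr_pos": lr_pos, "lr_neg": lr_neg,
            "dor": dor, "dor_ci": dor_ci}


def i_squared(q, k):
    """I^2 index (in %) from Cochran's Q over k studies, truncated at 0."""
    df = k - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


# Hypothetical counts, not taken from any included study
summary = diagnostic_summary(tp=18, fp=5, fn=2, tn=75)
print(summary["sensitivity"], summary["specificity"], summary["dor"])
```

Because DOR equals LR+ divided by LR−, the three summary measures carry overlapping information; the sketch computes them separately only to mirror how they are reported.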
We performed a graphical evaluation of reporting bias using a funnel plot specifically designed for reviews of diagnostic test accuracy, representing the log(DOR) of each study against the inverse of the square root of the effective sample size, $\mathrm{ESS} = 4 n_1 n_2 / (n_1 + n_2)$ [31], where $n_1$ and $n_2$ are the numbers of subjects without and with myometrial invasion, respectively. We then performed the test on asymmetry proposed by Deeks [31] on the slope of an ESS-weighted regression of log(DOR) against $1/\sqrt{\mathrm{ESS}}$, which is a modified version of Sterne's test [32]. All the analyses were performed with SAS 9.3 (SAS Institute, Cary NC) and graphs produced using R 2.15.3.
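The funnel-plot coordinates and the ESS-weighted regression described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' SAS/R implementation: the helper names are hypothetical, the 0.5 continuity correction in `log_dor` is an assumption, the example tables are invented, and the significance test on the slope is omitted.

```python
import math


def effective_sample_size(n1, n2):
    """ESS = 4*n1*n2 / (n1 + n2), n1 and n2 being the two group sizes."""
    return 4.0 * n1 * n2 / (n1 + n2)


def log_dor(tp, fp, fn, tn, correction=0.5):
    """log diagnostic odds ratio; the continuity correction added to every
    cell is an illustrative assumption (set correction=0.0 to disable)."""
    a, b, c, d = (x + correction for x in (tp, fp, fn, tn))
    return math.log((a * d) / (b * c))


def weighted_line(xs, ys, ws):
    """Weighted least-squares fit of y = intercept + slope * x."""
    sw = sum(ws)
    xbar = sum(w * x for w, x in zip(ws, xs)) / sw
    ybar = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - xbar) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - xbar) * (y - ybar) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


def deeks_regression(tables):
    """Regress log(DOR) on 1/sqrt(ESS) with weights ESS.

    tables: list of (tp, fp, fn, tn) tuples, one per study.
    Returns (intercept, slope); asymmetry is judged from the slope.
    """
    xs, ys, ws = [], [], []
    for tp, fp, fn, tn in tables:
        ess = effective_sample_size(fp + tn, tp + fn)  # without / with invasion
        xs.append(1.0 / math.sqrt(ess))
        ys.append(log_dor(tp, fp, fn, tn))
        ws.append(ess)
    return weighted_line(xs, ys, ws)


# Hypothetical 2x2 tables, not data from the included studies
intercept, slope = deeks_regression([(18, 5, 2, 75), (10, 3, 4, 30), (6, 2, 1, 40)])
print(intercept, slope)
```

A slope close to zero suggests a symmetric funnel; a formal test would additionally need the standard error of the slope, which this sketch leaves out.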


Results

Search results

The electronic search provided a total of 7,579 citations. The flow chart summarizing the selection process is shown in Fig. 1. After removal of duplicate records, 5,740 citations were left. Of these, 5,724 were discarded because it was clear from the title or abstract that they did not meet either the subject of the review or the inclusion criteria. We examined the full text of the remaining 16 articles. Seven studies [33–39] were discarded because they did not meet the inclusion criteria (see Fig. 1 for details) and the remaining nine studies were the object of the review and meta-analysis [40–48]. No additional studies were found from references cited in the papers included in the review or in the six narrative reviews on the role of diffusion-weighted imaging in gynaecological or endometrial cancer retrieved during the search [14–16, 49–51].

Fig. 1 Flow diagram of the study selection process

Methodological quality of included studies

The graphical display of the risk of bias and of the concerns regarding applicability of the selected studies, evaluated according to the predefined criteria, is reported in Fig. 2a, b.

Concerning the patient selection domain, two studies did not explicitly report that the patients were consecutive [43, 45], but at least in the report by Rechichi et al., it was a matter of under-reporting. Similarly, in two studies [40, 45] it was not explicitly reported that the MR sequences of both the index and the comparator test were interpreted without knowledge of the results of the surgical-pathological evaluation. All studies adopted the same threshold to define deep myometrial invasion (i.e. invasion deeper than 50 % of myometrial thickness) and prespecified it. Four studies detailed the procedure adopted to measure myometrial invasion at MR imaging [40, 41, 45, 46]. Regarding applicability concerns for the index test, all studies met the predefined criteria for DW imaging, i.e. used at least one low and one high b value. The highest b value used was either 800 or 1,000 s/mm² in all but one study [43], which used 500 s/mm². For contrast-enhanced sequences, two studies did not provide enough information to evaluate the minimum required criterion of at least one acquisition between 120 and 180 s [41, 45]. There were no concerns about reference standard applicability, as all studies applied pathological evaluation of the removed uterus. Nevertheless, six studies did not explicitly state that pathological evaluation of deep myometrial invasion was blinded to MR results, although deficient reporting also has to be considered a potential explanation in this case [40, 41, 44, 46–48]. For the flow and timing domain, three studies omitted from the 2×2 table of diagnostic results the data of one [42] or two patients [43, 44], because of insufficient quality of DW imaging. These patients either had hip prostheses or were obese and should probably have been excluded from the study samples, as is often the case in pelvic MR studies, where such cases represent relative contraindications. One study reported a high number of patients (n=11) whose examinations were not evaluated because the patients were either claustrophobic or unable to cooperate and the examination was prematurely interrupted [43]. Finally, two studies did not report the interval between MR examination and surgery [40, 44] and two allowed an interval longer than 40 days between examination and surgery [41, 48]. One additional concern was the absence of a description of the adopted consensus procedure in two studies stating that two readers evaluated all imaging, but reporting only one set of true positive (TP), false negative (FN), false positive (FP) and true negative (TN) results [40, 46]. Overall, none of the nine studies included in the systematic review showed such methodological flaws as to be excluded from the meta-analysis.

Fig. 2 Methodological evaluation according to QUADAS-2 of the included studies a overall and b by study

Characteristics of included studies

A total of 442 patients, recruited in academic diagnostic studies in periods ranging from August 2005 to March 2012, were included in the final analyses. Six out of the nine studies [40, 42, 43, 46–48] were prospective. The main patient characteristics of primary studies are reported in Table 1. The mean age was similar across studies (range 52–65 years). Only three studies provided the percentage of menopausal women (Hori et al. [48], 76 %; Dogan et al. [46], 100 %; and Rechichi et al. [43], 77 %). Endometrioid tumours were prevalent, with percentages between 75 % and 94 %, values comparable with those reported in the FIGO 2006 annual report (87 %) [52]. The distribution of grading varied greatly among studies, with G1 ranging from 25 % to 78 %. No patients underwent medical oncological therapy between MR and surgery, and the median maximum interval between MR imaging and surgery was 21 days (I–III quartiles, 18–52 days). The technical aspects of the MR protocols are detailed in Table 2. Four studies used 3-T MR [42, 46–48], four used 1.5-T MR [40, 43–45] and one study used both, but with a prevalence of 1.5 T [41]. All studies used a surface coil, and the number of channels varied from 4 to 32; one study did not report the number of channels [47]. Three out of the four studies performed at 3 T interpreted DW using fused T2 and DW images [42, 47, 48], while four studies read T2 and DW images together to interpret DW [41, 44, 45], and the remaining two interpreted DW imaging alone [43, 46].

Table 1 Clinical characteristics of patients in the included studies

| Study | Number of patients (n) | Myometrial invasion, n (%) | Age, mean (years) | Age, min–max (years) | Endometrioid (%) | Grading G1 (%) | Grading G2 (%) | Grading G3 (%) |
|---|---|---|---|---|---|---|---|---|
| Shen et al. 2008 | 21 | 4 (19) | 56 | 33–82 | 90 | 43 | 48 | 9 |
| Takeuchi et al. 2009 | 33 | 13 (39) | 57 | 24–85 | 91 | 78 | 9 | 13ᵃ |
| Lin et al. 2009 | 48 | 7 (15) | 57 | 25–80 | 90 | 62 | 19 | 19 |
| Rechichi et al. 2010 | 47 | 13 (28) | 63 | 36–84 | 94 | 28 | 49 | 23 |
| Beddy et al. 2012 | 48 | 17 (35) | 65 | 43–89 | 75 | 40 | 25 | 35 |
| Ren et al. 2012 | 94 | 32 (34) | 54 | 39–70 | – | – | – | – |
| Dogan et al. 2013 | 28 | 7 (33) | 63 | 46–87 | 79 | 25 | 43 | 32 |
| Seo et al. 2013 | 52 | 6 (12) | 52 | 29–75 | 94 | 56 | 22 | 22ᵃ |
| Hori et al. 2013 | 72 | 19 (27) | 58 | 31–82 | 82 | 48 | 22 | 30ᵃ |

ᵃ All non-endometrioid tumours are included in the G3 group

Concerning potential pitfalls for MR imaging, only four studies (44 %) described benign pathology found at MR examination [41–44]. Three works (33 %) reported the use of antispastic drugs before examination, one study reported that no preparation was employed and the other five papers did not provide any information. No adverse events were reported during MR examinations.

Table 2 Technical characteristics of the MR protocols used in the included studies

T2-weighted MR

| Study | Tesla | Type | Thickness (mm) | FOV (mm) | Matrix |
|---|---|---|---|---|---|
| Shen et al. 2008 | 1.5 | TSE | 5 | 200 | 512×256 |
| Takeuchi et al. 2009 | 1.5–3ᵃ | TSE | 8 | – | – |
| Lin et al. 2009 | 3 | – | 4 | 200 | 256×320 |
| Rechichi et al. 2010 | 1.5 | TSE | 3 | 220 | 368×215 |
| Beddy et al. 2012 | 1.5 | TSE | 5 | 240 | 384×256 |
| Ren et al. 2012 | 1.5 | TSE | 4 | 300 | 256×256 |
| Dogan et al. 2013 | 3 | SS | 4 | 210 | 112×125 |
| Seo et al. 2013 | 3 | TSE | 4 | 360 | 800×690 |
| Hori et al. 2013 | 3 | TSE | 4 | 200 | 512×512 |

Diffusion-weighted MR

| Study | Max b value (s/mm²) | Thickness (mm) | FOV (mm) | Matrix | T2 fused |
|---|---|---|---|---|---|
| Shen et al. 2008 | 1,000 | 5 | 240 | 128×256 | No |
| Takeuchi et al. 2009 | 800 | 8 | 400 | 128×192 | No |
| Lin et al. 2009 | 1,000 | 4 | 200 | 128×128 | Yes |
| Rechichi et al. 2010 | 500 | 6 | 400 | 144×75 | No |
| Beddy et al. 2012 | 800 | 4.5 | 280 | 128×128 | No |
| Ren et al. 2012 | 800 | 4 | 380 | 128×128 | No |
| Dogan et al. 2013 | 1,000 | 4 | 210 | 112×125 | No |
| Seo et al. 2013 | 1,000 | 5 | 350 | 128×108 | Yes |
| Hori et al. 2013 | 1,000 | 4 | 200 | 96×96 | Yes |

Contrast-enhanced MR

| Study | Type | Thickness (mm) | FOV (mm) | Matrix | Contrastᵇ |
|---|---|---|---|---|---|
| Shen et al. 2008 | 3D GE | 5 | – | 256×224 | – |
| Takeuchi et al. 2009 | – | 8 | – | – | GD |
| Lin et al. 2009 | GE | 4 | 200 | 256×320 | GD |
| Rechichi et al. 2010 | 3D GE | 2 | 395 | 256×196 | GD |
| Beddy et al. 2012 | 3D GE | 4 | 360 | 288×192 | – |
| Ren et al. 2012 | 3D GE | – | – | – | – |
| Dogan et al. 2013 | 3D GE | 4 | 210 | 112×125 | Gadodiamide |
| Seo et al. 2013 | 3D GE | 5 | 250 | 208×206 | Gadobutrol |
| Hori et al. 2013 | 3D GE | 4 | 200 | 320×192 | Gadodiamide |

FOV field of view, TSE turbo spin-echo, SS single-shot, GE gradient echo, GD gadopentetate dimeglumine
ᵃ Two MR systems with different magnetic field strengths were used, but the majority of patients were imaged at 1.5 T
ᵇ Contrast was at the dose of 0.1 mmol/kg in all the studies

Diagnostic performance of DW and CE MR for deep myometrial invasion detection

Individual study results are shown by type of MR sequence and in chronological order on the forest plots in Fig. 3. The uncertainty in sensitivity estimates seems to be due to the combined effect of the low prevalence of myometrial invasion (ranging from 12 to 39 %, Table 1) and the small sample sizes. Significant evidence of heterogeneity was found for specificity, but not for sensitivity, both for diffusion-weighted imaging (Q=9.1, p=0.77 for sensitivity and Q=41.2, p