JOURNAL OF CLINICAL ONCOLOGY
Volume 24, Number 28, October 1, 2006

ORIGINAL REPORT

Genomic Prediction of Locoregional Recurrence After Mastectomy in Breast Cancer

Skye H. Cheng, Cheng-Fang Horng, Mike West, Erich Huang, Jennifer Pittman, Mei-Hua Tsou, Holly Dressman, Chii-Ming Chen, Stella Y. Tsai, James J. Jian, Mei-Chin Liu, Joseph R. Nevins, and Andrew T. Huang

From the Departments of Radiation Oncology, Research, Laboratory and Pathology, Surgery and Medical Oncology, Koo Foundation Sun Yat-Sen Cancer Center, Taipei, Taiwan; Departments of Radiation Oncology, Surgery, Medicine, and Biostatistics and Bioinformatics, Duke University Medical Center; and the Institute of Statistics and Decision Sciences, and the Institute for Genome Sciences and Policy, Duke University, Durham, NC.

Submitted May 2, 2005; accepted July 31, 2006.

Supported by Synpac; the Koo Foundation Sun Yat-Sen Cancer Center Research Fund; and by National Science Foundation (US) Grants No. NSF DMS-0102227 and 0112340.

Authors' disclosures of potential conflicts of interest and author contributions are found at the end of this article.

Address reprint requests to Skye H. Cheng, MD, Department of Radiation Oncology, Koo Foundation Sun Yat-Sen Cancer Center, No. 125, Lih-Der Road, Pei-Tou District, Taipei, Taiwan; e-mail: [email protected].

© 2006 by American Society of Clinical Oncology

0732-183X/06/2428-4594/$20.00

ABSTRACT

Purpose
This study explores gene expression profiles that are associated with locoregional (LR) recurrence in breast cancer after mastectomy.

Patients and Methods
A total of 94 breast cancer patients who underwent mastectomy between 1990 and 2001 and had a DNA microarray study of their primary tumor tissue were chosen for this study. Eligible patients had either any LR recurrence (n = 27) or no evidence of LR recurrence, without postmastectomy radiotherapy (PMRT), after a minimum of 3 years of follow-up (n = 67). They were randomly split into training and validation sets. Statistical classification tree analysis and proportional hazards models were developed to identify and validate gene expression profiles that relate to LR recurrence.

Results
Our study demonstrates two sets of gene expression profiles (one with 258 genes and the other with 34 genes) to be of predictive value with respect to LR recurrence. The overall accuracy of the prediction tree models in the validation set is estimated at 75% to 78%. Among patients in the validation data set, the 3-year LR control rate is 91% for those with a predictive index of more than 0.8 derived from the 34-gene prediction models and 40% for those with a predictive index of 0.8 or less (P = .008). Multivariate analysis of all patients reveals that estrogen receptor status and the genomic predictive index are independent prognostic factors that affect LR control.

Conclusion
Using gene expression profiles to develop prediction tree models effectively identifies breast cancer patients who are at higher risk for LR recurrence. This gene expression-based predictive index can be used to select patients for PMRT.

J Clin Oncol 24:4594-4602. © 2006 by American Society of Clinical Oncology

INTRODUCTION

DOI: 10.1200/JCO.2005.02.5676

Breast cancer is a heterogeneous disease resulting from the acquisition of probably multiple somatic mutations which, in combination, define the characteristics of the tumor.1,2 Patients even within the same clinical stage carry a different risk of locoregional (LR) recurrence and distant metastasis. In clinical practice, the strategy to reduce LR recurrence is to use postoperative radiotherapy, whereas the strategy to diminish distant metastasis is to use systemic adjuvant chemotherapy. It is generally accepted that patients with involvement of four or more axillary lymph nodes should be treated with postmastectomy radiotherapy (PMRT).3 Whether patients with fewer than four positive nodes should be treated with PMRT remains controversial,4 although three large, randomized control trials have proven the benefit of PMRT in node-positive patients.5-7 The LR recurrence rate at 10 years in node-negative patients is less than 5%, and for one to three nodes approximately 13%.8 Therefore, at least 87% of the patients would potentially be free from LR recurrence after mastectomy and would not require PMRT, whereas those at risk would potentially benefit from it. Because uncertainties continue to prevail about the effectiveness of PMRT in patients with one to three positive nodes,4 recent progress in genomic analysis as a potential tool for evaluating tumor biology opens a new possibility to improve risk stratification that would eventually lead to more personalized prognostication for this subset of patients.9 We and others have reported that gene expression signatures in breast cancer are associated with tumor phenotype, axillary lymph node invasion, and distant metastasis.10-12 However, the association


Information downloaded from www.jco.org and provided by KOO FOUNDATION on September 28, 2006 from 61.218.51.18. Copyright © 2006 by the American Society of Clinical Oncology. All rights reserved.

Genomic Prediction of LR Recurrence in Breast Cancer

between gene expression profiles and LR recurrence has thus far not been defined in patients after mastectomy. The identification of individuals at risk for LR failure is crucially important because accurate prediction of failure patterns will immediately influence adjuvant treatment decisions. Given the complexity of breast cancer, it would be surprising if single genes or small combinations of genes could describe and ultimately predict the clinical course of the disease. Our previous work developed the concept of “metagene,” a subgroup of genes that are clustered together because of their similarity in gene function or their sharing the same pathway. This concept also has been demonstrated by others.13 Also, the 70-gene expression signature related to distant metastasis in breast cancer reported in the literature actually included unrelated sets of genes of equal contributions to the prediction of survival.14 That observation signifies further the importance of the metagene concept. We also use classification and regression tree analysis as a mechanism to sample from these metagenes to build predictive models that can best predict the clinical outcome. The logic in the approach is conceptually simple, recognizing the limitation of any one profile to go beyond a broad categorization into low risk versus high risk, and thus making use of multiple profiles to further dissect subgroups based on prediction of risk. In this study, we aim to identify clusters of genomic signatures meaningful for LR recurrence and to identify individuals at low recurrence risk for whom PMRT can be avoided. 
PATIENTS AND METHODS

Patients
One hundred fifty-eight patients with invasive breast cancer were collected in a series of collaborative studies between Duke University Medical Center (DUMC; Durham, NC) and Koo Foundation Sun Yat-Sen Cancer Center (KF-SYSCC; Taipei, Taiwan) for identification of gene expression profiles that were associated with axillary lymph node involvement and tumor recurrence.12,15 The present study focused on identifying gene expression profiles that relate to LR control after mastectomy. Ninety-four of these patients were enrolled in this study. Eligible patients had either any LR recurrence (n = 27) or no evidence of LR recurrence, without PMRT, after a minimum of 3 years of follow-up (n = 67). Patients were excluded if they had breast-conserving surgery (n = 17), PMRT (n = 36), or follow-up of less than 3 years (n = 11). The LR recurrent tumor sites included 14 on the chest wall, one in the axilla, one in the internal mammary chain, six in the supraclavicular fossa, and five in multiple sites.

Samples and Microarray Analysis
The 94 frozen tissue samples came from the surgical specimens of the primary tumor taken from patients before treatment. These tissue samples matched the prospectively collected database from the patients enrolled in this study in the period between 1990 and 2001. The institutional review boards of Duke University and KF-SYSCC approved this study. Total RNA was extracted from tumor tissues with Qiagen RNEasy kits (Venlo, the Netherlands), and assessed for quality with an Agilent (Palo Alto, CA) Lab-on-a-Chip 2100 Bioanalyzer. Hybridization targets (probes for hybridization) were prepared from total RNA according to standard Affymetrix (Santa Clara, CA) protocols.

Statistical Analysis to Identify Gene Expression Profiles of LR Recurrence
The strategy and process to identify gene clusters associated with LR recurrence after mastectomy are shown in Figure 1. The 94 patients were stratified by clinical risk factors (tumor size and axillary lymph node metastasis) and randomly split 2:1 into a training data set (n = 62) and a validation data set (n = 32). Clinical characteristics of patients in the training and validation data sets were not significantly different (Table 1).
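The stratified 2:1 random split described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `stratified_split`, the record fields, the seed, and the synthetic cohort are all assumptions made for the example, and the exact 62/32 division reported in the paper depends on the real stratum sizes.

```python
import random
from collections import defaultdict


def stratified_split(patients, strata_keys, train_fraction=2 / 3, seed=0):
    """Shuffle patients within each stratum and send ~train_fraction to training."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for patient in patients:
        # A stratum is one combination of the clinical risk factors.
        strata[tuple(patient[key] for key in strata_keys)].append(patient)

    train, valid = [], []
    for key in sorted(strata):
        members = strata[key][:]
        rng.shuffle(members)
        n_train = round(len(members) * train_fraction)
        train.extend(members[:n_train])
        valid.extend(members[n_train:])
    return train, valid


# Hypothetical cohort of 94 synthetic records (not the study data).
patients = [
    {"id": i, "size_gt_2cm": i % 2 == 0, "node_group": ("0", "1-3", ">=4")[i % 3]}
    for i in range(94)
]
train, valid = stratified_split(patients, ["size_gt_2cm", "node_group"])
print(len(train), len(valid))
```

Splitting within strata keeps the proportions of tumor size and nodal categories roughly equal in both sets, which is the property Table 1 is meant to confirm.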

Fig 1. Strategy and process to identify gene clusters associated with locoregional control. LRR, locoregional recurrence.

Our previous study identified 496 metagene clusters12; each metagene represented the key common pattern of expression of the genes in a cluster based on k-means clustering. These represented subsets of potentially related genes, which rendered the accuracy of recurrence prediction more robust. We then used the logistic regression method to examine the significance of each metagene individually in differentiating patients with and without LR recurrence in the training data set. We then used classification trees and Bayesian statistical methods as previously described12 to explore multiple metagenes for optimal prediction. The analysis entailed the successive partitioning of patient samples—and by inference the populations they represented—into more and more homogeneous subgroups, and the association and estimation of survival distributions within each subgroup. The metagenes as genetic predictors were in many aspects similar to the 70-gene predictor that classified breast cancer into “good” and “poor” signature, with metagenes separating homogenous outcome groups into subsets, and then into further subsets of subsets, always refining the risk on the way.16 The statistical test used was a Bayes factor test that is generally conservative relative to standard significance tests and so tends to generate less elaborate trees than traditional tree programs.17,18 The Bayes factor in this study was 2.9, which corresponded approximately to probability of 0.95. The growth of classification trees was terminated when no additional metagene could be selected that allowed a significant further split. Multiple possible splits generated collections of trees, and each was then formally evaluated based on statistical fit to the data. Multiple trees were generated automatically by MATLAB software (The MathWorks Inc, Natick, MA). 
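The correspondence between a Bayes factor of 2.9 and a probability of about 0.95 can be checked with a short calculation. The sketch below assumes that 2.9 is a natural-log Bayes factor and that the prior odds are even; neither assumption is stated explicitly in the text above, and the function name is illustrative.

```python
import math


def posterior_probability(log_bayes_factor, prior_odds=1.0):
    """Convert a natural-log Bayes factor into a posterior probability."""
    posterior_odds = prior_odds * math.exp(log_bayes_factor)
    return posterior_odds / (1.0 + posterior_odds)


p_split = posterior_probability(2.9)
print(round(p_split, 3))
```

Under these assumptions the posterior odds are roughly 18 to 1 in favor of a split, which is why the threshold is conservative compared with a conventional significance test.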
Each classification tree generated predictions for future patients: A new patient was assigned to a unique subset of any one classification tree based on her genomic profiles and other factors, with the corresponding prediction of recurrence


Cheng et al

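Table 1. Clinical Characteristics of All Patients and Patients With and Without LR Recurrence

| Characteristic | Training Samples (n = 62): No LRR (n = 44) | Training Samples (n = 62): Any LRR (n = 18) | Validation Samples (n = 32): No LRR (n = 23) | Validation Samples (n = 32): Any LRR (n = 9) | χ² P |
|---|---|---|---|---|---|
| Follow-up, months | | | | | |
| Median | 60 (training set) | | 66 (validation set) | | |
| Range | 6-115 (training set) | | 8-133 (validation set) | | |
| Age, years | | | | | .94 |
| ≤ 40 | 13 | 5 | 4 | 3 | |
| 41-50 | 11 | 3 | 7 | 4 | |
| > 50 | 20 | 10 | 12 | 2 | |
| Tumor size, cm | | | | | .68 |
| ≤ 2.0 | 20 | 6 | 9 | 3 | |
| > 2.0 | 24 | 12 | 14 | 6 | |
| Positive axillary nodes | | | | | .97 |
| 0 | 15 | 2 | 9 | 1 | |
| 1-3 | 26 | 10 | 14 | 3 | |
| ≥ 4 | 3 | 6 | 0 | 5 | |
| Estrogen receptor status | | | | | .21 |
| Negative | 10 | 10 | 4 | 5 | |
| Positive | 34 | 8 | 19 | 4 | |
| Lymphovascular invasion | | | | | .07 |
| Absent/focal | 34 | 9 | 12 | 4 | |
| Prominent | 10 | 9 | 11 | 5 | |
| Nuclear grade | | | | | .28 |
| 1 | 6 | 3 | 3 | 1 | |
| 2 | 15 | 2 | 13 | 1 | |
| 3 | 23 | 13 | 7 | 7 | |

Abbreviations: LR, locoregional; LRR, LR recurrence.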

based on the model-based probability at that subset. Finally, overall predictions were based on averaging across the collection of candidate tree models. This aggregation relied on a weight for each classification tree that was a posterior probability based on the fit of the tree to the data compared with all other trees. The averaging was critical in delivering more robust and reliable predictions, and properly accounting for modeling uncertainty, compared with approaches that would just select one tree model.

The 62 training samples described above were used at this stage for classification tree generation and model building. Leave-one-out cross validation was performed to assess the robustness of the model-building process. After significant metagenes were identified by these methods, we examined them by unsupervised two-dimensional cluster analysis of the 62 samples and calculated the correlation coefficient of the expression of each gene with LR recurrence. We selected significant genes (Pearson correlation coefficient < −0.3 or > 0.3) and clustered them again for final classification tree generation and model building.

Subsequently, we used the validation data set (n = 32) to examine the prediction tree models independently. On the basis of the models, each individual in the validation set had her own probability of recurrence-free status (predictive index) and corresponding prediction uncertainty.
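The averaging step, in which each tree's prediction is weighted by that tree's posterior probability, can be sketched in a few lines. The per-tree predictions and weights below are invented for illustration, and `average_over_trees` is a hypothetical helper rather than a function from the authors' MATLAB code.

```python
def average_over_trees(tree_predictions, tree_weights):
    """Weighted average of per-tree probabilities of LR control."""
    if len(tree_predictions) != len(tree_weights):
        raise ValueError("one weight is needed per tree")
    total_weight = sum(tree_weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted_sum = sum(w * p for w, p in zip(tree_weights, tree_predictions))
    return weighted_sum / total_weight


# Hypothetical values: three candidate trees and their posterior weights.
predictions = [0.95, 0.90, 0.60]
weights = [0.5, 0.3, 0.2]
predictive_index = average_over_trees(predictions, weights)
print(round(predictive_index, 3))
```

A single tree with a poor fit contributes little to the final index, which is the sense in which averaging accounts for model uncertainty instead of committing to one tree.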

RESULTS

Training Prediction Tree Model Using Multiple Metagene Signatures
We measured the association of metagenes in 62 patients in a forward-split process as implemented in traditional classification-tree approaches; several classification trees were then generated. The final tree models used seven metagenes with a total of 258 genes and eight classification trees to build the LR control predictions. The overall predictions were based on averaging prediction probability across the collection of candidate trees. To provide an initial indication of robustness and accuracy, we evaluated the predictive probability of LR control for individual patients of the training data set using leave-one-out cross validation; the tree model process was recomputed repeatedly, each time leaving out one sample and then predicting it based on the rest. Patients with good LR control generally had a high predictive index, and patients with LR recurrence had a low predictive index. Clinical decisions are intended to be philosophically more conservative and tend toward overtreating patients. On that basis, we chose a probability of 0.8 as the optimal cutoff value on the receiver operating characteristic curve. The overall accuracy of these predictions is 87%, with estimated sensitivity of 100% and specificity 69%. The 3-year LR control probability in patients with predictive index more than 0.8 and 0.8 or lower is 100% and 42% (P < .0001), respectively (Table 2).

Unsupervised two-dimensional cluster analysis of the 258 genes in 62 training samples is shown in Figure 2. This hierarchical clustering algorithm clustered 62 tumors on the basis of their similarities measured over the 258 significant genes. Similarly, the 258 genes were clustered on the basis of their similarities measured over the 62 tumors. Two distinct groups of tumors were the dominant features in this two-dimensional display, suggesting that patients with or without LR recurrence could be partitioned on the basis of 258-gene signatures. Subsequently, 34 genes were identified from the 258 genes using the Pearson correlation coefficient (< −0.3 or > 0.3). We clustered these 34 genes into six clusters.
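The correlation filter that reduces the 258 genes to 34 can be illustrated as below. This is a sketch under assumptions: the gene names and expression values are synthetic, recurrence is coded as 1 (recurrence) or 0 (no recurrence), and `select_genes` is a hypothetical helper that keeps genes whose Pearson correlation with recurrence is below −0.3 or above 0.3.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_genes(expression, recurrence, threshold=0.3):
    """Keep genes with |r| above the threshold against the recurrence indicator."""
    return [
        gene
        for gene, values in expression.items()
        if abs(pearson(values, recurrence)) > threshold
    ]


# Synthetic example: six tumors, three with LR recurrence (1) and three without (0).
recurrence = [1, 1, 1, 0, 0, 0]
expression = {
    "gene_up": [2.0, 1.8, 2.2, 0.9, 1.1, 1.0],
    "gene_down": [0.5, 0.7, 0.6, 1.9, 2.1, 2.0],
    "gene_flat": [1.0, 2.0, 1.0, 1.0, 2.0, 1.0],
}
selected = select_genes(expression, recurrence)
print(selected)
```

Both positively and negatively correlated genes pass the filter, while a gene whose expression is unrelated to recurrence is dropped.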
These gene clusters were then used again for classification-tree generation and model building. The 3-year LR control probability in patients with predictive index derived from the 34-gene prediction tree models more than 0.8 and 0.8 or lower is 100% and 32% (P < .0001), respectively (Table 2).

Validating the Prediction Tree Models
To properly assess out-of-sample predictive accuracy based on data in our cohort, we validated the 258- and 34-gene prediction tree models in the remaining 32 independent samples. Figure 3 demonstrates LR control probability in both models. Using the 258-gene model, the LR control probability at 3 years was 95% (95% CI, 85% to 100%) for patients with predictive index more than 0.8 and 46% (95% CI, 19% to 73%) for patients with predictive index 0.8 or lower (P = .006). Similarly, using

Table 2. Predictive Index of the Tree Models and Estimates of 3-Year LR Control Probability in Training Samples (n = 62)

| Predictive Index | No. of Patients | LR Control | LR Recurrence | 3-Year LR Control Probability | P |
|---|---|---|---|---|---|
| 258-gene prediction model | | | | | < .0001 |
| > 0.8 | 36 | 36 | 0 | 100% | |
| ≤ 0.8 | 26 | 8 | 18 | 42% | |
| 34-gene prediction model | | | | | < .0001 |
| > 0.8 | 40 | 40 | 0 | 100% | |
| ≤ 0.8 | 22 | 4 | 18 | 32% | |

Abbreviation: LR, locoregional.
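The counts in Table 2 are enough to recompute the usual classification summaries. The sketch below treats a predictive index of 0.8 or lower as a prediction of recurrence; the function name and the choice of summaries are illustrative, and the percentages are crude proportions from the table, not Kaplan-Meier estimates.

```python
def classification_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity, specificity, and positive predictive value."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
    }


# 258-gene model, training set (Table 2): index <= 0.8 counts as predicted recurrence.
# tp: recurrences flagged, fn: recurrences missed,
# tn: controlled and not flagged, fp: controlled but flagged.
m258 = classification_metrics(tp=18, fn=0, tn=36, fp=8)
# 34-gene model, training set (Table 2).
m34 = classification_metrics(tp=18, fn=0, tn=40, fp=4)

for name, metrics in (("258-gene", m258), ("34-gene", m34)):
    print(name, {key: round(value, 3) for key, value in metrics.items()})
```

For the 258-gene model these counts reproduce the 87% accuracy and 100% sensitivity quoted in the text. The quoted 69% coincides with 18 of 26 (the share of flagged patients who recurred), whereas 36 of 44 gives about 82%; this reading is an inference from the table counts and is not stated by the authors.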


Fig 2. Unsupervised two-dimensional cluster analysis of 258 genes in 62 patients revealed two distinct groups of tumors; their locoregional recurrence rates were 41.4% (12 of 29) compared with 18.2% (six of 33). Patients with locoregional recurrence were colored as blue in top dendrogram.

the 34-gene model, the 3-year LR control probability differed significantly between patients with predictive index more than 0.8 and those with predictive index 0.8 or lower: 91% (95% CI, 79% to 100%) v 40% (95% CI, 10% to 70%; P = .008). The overall accuracy of prediction in the validation data set by the 258-gene model was 75% (95% CI, 58% to 87%), with sensitivity of 78% (95% CI, 45% to 94%) and specificity of 74% (95% CI, 54% to 87%). The overall accuracy of prediction in the validation data set by the 34-gene model was 78% (95% CI, 61% to 89%), with sensitivity of 67% (95% CI, 35% to 88%) and specificity of 83% (95% CI, 63% to 93%). Three patients with LR recurrence were not predicted correctly; their clinical characteristics are listed in Table 3.
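The confidence intervals quoted for accuracy, sensitivity, and specificity can be approximated with a Wilson score interval. The paper does not say which interval method was used, so this is an assumption; the counts (24 of 32, seven of nine, 17 of 23) are likewise inferred from the reported percentages and the validation set sizes.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (default: 95% level)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# 258-gene model, validation set; counts inferred from the reported percentages.
acc_lo, acc_hi = wilson_interval(24, 32)    # overall accuracy
sens_lo, sens_hi = wilson_interval(7, 9)    # sensitivity
spec_lo, spec_hi = wilson_interval(17, 23)  # specificity
print(round(acc_lo, 3), round(acc_hi, 3))
```

With only nine recurrences in the validation set, the interval for sensitivity is wide, which is worth keeping in mind when comparing the 258-gene and 34-gene models.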

Partitioning 94 Patients by Predictive Index
According to the prediction models derived from 34 genes, the 3-year LR control rates of patients with predictive index more than 0.8 and patients with predictive index 0.8 or lower were statistically different, regardless of whether they were node negative or node positive (all P < .05; Table 4). The predictive index could partition patients with three or fewer positive nodes for planning PMRT. For these patients, a predictive index of 0.8 or lower indicates a high risk of LR recurrence: approximately two thirds (14 of 22) did recur.

Cox Proportional Hazards Model in All 94 Patients
Subsequently, we examined whether the predictive index derived from the 34-gene model is an independent prognostic factor.


Fig 3. Kaplan-Meier survival estimates for locoregional control in validation data set by (A) 258-gene and (B) 34-gene prediction tree models. Blue survival curves are patients with the predictive index more than 0.8; green curves are patients with the predictive index 0.8 or lower. The differences between these two subgroups from both prediction models are statistically significant.

Traditional proportional hazards analysis has established and quantified the prognostic relevance of clinical factors including the extent of lymph node metastasis and estrogen receptor status with respect to LR control. We analyzed the full 94 patient samples via proportional hazards modeling, including these clinical factors together with the 34-gene-based predictive index. This analysis confirms the significance of the genomic predictor as associated with LR control in the context of these traditional clinical variables. With the incorporation of the prediction tree models in the proportional hazards analysis, the hazard ratio for LR recurrence is 22 (95% CI, 6.0 to 81) in patients with predictive index 0.8 or lower (Table 5).

Key Genes Related to LR Control
Table 6 gives details of the 34 most significant genes; seven of the 34 genes were of unknown function. Several pathways or biochemical activities were identified from the remaining 27 genes that were well represented, such as cell death, cell cycle and proliferation, DNA replication and repair, and immune response. These genes are involved in oncogenic processes (BLM, TCF3, RCHY1, PTI1),19,20 proliferation (TPX2), cell cycle regulation (CCNB1, GPS2, FYN),21-23 cell-cell interaction (CMAH),24 cell morphology (CLCA2), and immune response (CCR1).25

DISCUSSION

The present study demonstrates the capacity of clusters of gene expression profiles to refine patient risk stratification and to define subgroups of greater homogeneity according to lymph node status (Tables 2 and 4), which will aid in identifying patients most likely to benefit from PMRT. The gene expression-based classification tree models accurately predict and distinguish patients according to risk; moreover, the models provide individualized risk estimates. For node-negative patients and patients with one to three positive nodes, the LR control rate is 97% (56 of 58) when the predictive index is more than 0.8 and drops to only 36% (eight of 22) when the predictive index is 0.8 or lower (Table 4).

Three patients with LR recurrence were predicted incorrectly (Table 3). The first woman was 69 years old; she did not seek medical attention for at least 6 months and had a primary tumor size of 3.5 cm and 24 axillary lymph node metastases. She refused adjuvant treatments and developed LR recurrence 7 months after surgery. We do not know whether her genomics are intrinsically of low risk or whether the delayed diagnosis and treatments are the main reasons for recurrence. In clinical practice, she would have undergone PMRT no matter the result of genomic prediction. The third patient developed LR recurrence more than 8 years after mastectomy; her genomic prediction was of low risk. The second patient is a more clear-cut prediction failure by our 34-gene model; she was only 35 years old.

Gene expression profiles that predict distant metastases in breast cancer have been reported previously.11,16,26 The current study focuses on gene expression profiles that predict the risk of LR recurrence. The prediction of both types of recurrence is crucial in clinical practice because different treatment strategies are required.

Our work here identifies seven metagene clusters that include 258 genes associated with tumor-specific immune response, proliferation, apoptosis, cell communication, and so on. These observations are concordant with findings related to LR recurrence in head and neck cancer.27 Using correlation coefficient analysis, we have further identified the 34 most significant genes that are associated with LR recurrence; the accuracy of prediction for LR recurrence is similar to that predicted by

Table 3. Patients With LR Recurrence and Prediction Failure

| Patient No. | Age (years) | Tumor Size (cm) | LN+ | ER | LVI | Predictive Index by 34 Genes | Recurrent Interval |
|---|---|---|---|---|---|---|---|
| 1 | 69 | 3.5 | 24/25 | Positive | Prominent | 0.88 | 7 months |
| 2 | 35 | 1.5 | 0/18 | Negative | Focal | 0.98 | 6 months |
| 3 | 41 | 3.5 | 1/23 | Positive | Prominent | 0.98 | 8 years, 1 month |

Abbreviations: LR, locoregional; LN+, patients with positive lymph nodes; ER, estrogen-receptor status; LVI, lymphovascular invasion.


Table 4. Predictive Index by 34-Gene Prediction Models Partitioning Patients Into Different Risk Subgroups According to Lymph Node Status

| Predictive Index | No. of Patients | No. of LR Recurrences | 3-Year LR Control Probability | 3-Year Recurrence-Free Probability | 3-Year Overall Survival Probability |
|---|---|---|---|---|---|
| Node-negative | | | | | |
| > 0.8 | 24 | 1 | 96% | 79% | 100% |
| ≤ 0.8 | 3 | 2 | 33% | 31% | 67% |
| P | | | .027 | .34 | .32 |
| 1-3 positive nodes | | | | | |
| > 0.8 | 34 | 1 | 100% | 90% | 93% |
| ≤ 0.8 | 19 | 12 | 47% | 37% | 79% |
| P | | | < .0001 | < .0001 | .029 |
| ≥ 4 positive nodes | | | | | |
| > 0.8 | 4 | 1 | 75% | 37% | 65% |
| ≤ 0.8 | 10 | 10 | 0% | 23% | 33% |
| P | | | .017 | .043 | .19 |

Abbreviation: LR, locoregional.
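The pooled figures quoted in the Discussion for patients with three or fewer positive nodes (56 of 58 and eight of 22) follow directly from the patient and recurrence counts in Table 4. The sketch below shows that arithmetic; the dictionary layout and helper name are illustrative, and the results are crude proportions over the whole follow-up, not the 3-year Kaplan-Meier estimates in the table.

```python
# (node group, predictive index band) -> (number of patients, number of LR recurrences)
table4_counts = {
    ("node-negative", ">0.8"): (24, 1),
    ("node-negative", "<=0.8"): (3, 2),
    ("1-3 positive", ">0.8"): (34, 1),
    ("1-3 positive", "<=0.8"): (19, 12),
    (">=4 positive", ">0.8"): (4, 1),
    (">=4 positive", "<=0.8"): (10, 10),
}


def pooled_control(counts, node_groups, index_band):
    """Pool patients across node groups; return (controlled, total)."""
    total = sum(counts[(group, index_band)][0] for group in node_groups)
    recurrences = sum(counts[(group, index_band)][1] for group in node_groups)
    return total - recurrences, total


low_node_groups = ["node-negative", "1-3 positive"]
high_index = pooled_control(table4_counts, low_node_groups, ">0.8")
low_index = pooled_control(table4_counts, low_node_groups, "<=0.8")
print(high_index, low_index)
```

The same counts also confirm that the six subgroups add up to the 94 patients and 27 recurrences of the full cohort.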

the 258 genes. This phenomenon has been observed by others using the 70-gene predictor for distant metastasis.14

The aim of this study is to explore whether gene expression profiles could aid in improving stratification of clinical low-risk patients into more homogeneous subgroups, especially in patients with one to three positive nodes. Optimal sensitivity and specificity of the prediction model are desirable in order to avoid subjecting truly high-risk patients to suboptimal treatment and truly low-risk patients to overtreatment. Achieving such a goal appears to be possible. The current prediction models derived from 34 genes classify more patients into the high-risk group if they have more axillary lymph node involvement (Table 4).

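Table 5. Cox Proportional Hazards Model in All Patients (N = 94)

| Variable | Hazard Ratio | 95% CI | P |
|---|---|---|---|
| Analysis without prediction model | | | |
| Estrogen receptor: Positive | 1.0 | | |
| Estrogen receptor: Negative | 2.5 | 1.1 to 5.7 | .035 |
| Lymph nodes, No.: Negative | 1.0 | | |
| Lymph nodes, No.: 1-3 | 1.8 | 0.5 to 6.3 | .39 |
| Lymph nodes, No.: ≥ 4 | 10.5 | 2.7 to 41 | .0007 |
| Analysis with prediction model (34 genes) | | | |
| Estrogen receptor: Positive | 1.0 | | |
| Estrogen receptor: Negative | 3.4 | 1.1 to 11 | .04 |
| Lymph nodes, No.: Negative | 1.0 | | |
| Lymph nodes, No.: 1-3 | 0.8 | 0.2 to 3.1 | .75 |
| Lymph nodes, No.: ≥ 4 | 2.3 | 0.4 to 13.2 | .33 |
| Predictive index: > 0.8 | 1.0 | | |
| Predictive index: ≤ 0.8 | 22 | 6.0 to 81 | < .0001 |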
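A hazard ratio and its confidence interval in Table 5 can be related back to the underlying Cox coefficient. The sketch below assumes a Wald-type interval that is symmetric on the log scale, which the paper does not state; `wald_from_ci` is a hypothetical helper, and because the published values are rounded the recovered P values are only approximate.

```python
import math


def wald_from_ci(hazard_ratio, ci_low, ci_high, z_crit=1.96):
    """Recover the standard error, z statistic, and two-sided P from a HR and 95% CI."""
    log_hr = math.log(hazard_ratio)
    # On the log scale the interval is log_hr +/- z_crit * se.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_hr / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail area
    return se, z, p_value


# Predictive index <= 0.8 v > 0.8, analysis with the 34-gene prediction model.
se, z, p_value = wald_from_ci(22, 6.0, 81)
print(round(se, 3), round(z, 2), p_value)
```

The geometric mean of the interval limits (the square root of 6.0 × 81) is close to 22, which is consistent with the symmetric-on-log-scale assumption, and the recovered P value is well below .0001 as reported.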

In clinical practice, decisions about assigning breast cancer patients with one to three involved axillary nodes to PMRT remain controversial. Recently, a phase III study with a follow-up of 20 years demonstrated that PMRT could reduce not only LR recurrence, but also distant metastasis. Patients with the unfortunate event of LR recurrence usually had associated secondary microscopic distant metastases.28 Salvage treatments could not cure these patients. Therefore, identifying high-risk patients for prevention of LR recurrence after mastectomy becomes essential. The present study suggests that node-negative patients and those with one to three positive nodes can be sorted into more homogeneous subgroups by the novel gene expression-based prediction models (Table 4). Node-negative patients with a predictive index more than 0.8 and 0.8 or lower have a 3-year LR control rate of 96% and 33%, respectively (P = .027). Similarly, patients with one to three positive nodes and a predictive index more than 0.8 and 0.8 or lower have a 3-year LR control rate of 100% and 47%, respectively (P < .0001).

Although the overall accuracy, sensitivity, and specificity are encouraging, further validation is clearly needed. Nevertheless, the consistency of performance of the predictive model in both the test set and the cross-validated training set is very positive. This suggests that the prediction model has the potential to assist clinicians in making decisions for patients who have a substantial risk of LR recurrence. This model should be further refined by enrolling larger numbers of patients and by testing it for the impact of PMRT on recurrence and survival.

In summary, the predictive index derived from the gene expression-based prediction tree model is an independent factor that is significantly associated with LR recurrence in breast cancer after mastectomy. Gene expression profiles are capable of improving the partitioning of node-positive patients into more homogeneous subgroups. A larger validation study in these patients is warranted to confirm the broader value of this genomic predictor and its value in improving health care via more individualized prediction of treatment outcomes.


Table 6. Function of 34-Gene Predictors Associated With Locoregional Control Accession No.

Symbol

U30255

PGD

Phosphogluconate dehydrogenase

L29217 M31523

CLK3 TCF3

CDC-like kinase Transcription factor 3 (E2A immunoglobulin enhancer binding factors E12/E47)

AI051683 AB018551 U39817

C16orf7 BLM

Chromosome 16 open reading frame 7 Bloom syndrome

AL050144

RCHY1

Ring finger and CHY zinc finger domain containing 1

W26805 L41498

PTI1

Homo sapiens elongation factor 1-alpha 1

AB026833

CLCA2

Chloride channel, calcium activated, family member 2

AB014557 AB024704

OBSL1 TPX2

Obscurin-like 1 TPX2, microtubule associated

X83877

TRPV6

Transient receptor potential cation channel, subfamily V, member 6

D86324

CMAH

D43948 U90426

CKAP5 DDX39

Cytidine monophosphate-N-acetylneuraminic acid hydroxylase (CMP-Nacetylneuraminate monooxygenase) Cytoskeleton associated protein 5 DEAD (Asp-Glu-Ala-Asp) box polypeptide 39

AI138834 U20240

CEBPG

AI356682 AB023137 D10925

CCR1

PALM2-AKAP2 protein Chemokine (C-C motif) receptor 1

AI553878 AA152202 M14333

FYN

FYN oncogene related to SRC, FGR, YES

AF053306

BUB1

BUB1 budding uninhibited by benzimidazoles 1 homolog beta (yeast)

X71874

PSMB10

X99906

ENSA

Proteasome (prosome, macropain) subunit, beta type, 10 Endosulfine alpha

X57346

YWHAB

M74558 AI052224 U28963

STIL

Tyrosine 3-monooxygenase/tryptophan 5monooxygenase activation protein, beta polypeptide SCL/TAL1 interrupting locus

GPS2

G protein pathway suppressor 2

D86331

MMP15

Matrix metallopeptidase 15

M25753 M16750

CCNB1 PIM1

Cyclin B1 pim-1 oncogene

Gene Name

CCAAT/enhancer binding protein (C/EBP), gamma

Function

Position

6-phosphogluconate dehydrogenase is the second dehydrogenase in the pentose phosphate shunt
Serine/threonine protein kinases
Perhaps all t(1;19)(q23;p13) chromosomal translocations; the most frequent cytogenetic change in acute lymphoblastic leukemia; contain rearrangements of the E2A gene
Unknown
Adenosine triphosphate synthase; subunit b-like
Autosomal recessive disorder characterized by proportionate pre- and postnatal growth deficiency; sun-sensitive, telangiectatic, hypo- and hyperpigmented skin; predisposition to malignancy and chromosomal instability
Oncogenic because loss of p53 function contributes directly to malignant tumor development
Unknown
A class of oncogenes that could affect protein translation and contribute to carcinoma development in human prostate and other tissues
Plays a role in the complex pathogenesis of cystic fibrosis; may also serve as adhesion molecule for lung metastatic cancer cells, mediating vascular arrest and colonization
Unknown
Nuclear proliferation-associated protein whose expression is restricted to cell cycle phases S, G2, and M
Calcium-permeable channels, such as TRPV6, participate in neurotransmission, muscle contraction, and exocytosis by providing calcium as an intracellular second messenger
Sialic acids are terminal components of the carbohydrate chains of glycoconjugates involved in ligand-receptor, cell-cell, and cell-pathogen interactions
Colonic and hepatic tumor overexpressed protein
Involved in embryogenesis, spermatogenesis, and cellular growth and division
Unknown
The C/EBP family of transcription factors regulates viral and cellular CCAAT/enhancer element-mediated transcription
Unknown
Protein coding
A member of the beta chemokine receptor family, critical for the recruitment of effector immune cells to the site of inflammation
Unknown
Unknown
Protein-tyrosine kinase oncogene family. It encodes a membrane-associated tyrosine kinase that has been implicated in the control of cell growth.
A kinase involved in spindle checkpoint function; impaired spindle checkpoint function has been found in many forms of cancer
A multicatalytic proteinase complex

1p36.3-p36.13

A highly conserved cAMP-regulated phosphoprotein (ARPP) family; candidate gene for type 2 diabetes
Mediate signal transduction by binding to phosphoserine-containing proteins, linking mitogenic signaling and the cell cycle machinery
Increased mitotic activity in tumor cells
Unknown
Involved in G protein-MAPK signaling cascades; when overexpressed in mammalian cells, this gene could potently suppress a RAS- and MAPK-mediated signal and interfere with JNK activity, suggesting that the function of this gene may be signal repression
Involved in the breakdown of extracellular matrix in normal physiological processes, such as embryonic development, reproduction, and tissue remodeling, as well as in disease processes, such as arthritis and metastasis
A regulatory protein involved in mitosis
The proto-oncogene PIM1 encodes a protein kinase upregulated in prostate cancer

15q24 19p13.3

16q24 15q26.1

4q21.1

1p31-p22

2q35 20q11.2 7q33-q34

6p22-p23

11p11.2 19p13.12

19q13.11

9q31-q33 3p21

6q21

15q15

16q22.1 1q21.2 20q13.1

1p32 17p13

16q13-q21

5q12 6p21.2

Abbreviation: MAPK, mitogen-activated protein kinase.


Information downloaded from www.jco.org and provided by KOO FOUNDATION on September 28, 2006 from 61.218.51.18. Copyright © 2006 by the American Society of Clinical Oncology. All rights reserved.


REFERENCES
1. Jones C, Ford E, Gillett C, et al: Molecular cytogenetic identification of subgroups of grade III invasive ductal breast carcinomas with different clinical outcomes. Clin Cancer Res 10:5988-5997, 2004
2. Bange J, Zwick E, Ullrich A: Molecular targets for breast cancer therapy and prevention. Nat Med 7:548-552, 2001
3. Recht A, Edge SB, Solin LJ, et al: Postmastectomy radiotherapy: Clinical practice guidelines of the American Society of Clinical Oncology. J Clin Oncol 19:1539-1569, 2001
4. Whelan T, Levine M: More evidence that locoregional radiation therapy improves survival: What should we do? J Natl Cancer Inst 97:82-84, 2005
5. Overgaard M, Hansen PS, Overgaard J, et al: Postoperative radiotherapy in high-risk premenopausal women with breast cancer who receive adjuvant chemotherapy: Danish Breast Cancer Cooperative Group 82b trial. N Engl J Med 337:949-955, 1997
6. Overgaard M, Jensen MB, Overgaard J, et al: Postoperative radiotherapy in high-risk postmenopausal breast-cancer patients given adjuvant tamoxifen: Danish Breast Cancer Cooperative Group DBCG 82c randomised trial. Lancet 353:1641-1648, 1999
7. Ragaz J, Jackson SM, Le N, et al: Adjuvant radiotherapy and chemotherapy in node-positive premenopausal women with breast cancer. N Engl J Med 337:956-962, 1997
8. Recht A, Gray R, Davidson NE, et al: Locoregional failure 10 years after mastectomy and adjuvant chemotherapy with or without tamoxifen without irradiation: Experience of the Eastern Cooperative Oncology Group. J Clin Oncol 17:1689-1700, 1999
9. Nevins JR, Huang ES, Dressman H, et al: Towards integrated clinico-genomic models for personalized medicine: Combining gene expression signatures and clinical factors in breast cancer outcomes prediction. Hum Mol Genet 12:R153-R157, 2003
10. Perou CM, Sorlie T, Eisen MB, et al: Molecular portraits of human breast tumours. Nature 406:747-752, 2000
11. van’t Veer LJ, Dai H, van de Vijver MJ, et al: Gene expression profiling predicts clinical outcome of breast cancer. Nature 415:530-536, 2002
12. Huang E, Cheng SH, Dressman H, et al: Gene expression predictors of breast cancer outcomes. Lancet 361:1590-1596, 2003
13. Chang HY, Nuyten DS, Sneddon JB, et al: Robustness, scalability, and integration of a wound-response gene expression signature in predicting breast cancer survival. Proc Natl Acad Sci U S A 102:3738-3743, 2005
14. Ein-Dor L, Kela I, Getz G, et al: Outcome signature genes in breast cancer: Is there a unique set? Bioinformatics 21:171-178, 2005
15. Pittman J, Huang E, Dressman H, et al: Integrated modeling of clinical and gene expression information for personalized prediction of disease outcomes. Proc Natl Acad Sci U S A 101:8431-8436, 2004
16. van de Vijver MJ, He YD, van’t Veer LJ, et al: A gene-expression signature as a predictor of survival in breast cancer. N Engl J Med 347:1999-2009, 2002
17. Kass R, Raftery A: Bayes factors. J Am Stat Assoc 90:773-795, 1995
18. Pittman J, Huang E, Nevins J, et al: Bayesian analysis of binary prediction tree models for retrospectively sampled outcomes. Biostatistics 5:587-601, 2004
19. Ellis NA, Groden J, Ye TZ, et al: The Bloom’s syndrome gene product is homologous to RecQ helicases. Cell 83:655-666, 1995
20. Kamps MP, Murre C, Sun XH, et al: A new homeobox gene contributes the DNA binding domain of the t(1;19) translocation protein in pre-B ALL. Cell 60:547-555, 1990
21. Pines J, Hunter T: Isolation of a human cyclin cDNA: Evidence for cyclin mRNA and protein regulation in the cell cycle and for interaction with p34cdc2. Cell 58:833-846, 1989
22. Jin DY, Teramoto H, Giam CZ, et al: A human suppressor of c-Jun N-terminal kinase 1 activation by tumor necrosis factor alpha. J Biol Chem 272:25816-25823, 1997
23. Semba K, Nishizawa M, Miyajima N, et al: Yes-related protooncogene, syn, belongs to the protein-tyrosine kinase family. Proc Natl Acad Sci U S A 83:5459-5463, 1986
24. Irie A, Koyama S, Kozutsumi Y, et al: The molecular basis for the absence of N-glycolylneuraminic acid in humans. J Biol Chem 273:15866-15871, 1998
25. Nomura H, Nielsen BW, Matsushima K: Molecular cloning of cDNAs encoding a LD78 receptor and putative leukocyte chemotactic peptide receptors. Int Immunol 5:1239-1249, 1993
26. Glinsky GV, Higashiyama T, Glinskii AB: Classification of human breast cancer using gene expression profiling as a component of the survival predictor algorithm. Clin Cancer Res 10:2272-2283, 2004
27. Ginos MA, Page GP, Michalowicz BS, et al: Identification of a gene expression signature associated with recurrent disease in squamous cell carcinoma of the head and neck. Cancer Res 64:55-63, 2004
28. Ragaz J, Olivotto IA, Spinelli JJ, et al: Locoregional radiation therapy in patients with high-risk breast cancer receiving adjuvant chemotherapy: 20-year results of the British Columbia randomized trial. J Natl Cancer Inst 97:116-126, 2005

Acknowledgment
We thank the members of Breast Cancer Team and staff in Clinical Protocol Office for patient care, data quality control, and outcome analysis.

Authors’ Disclosures of Potential Conflicts of Interest
Although all authors completed the disclosure declaration, the following authors or their immediate family members indicated a financial interest. No conflict exists for drugs or devices used in a study if they are not being evaluated as part of the investigation. For a detailed description of the disclosure categories, or for more information about ASCO’s conflict of interest policy, please refer to the Author Disclosure Declaration and the Disclosures of Potential Conflicts of Interest section in Information for Contributors.

Authors

Employment, Leadership, Consultant, Stock, Honoraria, Research Funds, Testimony, Other

Skye H. Cheng: Koo Foundation Sun Yat-Sen Cancer Center (B)
Mike West: Synpac (B)
Joseph R. Nevins: Synpac (B)
Andrew T. Huang: Synpac (A)

Dollar Amount Codes: (A) < $10,000; (B) $10,000-99,999; (C) ≥ $100,000; (N/R) Not Required


Author Contributions
Conception and design: Skye H. Cheng, Mike West, Joseph R. Nevins, Andrew T. Huang
Financial support: Andrew T. Huang
Administrative support: Skye H. Cheng, Mei-Hua Tsou, James J. Jian, Joseph R. Nevins, Andrew T. Huang
Provision of study materials or patients: Skye H. Cheng, Chii-Ming Chen, Stella Y. Tsai, James J. Jian, Mei-Chin Liu
Collection and assembly of data: Skye H. Cheng, Cheng-Fang Horng, Erich Huang, Mei-Hua Tsou, Holly Dressman, Joseph R. Nevins
Data analysis and interpretation: Skye H. Cheng, Cheng-Fang Horng, Mike West, Jennifer Pittman
Manuscript writing: Skye H. Cheng, Mike West, Erich Huang, Jennifer Pittman, Joseph R. Nevins, Andrew T. Huang
Final approval of manuscript: Skye H. Cheng, Cheng-Fang Horng, Mike West, Erich Huang, Jennifer Pittman, Mei-Hua Tsou, Holly Dressman, Chii-Ming Chen, Stella Y. Tsai, James J. Jian, Mei-Chin Liu, Joseph R. Nevins, Andrew T. Huang
