Resource

Phosphoproteomics of Primary Cells Reveals Druggable Kinase Signatures in Ovarian Cancer

Authors Chiara Francavilla, Michela Lupia, Kalliopi Tsafou, ..., Lars J. Jensen, Ugo Cavallaro, Jesper V. Olsen

Correspondence [email protected] (C.F.), [email protected] (U.C.), [email protected] (J.V.O.)

In Brief Francavilla et al. use mass-spectrometry-based phosphoproteomics as a powerful tool to reveal cancer signatures. They analyze changes in the proteome and phosphoproteome of primary cells derived from epithelial ovarian cancer (EOC) compared to healthy tissues and reveal a role for the kinase CDK7 in EOC cell proliferation.

Highlights

- We analyze ex-vivo-cultured primary cells using phosphoproteomics
- We investigate epithelial ovarian cancer (EOC) and healthy tissue
- We uncover expression of cancer-specific proteins and kinase signatures
- The kinase CDK7 phosphorylates POLR2A and regulates EOC cell proliferation

Francavilla et al., 2017, Cell Reports 18, 3242–3256, March 28, 2017. © 2017 The Author(s). http://dx.doi.org/10.1016/j.celrep.2017.03.015

Accession Numbers PXD003531

Cell Reports

Resource

Phosphoproteomics of Primary Cells Reveals Druggable Kinase Signatures in Ovarian Cancer

Chiara Francavilla,1,6,7,* Michela Lupia,2,6 Kalliopi Tsafou,3,6,8 Alessandra Villa,2,9 Katarzyna Kowalczyk,5 Rosa Rakownikow Jersie-Christensen,1 Giovanni Bertalot,4 Stefano Confalonieri,4 Søren Brunak,3 Lars J. Jensen,3 Ugo Cavallaro,2,* and Jesper V. Olsen1,10,*

1Proteomics Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen, Denmark
2Unit of Gynecological Oncology Research, Program of Gynecological Oncology, European Institute of Oncology, Via Ripamonti 435, 20141 Milan, Italy
3Disease Systems Biology Program, Novo Nordisk Foundation Center for Protein Research, Faculty of Health and Medical Sciences, University of Copenhagen, Blegdamsvej 3B, 2200 Copenhagen, Denmark
4Program of Molecular Medicine, European Institute of Oncology, Via Ripamonti 435, 20141 Milan, Italy
5Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, the University of Manchester, Manchester M13 9PL, UK
6Co-first author
7Present address: Division of Molecular and Cellular Functions, School of Biological Sciences, Faculty of Biology, Medicine and Health, the University of Manchester, Manchester M13 9PL, UK
8Present address: Department of Oncology, Lombardi Comprehensive Cancer Center, Georgetown University, Washington, DC 20057, USA
9Present address: Philochem AG, Otelfingen, Switzerland
10Lead Contact
*Correspondence: [email protected] (C.F.), [email protected] (U.C.), [email protected] (J.V.O.)
http://dx.doi.org/10.1016/j.celrep.2017.03.015

SUMMARY

Our understanding of the molecular determinants of cancer is still inadequate because of cancer heterogeneity. Here, using epithelial ovarian cancer (EOC) as a model system, we analyzed a minute amount of patient-derived epithelial cells from either healthy or cancerous tissues by single-shot mass-spectrometry-based phosphoproteomics. Using a multi-disciplinary approach, we demonstrated that primary cells recapitulate tissue complexity and represent a valuable source of differentially expressed proteins and phosphorylation sites that discriminate cancer from healthy cells. Furthermore, we uncovered kinase signatures associated with EOC. In particular, CDK7 targets were characterized in both EOC primary cells and ovarian cancer cell lines. We showed that CDK7 controls cell proliferation and that pharmacological inhibition of CDK7 selectively represses EOC cell proliferation. Our approach defines the molecular landscape of EOC, paving the way for efficient therapeutic approaches for patients. Finally, we highlight the potential of phosphoproteomics to identify clinically relevant and druggable pathways in cancer.

INTRODUCTION

The characterization of molecular determinants of cancer has advanced tremendously in the past decades, mainly due to advancements in high-throughput technologies. Deep sequencing approaches at the gene expression level can now be complemented by proteomics. Mass spectrometry (MS)-based proteomics has undergone enormous improvements in the past few years because of more accurate instrumentation and better methods for sample preparation and quantitation (Aebersold and Mann, 2016). Proteomics has enabled the analysis of the expressed protein complement of cells and entire tissues, but it can also analyze post-translationally modified proteins (i.e., phosphorylated proteins [phosphoproteomics]) (Beck et al., 2011; Kim et al., 2014; Lundby et al., 2013; von Stechow et al., 2015; Wilhelm et al., 2014). As proteins and especially phosphoproteins regulate the functional properties of cells (e.g., a phosphorylation site can reflect the activity state of a protein), phosphoproteomics has been employed for the identification of potential pharmaceutical targets (Dias et al., 2015). Furthermore, the implementation of proteomics in combination with genomics to study cancer is now emerging (Mertins et al., 2016; Zhang et al., 2016). To identify tumor-associated signatures, onco-proteomics studies have investigated cell lines, mouse xenografts, or entire biopsies (Geiger et al., 2012; Guo et al., 2015; Ntai et al., 2016; Pozniak et al., 2016; Zhang et al., 2016). However, cell lines recapitulate the biology of neoplastic cells within an actual tumor only to a certain extent, and other cell types may contaminate mouse xenografts or entire tissues. In the latter case, discriminating what is tumor specific from the contribution of cells of the tumor microenvironment may be difficult. To overcome this issue, one possibility is to analyze a purer cancer population (i.e., primary cancer cells), as demonstrated for T cells (Mitchell et al., 2015) or endothelial cells (van den Biggelaar et al., 2014). System-wide phosphoproteomics studies of patient-derived primary cells are, however, underrepresented.

Cell Reports 18, 3242–3256, March 28, 2017 © 2017 The Author(s). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).


Here, we compared the phosphoproteome of human primary epithelial cells derived from either neoplastic or healthy tissue, thus focusing specifically on pure cell populations of different origin. Besides overcoming the possible contamination of tumor cells with other cell types, our strategy based on primary cells can add new information to the recently published Clinical Proteomic Tumor Analysis Consortium (CPTAC) database, where the molecular profile of the tumors was not compared with their normal counterpart (Zhang et al., 2016). We selected epithelial ovarian cancer (EOC) because it is the most lethal gynecological tumor in developed countries. EOC is a heterogeneous disease of which high-grade serous EOC is the most common and lethal form (Gurung et al., 2013; Kurman and Shih, 2016). Besides mutations in tumor suppressor genes, few recurrent somatic mutations have been associated with EOC (Cancer Genome Atlas Research Network, 2011; Jones and Drapkin, 2013; Kurman and Shih, 2016). Furthermore, both the ovarian surface epithelium (OSE) and the distal fallopian tube epithelium (FTE) can give rise to high-grade serous EOC (Bowtell et al., 2015). Therefore, defining the molecular landscape of high-grade serous EOC and healthy ovarian tissues is challenging. Primary ex vivo cultures of human FTE have been shown to be a reliable model for serous ovarian carcinogenesis (Levanon et al., 2010). Here, we performed quantitative phosphoproteomics of ex-vivo-cultured human ovarian epithelial cells and included cells derived either from OSE and FTE or from high-grade serous EOC to get a more accurate picture of ovarian cancer behavior. We developed a rapid and accurate method to analyze less than 1 mg protein in a 90-min run on last-generation mass spectrometers, the Q-Exactives (Kelstrup et al., 2012). Finally, we integrated quantitative phosphoproteomics and bioinformatics analyses with biochemical assays and immunohistochemistry (IHC) to validate our findings.
Our comprehensive phosphoproteomics analysis revealed differentially expressed proteins and activated proteins between healthy and pathological samples, thus providing the ovarian cancer community with a valuable resource to better understand the biology of such a complex disease. These results complement and extend the recently published study from the CPTAC investigators, where the proteogenomic analysis of more than 100 high-grade serous carcinomas has revealed novel pathways to stratify patients (Zhang et al., 2016). The main differences between that approach and ours include the fact that the CPTAC consortium did not analyze the normal counterparts of EOC (i.e., OSE and FTE), and we have used pure cultures of primary cells as opposed to whole tumor tissues. Moreover, we have used complementary technologies (i.e., iTRAQ versus label-free methods to quantify changes in protein abundance and post-translational modifications [PTMs]). In spite of these differences, we have added new information to the published CPTAC dataset, underscoring the importance of integrating multiple technologies and approaches to study cancer signatures. Our strategy also revealed novel kinase-mediated functional signatures in EOC. This may pave the way for innovative therapeutic approaches for EOC patients, given that only two drugs have been licensed for EOC treatment in the last five years (Symeonides and Gourley, 2015). To illustrate the power of our approach for quantifying changes in human primary cells and for identifying targetable kinases in cancer, we focused on the role of cyclin-dependent kinase 7 (CDK7), and implicated this signaling pathway in EOC cell proliferation.

RESULTS

Proteomics of Patient-Derived Cells Unveils Differentially Expressed Proteins in FTE, OSE, and EOC

Our first goal was to assess if the ex vivo culture of patient-derived epithelial cells would be a good model system for proteomics analysis of EOC cells and of their normal counterpart. To this end, we compared the molecular profile of high-grade serous EOC with those of non-neoplastic FTE and OSE. The proteome and phosphoproteome of patient-derived primary cells from FTE, OSE, and EOC were analyzed by high-resolution quantitative mass spectrometry (Figure 1A; Table S1). Morphological examination and immunofluorescence staining for specific markers confirmed the epithelial nature of primary cells as well as the absence of other contaminating cell types (Figure S1). We extracted 700 μg protein from each cell culture, of which 1% was used for proteome profiling and the rest for enrichment of phosphorylated peptides by TiO2-based chromatography (Figure 1A, bottom). We quantified 5,561 proteins and 7,658 distinct phosphorylation sites using single-run liquid chromatography-tandem mass spectrometry (LC-MS/MS) measurements from 13 independent patient samples (four OSE, four EOC, and five FTE) (Figure 1B; Tables S2 and S3).
The distribution of phosphorylated peptides and sites was in line with previous studies (Francavilla et al., 2013) (Figures S2A–S2D), and the coverage of protein and phosphorylated peptides among technical replicates (samples 1, 2, and 4) was high and reproducible (Figures S2E–S2G; Table S1). Unsupervised clustering separated both proteome (Figure S2H) and phosphoproteome (Figure 1C) according to the origin of primary cells. Moreover, the abundance of protein and phosphorylated peptides estimated by their mass spectrometry signal intensities was reproducible with Pearson correlation coefficients above 0.9 for samples of the same origin (Figures 1D and S2I). These findings together confirmed the high reproducibility of our samples. On the contrary, we found a poor correlation (R = 0.14–0.30) between the

Figure 1. Proteomics of Patient-Derived Epithelial Cells Reveals Good Coverage of Phosphorylated Sites (A) Workflow of the proteomics analysis of epithelial cells derived from FTE, OSE, and EOC (see also Figure S1, Table S1, and Supplemental Experimental Procedures). (B) Numbers of identified proteins (blue) and phosphorylated sites (pink) in the 13 samples analyzed in this study. Data are presented as mean ± SD. (C) Unsupervised clustering of the phosphoproteome dataset shows separation between tumor and healthy samples. (D) Heatmap of the Pearson’s correlation (R²) of the phosphoproteome data shows good overall reproducibility among samples deriving from the same tissue (highlighted by black boxes). Numbers and letters indicate individual samples or technical replicates, respectively, according to Table S1. See also Figures S1 and S2 and Tables S1, S2, and S3.


[Figure 2 panel labels recovered from the image. (A) 3D-PCA with axes PC 1, PC 2, and PC 3; samples grouped as OSE, FTE, and EOC. (B) Heatmap (log intensity, -4 to 4) of OSE, EOC, and FTE samples, with rows grouped into cluster 1 (up in EOC), cluster 2 (up in FTE and EOC), cluster 3 (up in FTE and OSE), and cluster 4 (up in OSE). Row labels in the order shown: PRSS8*, MUC16*, PTMA, TYMP, GBP1, NMI, MED17, PRKAG1, TNRC6B, GRIPAP1, HLA-B, PTGS1, ENO2, MICAL3, LYPLA2, NDRG1, CBFB, IMPDH1, PHGDH, UBXN7, SAMHD1, PTMS, KCTD12, H1F0, H1FX, CDKN2AIPNL, HMGN2, C16orf13, DIRAS3, RNF113A, DDHD2, MYO1D, DCBLD2, ZC3H18, MRPS27, LRRN4, VMP1, PIK3C2A, LSS, NCAPG, COL8A1, NrCAM, HMGB2, LRRC8C, DDB2, HIST1H1A, CCNH, RAB9A, RAB27B, ABAT. (C) Western blots of OSE, FTE, and EOC lysates for MUC16/CA125, vinculin, DCBLD2, PHGDH, and KCTD12, with molecular-weight markers in kDa. (D) PHGDH and KCTD12 staining (mean of the staining intensity) in OSE, FTE, serous adenocarcinoma, and metastasis.]

abundance of phosphorylated peptides and that of their corresponding protein (Figure S2J), suggesting that the analysis of both proteome and phosphoproteome is necessary to derive a tissue-specific signature. We first focused on the analysis of the proteome by performing 3D principal-component analysis (3D-PCA) of all the 5,561 identified proteins (Table S2; Figure 2A). The analysis clustered individual samples according to the tissue of origin and separated the proteome of EOCs from that of the two healthy tissues (Figure 2A). The LIMMA Bioconductor package, which accounts for variability among samples and among proteins of the same sample (Wettenhall and Smyth, 2004), identified 49 differentially expressed proteins among the three cell types (Table S2). Hierarchical clustering separated these proteins in four main clusters (Figure 2B; Table S2). Proteins that were specifically enriched in EOC cells were represented in cluster 1; those present in both FTE and EOC, but not OSE, were represented in cluster 2; proteins more abundant in non-neoplastic cells were represented in cluster 3; and those specific to OSE were represented in cluster 4 (Figure 2B; Table S2). To validate our proteomics approach, we selected proteins found in different clusters. Cluster 1 contained the known EOC marker MUC16 (or CA125) (Neunteufel and Breitenecker, 1989). The D-3-phosphoglycerate dehydrogenase PHGDH involved in serine biosynthesis (Luo, 2011) and the auxiliary subunit of GABA-B receptors KCTD12 (Cathomas et al., 2015) were chosen to represent cluster 2. Finally, the transmembrane neuropilin-like protein DCBLD2 (Kobuke et al., 2001) belonged to cluster 3. With the exception of MUC16, none of these proteins have previously been associated with EOC. Western blot (WB) analysis in independent primary cell cultures confirmed the mass spectrometry results for all the proteins (Figure 2C), indicating the reliability of our approach. 
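The four expression patterns described above can be illustrated with a minimal Python sketch. It is not the LIMMA analysis or the hierarchical clustering used in the study. The rule (a tissue counts as "up" when its mean log intensity lies above the midpoint between the lowest and highest tissue means), the function name `label_pattern`, and the example intensities are all assumptions made for illustration.

```python
from statistics import mean

# Cluster names follow the text; the mapping from "up" tissues to a
# cluster is an illustrative simplification, not the published method.
CLUSTER_BY_PATTERN = {
    frozenset({"EOC"}): "cluster 1: up in EOC",
    frozenset({"FTE", "EOC"}): "cluster 2: up in FTE and EOC",
    frozenset({"FTE", "OSE"}): "cluster 3: up in FTE and OSE",
    frozenset({"OSE"}): "cluster 4: up in OSE",
}


def label_pattern(log_intensity):
    """Return the cluster label for one protein.

    log_intensity maps a tissue name to the log intensities measured
    in the samples of that tissue.
    """
    tissue_means = {t: mean(v) for t, v in log_intensity.items()}
    # A tissue is called "up" when its mean exceeds the midpoint
    # between the lowest and the highest tissue mean.
    midpoint = (min(tissue_means.values()) + max(tissue_means.values())) / 2
    up = frozenset(t for t, m in tissue_means.items() if m > midpoint)
    return CLUSTER_BY_PATTERN.get(up, "unassigned")


# Hypothetical log intensities for a made-up protein (not study data).
example = {
    "OSE": [-2.1, -1.8, -2.4, -1.9],
    "FTE": [1.2, 0.9, 1.5, 1.1, 1.3],
    "EOC": [1.6, 1.4, 1.8, 1.2],
}
print(label_pattern(example))
```

A real analysis would rely on moderated statistics and multiple-testing correction before any pattern is assigned; the sketch only shows how the four cluster definitions relate to per-tissue abundance.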
The levels of PHGDH and KCTD12 were also evaluated on a larger cohort of samples by IHC on tissue microarrays (TMAs). This analysis revealed higher expression of both proteins in a significant proportion of high-grade serous EOC, metastasis, and healthy FTE compared to OSE (Figure 2D). These data suggested that our approach based on quantitative proteomics of ex-vivo-cultured primary epithelial cells derived from one specific cell type is well suited to capture the characteristics of the original tissue, thus offering a reliable model for further molecular analysis of serous ovarian carcinogenesis.

Phosphoproteomics of Patient-Derived Cells Identifies CDK7 and POLR2A in EOC

As less than 1% of the proteome was differentially regulated in the three cell types (49 out of 5,561 proteins), we hypothesized that specific biological differences among tissues reside within the signaling state of the proteomes, which can be probed by analyzing the phosphoproteome. We first performed 3D-PCA of all the 7,658 quantified phosphorylated sites (Table S3). The projections of the data on the plane of the first two and three principal components clustered individual samples according to the tissue of origin and clearly separated the phosphoproteome of EOCs from that of the two healthy tissues (Figure 3A), implying that different signaling pathways are activated in the three cell types. This idea was confirmed by KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway and GO (Gene Ontology) analyses. In particular, we observed a strong enrichment for spliceosome components as well as the overrepresentation of proteins involved in cytoskeletal rearrangement in EOC compared to healthy tissues (Figures 3B–3D). Next, we analyzed the differentially regulated phosphorylated sites among the three cell types with the LIMMA Bioconductor package (see above and Supplemental Experimental Procedures). We found 309 sites abundant in OSE and FTE, but not in EOC (cluster A); 35 sites present in both FTE and EOC, but not OSE (cluster B); and 448 highly abundant sites in EOC (cluster C) (Figures 4A and S3A; Table S3). In total, 10% of the phosphoproteome (792 out of 7,658) was found to be differentially regulated in the three cell types. This proportion of regulated phosphorylation sites was consistent with previous data from large-scale phosphoproteomics of human cancer cells (Olsen and Mann, 2013). Furthermore, 495 of the phosphorylated proteins whose sites were differentially regulated among EOC, OSE, and FTE have been previously identified in ovarian tumors (Zhang et al., 2016) (Figure S3B). As shown in Figures 3B–3D, GO term analysis revealed a remarkable enrichment for spliceosome components in EOC (cluster C), confirming the results of PCA.
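The proportions quoted above follow directly from the reported counts. The short Python check below uses only numbers stated in the text; the variable names are illustrative.

```python
# Counts quoted in the text.
quantified_proteins = 5561
regulated_proteins = 49
quantified_sites = 7658
cluster_sites = {"A": 309, "B": 35, "C": 448}

# Regulated sites are the sum of the three clusters.
regulated_sites = sum(cluster_sites.values())
protein_fraction = regulated_proteins / quantified_proteins
site_fraction = regulated_sites / quantified_sites

print(f"regulated sites: {regulated_sites}")
print(f"regulated proteome: {protein_fraction:.2%}")
print(f"regulated phosphoproteome: {site_fraction:.2%}")
```

The three clusters sum to the 792 regulated sites, which is about 10% of the quantified sites, whereas the 49 regulated proteins are under 1% of the quantified proteome.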
Cell cycle and chromosome organization were GO terms enriched in both FTE and EOC (cluster B) (Figure 4B). Finally, signaling, cytoskeletal organization, and focal adhesion were enriched terms in healthy cell types, both FTE and OSE, as compared to EOC (cluster A) (Figures 4B and S3C). These findings support the differential activation of specific signaling networks in each of the three cell types. The specificity of many protein kinases is encoded in consensus sequences surrounding the serine/threonine residues that are phosphorylated. Thus, to identify the protein kinases more active in tumors than in normal tissue, we performed a linear sequence motif analysis of the differentially regulated phosphorylation sites. This analysis revealed a strong bias toward arginine and serine in close proximity to the phosphorylated sites in cluster C compared to the sites in cluster A (Figure 4C), suggesting higher activation of basophilic kinases in tumor cells.
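As a rough illustration of what a linear sequence motif analysis looks at, the Python sketch below labels a phosphosite by the residues flanking it. It is a simplified stand-in, not the motif analysis used in the study: the two rules (proline at position +1 for proline-directed sites, arginine at position -3 for basophilic sites), the function name `classify_site`, and the sequence windows are assumptions chosen for illustration.

```python
from collections import Counter


def classify_site(window):
    """Classify a phosphosite from a window centred on the S/T residue.

    Illustrative rules only:
      proline-directed -> proline at position +1
      basophilic       -> arginine at position -3
    """
    if len(window) % 2 == 0 or len(window) < 7:
        raise ValueError("window must have odd length of at least 7")
    centre = len(window) // 2
    if window[centre] not in "ST":
        raise ValueError("central residue must be S or T")
    proline_directed = window[centre + 1] == "P"
    basophilic = window[centre - 3] == "R"
    if proline_directed and basophilic:
        return "both"
    if proline_directed:
        return "proline-directed"
    if basophilic:
        return "basophilic"
    return "other"


# Made-up 13-residue windows (positions -6 to +6), not sites from the dataset.
windows = ["GGGRGGSPGGGGG", "GGGGGGSPGGGGG", "GGGRGGSGGGGGG", "GGGGGGTGGGGGG"]
counts = Counter(classify_site(w) for w in windows)
print(counts)
```

Counting such labels per cluster gives the kind of tally that an enrichment test would then compare between clusters.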

Figure 2. Proteomic Analysis Discloses Differentially Expressed Proteins (A) 3D-PCA of all the proteins quantified in independent samples for each tissue. Missing values have not been considered. (B) Hierarchical clustering of the proteins differentially expressed in four independent samples derived from each of the three cell types. The identified clusters 1–4 are highlighted on the right and separated by a black line. Protein intensity is presented on the logarithmic scale with intensity below and above the mean color-coded in blue and red, respectively. The proteins highlighted in the different clusters were selected for further validation. (C) Lysates from three independent samples derived from each tissue (see Table S1) were subjected to WB analysis with the indicated antibodies. Vinculin was used as loading control. The experiment was repeated twice with analogous results. (D) The two histograms show the intensity of the staining for PHGDH (top) and KCTD12 (bottom) in normal OSE, FTE, serous adenocarcinomas, and metastasis (number of samples: 15 OSE, 18 fimbriae, 84 serous adenocarcinomas, and 34 metastases). Data are presented as mean ± SD. *p < 0.001 compared to OSE (Wilcoxon test). See also Table S2.


[Figure 3 panel labels recovered from the image. (A) 3D-PCA with axes PC 1, PC 2, and PC 3; samples grouped as OSE, FTE, and EOC. (B) Pathways enriched in tumors (axis: -log p value): spliceosome, integrin signalling, tight junction, focal adhesion, actin cytoskeleton, angiogenesis. (C) GO Biological Processes enriched in tumors (axis: -log p value): RNA splicing; mRNA processing; cytoskeleton organization; chromosome organization; chromatin organization; chromatin modification; actin cytoskeleton organization; regulation of cytoskeleton organization; actin filament-based process; regulation of protein complex disassembly; regulation of organelle organization; posttranscriptional regulation of gene expression; chromatin assembly or disassembly; apoptosis; establishment or maintenance of cell polarity; macromolecular complex subunit organization; cell cycle; M phase; protein localization. (D) Network of phosphorylated proteins and sites; legend: MCODE clustering: spliceosome; MCODE clustering: actin cytoskeleton; unclustered; known phosphosite; novel phosphosite; found in regulated proteome. Individual node and site labels are not recoverable from the extracted text.]

Figure 3. The Phosphoproteomes of EOC and Healthy Tissues Are Different (A) 3D-PCA of all the phosphorylated sites of independent samples for each tissue. Missing values have not been considered. (B and C) KEGG pathways analysis (B) and biological processes (GO terms BP) (C) enriched in EOC samples. The analysis included 1,026 sites. (D) Network of the phosphorylated sites of EOC samples belonging to the enriched pathways shown in (B), based on STRING and visualized in Cytoscape. The Cytoscape plug-in MCODE revealed enriched networks of phosphorylated proteins of the spliceosome and actin cytoskeleton. See also Table S3.


The overrepresentation in EOC of substrate motifs for the basophilic kinases protein kinase A (PKA), protein kinase B (PKB, or AKT), and protein kinase C (PKC) confirmed this hypothesis (Figure 4D). We then focused on spliceosome components that were highly enriched in EOC (Figure 4B) and analyzed the 100 phosphorylated sites on spliceosome proteins for kinase preferences. A stronger enrichment for MAPK/CDK substrate sites and a general overrepresentation of proline-directed sites (64 out of 100, compared with the 26 out of 100 sites enriched for basophilic kinase motifs) were specifically observed in this subset of cluster C (Figures 4E and 4F; Tables S4 and S5). We therefore wondered which kinases were activated in cancer tissues that could explain the enrichment for mitogen-activated protein kinase (MAPK)/cyclin-dependent kinase (CDK), but not basophilic kinase substrates, in this subgroup of cancer-specific proteins enriched for spliceosome components. We noticed that the proline-directed CDK7 was phosphorylated on a peptide covering the kinase domain activation loop in not only EOC but also FTE (Figures S3D and S3E; Tables S3 and S4). CDK7 therefore belonged to cluster B (Figures S3D and 4B). The fact that CDK7 regulates both cell-cycle-related events and polymerase II alpha (POLR2A)-mediated transcription (Fisher, 2005) may explain why cluster B was enriched in nuclear proteins and proteins with a role in the cell cycle (Figures S3D and 4B). Finally, although our analysis could not exclude a role for other kinases, the presence of activated CDK7 in EOC was consistent with the overrepresentation of proline-directed kinase substrates and with the functional category spliceosome among all the EOC sites (Figure 4). Of the 26 phosphorylated proteins belonging to the GO term spliceosome enriched in EOC (cluster C), 14 have previously been associated with spliceosome core machinery or splicing factors in cancer (Papasaikas et al., 2015) (Figure 4E).
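One way to ask whether 64 proline-directed sites among the 100 spliceosome sites is more than expected is a one-sided hypergeometric test against the rest of cluster C. The Python sketch below is illustrative and is not the test used in the study. The function name `hypergeom_upper_tail` is hypothetical, and the number of proline-directed sites among the remaining 348 cluster C sites (448 minus 100) is a placeholder, because the text does not report it.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) for n draws without replacement from N items, K of them marked."""
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return tail / total


spliceosome_sites = 100    # from the text
spliceosome_proline = 64   # from the text
cluster_c_sites = 448      # from the text
other_proline = 120        # PLACEHOLDER: not reported in the text

# All proline-directed sites in cluster C under the placeholder assumption.
marked = spliceosome_proline + other_proline
p_value = hypergeom_upper_tail(
    spliceosome_proline, spliceosome_sites, marked, cluster_c_sites
)
print(f"one-sided p = {p_value:.3g}")
```

The exact integer arithmetic of `math.comb` avoids overflow for counts of this size; with real background counts the same function would give the tail probability for the observed overlap.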
We also detected phosphorylated peptides covering the tandem seven-amino-acid C-terminal domain (CTD) repeats of POLR2A, whose phosphorylation is associated with the regulation of transcription and alternative splicing (Muñoz et al., 2010) (Figures 4E and S3F). Thus, we tested whether peptides in proteins with splicing variants were overrepresented in the EOC proteome compared to the proteome of healthy cells. Mapping the 74,566 unique peptide sequences identified in our dataset to the UniProt protein database identified more than 50% of the peptides matching to proteins with known alternative splice variants (Figure S3G). We found a significant overrepresentation of peptides in proteins with splicing variants among EOC-regulated phosphorylated peptides (Figure S3H; Table S4), which suggests a higher degree of splicing in tumor cells compared to healthy tissues. Taken together, our results suggest crosstalk among CDK7 activation, POLR2A phosphorylation, and the spliceosome machinery in EOC (Figure S4).

CDK7 Is Activated in Ovarian Cancer

To test our hypothesis on the CDK7/POLR2A axis in EOC, we first verified the activation of CDK7 and POLR2A by checking for their phosphorylation status in primary cells and in a panel of EOC cell lines. Immunoblotting analysis showed that both CDK7 and POLR2A were phosphorylated in three independent cultures of EOC, but not (or very little) in its healthy counterparts (Figure 5A). The activation of CDK7 in EOC primary cells was also confirmed in six different ovarian cancer cell lines (Figure S5A). These findings validated our phosphoproteomics analysis, which had assigned activated CDK7 to cluster B (EOC and FTE; Figure S3D). Furthermore, IHC analysis on tissue biopsy specimens supported CDK7 phosphorylation only in EOC and FTE (Figure S5B). A similar pattern of active and total CDK7 was also observed in recurrent high-grade serous ovarian carcinoma from patients previously subjected to primary debulking surgery followed by carboplatin/paclitaxel chemotherapy (Figure S5C; Table S1). Figure S5C shows two representative samples of recurrent high-grade serous ovarian carcinoma out of the 14 samples analyzed and found positive for active CDK7. Network analysis of known CDK7-associated proteins highlighted cell-cycle regulation and RNA splicing as enriched biological functions (Figure S5D; Table S5). We also observed that 29% of the CDK7-associated proteins were related to pathological conditions, including EOC and other cancer types (Figure S5D; Table S5), which points to CDK7 as a potential drug target. Furthermore, this analysis revealed a sub-network of phosphorylated proteins centered on POLR2A, which were also identified in our phosphoproteomics dataset on primary cells (Figure S5E; Table S5). Therefore, our data confirmed the existence of a functional link between CDK7 and POLR2A (Kwiatkowski et al., 2014; Nilson et al., 2015). To validate the crosstalk of CDK7 with POLR2A and the CDK7 proline-directed signature in EOC, we performed a large-scale quantitative phosphoproteomics experiment in the EOC cell line OVCAR3. We selected this cell line because it recapitulates several biological features of high-grade serous EOC, including TP53 mutations and substantial gene copy-number alterations (Domcke et al., 2013).
Untreated cells were compared to cells treated with THZ1, a specific inhibitor of CDK7 that covalently binds to a conserved cysteine in the kinase domain of CDK7 (Kwiatkowski et al., 2014). The mitogen-activated protein kinase kinase (MEK) inhibitor U0126 (Duncia et al., 1998) was used as control (Figure S6A). The specific effects of THZ1 and U0126 treatment on the phosphorylation of POLR2A and extracellular signal-regulated kinases (ERK), respectively, were confirmed by WB analysis (Figure 5B). We quantified 13,194 phosphorylation sites using label-free quantitation (Table S6) and observed high reproducibility between biological replicates with Pearson correlation coefficients above 0.75 (Figure S6B). The quality of this phosphoproteomics dataset was comparable to that observed in primary cells, as assessed by high reproducibility of the independent biological replicates (Figures S6B–S6F). By hierarchical cluster analysis of the difference in phosphorylated peptide intensity between treated and untreated controls, we identified four main groups of phosphorylation sites: 1,896 sites affected by inhibition of CDK7, 3,149 sites affected by MEK inhibition, 4,781 sites affected by inhibition of both kinases, and 3,361 sites that were unaffected by either treatment (Figure 5C). As expected, linear sequence motif analysis of the amino acid sequence surrounding the phosphorylation sites revealed a significant overrepresentation of proline-directed sites among the MEK-associated proteins (Roskoski, 2012) (Figure 5D). We observed overrepresentation of proline-directed sites among CDK7-associated proteins as well, but these sites also have significant enrichment of arginine in the -3 position (Figure 5D). The

[Figure 4 panel labels recovered from the image. (A) Heatmap (log intensity) of OSE, EOC, and FTE samples with cluster A (up in FTE and OSE), cluster B (up in FTE and EOC), and cluster C (up in EOC). (B) Enriched terms per cluster (axis: -log p value; legend: cluster A up-regulated in FTE/OSE, cluster B up-regulated in FTE/EOC, cluster C up-regulated in EOC): Spliceosome, Chromatin organization, Cell cycle, Protein localization, Cytoskeleton organization, Chromosome organization, ErbB signaling pathway, Focal adhesion, Ribosome. (C) Sequence logo of cluster C versus cluster A, positions -6 to 6 around the P-site (% difference; P value = 0.05). (D) Substrate motifs in cluster C (axis: -log p value): PKC kinase, Akt kinase, PKA kinase, MAPKAPK1 kinase, 14-3-3 domain, CDK1,2,4,6 kinase, ZIP kinase, CDK kinase, ATM kinase. (E) Cluster C spliceosome network, grouped into spliceosome core components and splicing factors; nodes: PRPF4, PRPF4B, SLU7, PRPF3, RBM25, U2AF2, THRAP3, SRRM1, CWC22, BUD13, TRA2B, HNRNPU, PNN, SNRNP200, HNRNPA1, RBM4, POLR2A, DDX5, HNRNPC, RBM8A, SRRM2, SRSF2, SRSF6, SRSF3, SRSF11, SRSF9; legend: significant after PCA; Transcription Factors; Kinases; MAPK/CDKs kinases motif; Basophilic kinases motif; Both. (F) Sequence logo, positions -6 to 6 around the P-site (% difference; P value = 0.05).]

dual RXXSP motif overrepresented among CDK7 targets in OVCAR3 cells was analogous to the enriched phosphorylation site sequence motif found among spliceosome components in patient-derived EOC (Figure 4F). The similarity with the primary cell phosphoproteome was further confirmed by the overrepresentation of RNA splicing and spliceosome components among the CDK7-associated proteins. On the contrary, MEK-associated proteins were enriched for proteins involved in cell-cycle regulation (Figure 5E). We also quantified several CDK7-regulated phosphorylated peptides containing serine 5 (S5) in the CTD repeats of POLR2A (Figure 5F). The phosphorylation of S5 in the CTD repeats of POLR2A is crucial for POLR2A function (Kwiatkowski et al., 2014; Muñoz et al., 2010). Interestingly, many of these sites were identical to those found upregulated in EOC (Figure 5F; Tables S4, S5, S6, S7, and S8). Both a highly interconnected sub-network centered on POLR2A and several transcription factors and spliceosome components were found among the CDK7-associated proteins in OVCAR3 cells, with a significant fraction of phosphorylated proteins that overlapped with those identified in patient-derived EOC (Figures 5G and S6G; Table S6). Altogether, these findings confirmed the functional crosstalk among CDK7 activation, POLR2A phosphorylation, transcription, and spliceosome components in EOC cell lines and patient-derived cells. As CDK7 has recently emerged as a prominent target for treating cancer (Cao and Shilatifard, 2014; Christensen et al., 2014), we tested EOC sensitivity to THZ1 using five EOC cell lines (OVCAR3, SKOV3, HEYA8, COV362, and COV318). THZ1 treatment reduced the activation of POLR2A after 4-hr treatment as well as cell proliferation after 72-hr treatment in all cell lines (Figures 6A and 6B).
The inhibition of phosphatidylinositol 3-kinase (PI3K), one of the few genes known to be mutated in ovarian cancer (Cancer Genome Atlas Research Network, 2011), with LY294002 was used as a control. Finally, THZ1 treatment did not affect the proliferation of HeLa, A549, or MCF7 cancer cells (Figure 6C), which are of cervix, lung, and breast cancer origin, respectively, implying that CDK7 inhibition is specific for EOC. To verify the causal link between CDK7 activation and POLR2A phosphorylation, we transfected OVCAR3, COV318, or COV362 cells with two different small interfering RNA (siRNA) sequences against CDK7. Ablation of CDK7 dramatically reduced the phosphorylation of POLR2A in all the cell lines (Figure 6D). Moreover, CDK7 knockdown also resulted in the inhibition of cell proliferation for all three cell lines (Figure 6E), thus supporting the role of CDK7 in EOC cell proliferation.
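The sequence motif described for the CDK7-associated sites (a proline at +1 and an arginine at -3 relative to the phosphorylated serine, i.e., RXXSP) can be illustrated with a minimal sketch. It only counts how many 13-residue windows (positions -6 to +6) satisfy each rule; it is not the statistical overrepresentation analysis used for Figures 4 and 5, and it does not reproduce any tool from this study. The function names and the example windows are hypothetical and were chosen for illustration only.

```python
CENTER = 6  # index of the phosphosite in a 13-residue window (positions -6..+6)


def is_proline_directed(window, center=CENTER):
    """True if S or T at position 0 is followed by P at +1."""
    return window[center] in "ST" and window[center + 1] == "P"


def has_arginine_minus3(window, center=CENTER):
    """True if R occupies position -3 relative to the phosphosite."""
    return window[center - 3] == "R"


def has_rxxsp(window, center=CENTER):
    """True for the dual motif: R at -3, S at 0, P at +1."""
    return (
        window[center] == "S"
        and has_arginine_minus3(window, center)
        and is_proline_directed(window, center)
    )


def fraction(windows, predicate):
    """Fraction of windows satisfying the predicate (0.0 for an empty list)."""
    if not windows:
        return 0.0
    return sum(1 for w in windows if predicate(w)) / len(windows)


# Hypothetical windows for illustration only; not taken from the paper's tables.
EXAMPLE_WINDOWS = [
    "GGGRAASPGGGGG",  # R at -3, S at 0, P at +1: dual RXXSP motif
    "GGGAAASPGGGGG",  # proline-directed only
    "GGGRAASAGGGGG",  # arginine at -3 only
    "GGGRAATPGGGGG",  # R at -3 and P at +1, but T (not S) at position 0
]

if __name__ == "__main__":
    print(fraction(EXAMPLE_WINDOWS, has_rxxsp))
    print(fraction(EXAMPLE_WINDOWS, is_proline_directed))
```

Comparing such fractions between two site groups (for example, CDK7-associated versus MEK-associated sites) would still require a background model and a significance test before any enrichment could be claimed.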

Altogether, our data show the specific activation of a CDK7/POLR2A axis in EOC cells and implicate CDK7 in the regulation of EOC cell proliferation.

DISCUSSION

Our proteomics and phosphoproteomics approach, applied to a low amount of patient-derived epithelial cells, uncovered a previously unknown molecular signature of EOC, paving the way for a better understanding of EOC biology and for unique opportunities of therapeutic intervention. The analysis of less than 1 mg protein for each ovarian tissue reached a deep coverage of differentially expressed proteins and phosphorylation events in line with previous studies (for instance, Elschenbroich et al., 2011; Kim et al., 2008; Waldemarson et al., 2012). It is worth noting that we reached such coverage of signaling events under steady-state growing conditions. While on one hand this allows an unbiased analysis, on the other it does not address the possible effect of individual hormones or growth factors. We believe that this simple and robust protocol can easily be implemented in translational laboratories focusing on cancer signaling and phosphoproteomics. Soon, it may even be routinely used in the clinic and complement IHC analysis. As methods for sample preparation rapidly improve (Batth and Olsen, 2016; Kulak et al., 2014), mass spectrometry-based proteomics is now robust and sufficiently reproducible to allow large-scale analysis of clinical material (Guo et al., 2015). Phosphoproteomics in particular promises to become a powerful complementary technology to transcriptomics and single-cell RNA sequencing for the analysis of patient samples. This is due to the fact that the analysis of protein or mRNA abundance alone cannot always predict changes in the level of phosphorylated proteins and hence the activity state of cellular signaling networks.
Deregulation of protein phosphorylation is a key driver of tumorigenesis; thus, the analysis of cancer phosphoproteomes is crucial not only for gathering information on cancer cell biology but also for drug discovery (Dias et al., 2015; von Stechow et al., 2015). Targeting signaling networks might emerge as the most effective personalized treatment for patients in the near future. In this study, phosphoproteomics of a specific cell type (patient-derived epithelial cells) resulted in the identification of a unique cancer signature that was also validated by IHC on whole sections and by TMA. Therefore, primary cells represent a useful in vitro model to recapitulate, at least to some extent, the histopathological complexity of cancer. We envision that, together

Figure 4. The EOC Phosphoproteome Is Enriched in Proteins Belonging to Spliceosome (A) Hierarchical clustering of the 792 phosphorylated sites differentially expressed in four independent samples derived from each of the three cell types. The three clusters termed A–C are highlighted on the right and separated by a black line. The intensity of phosphorylated sites is presented on the logarithmic scale with intensity below and above the mean color-coded in blue and red, respectively. (B) Biological Processes (GO term) enriched in each of the three clusters. (C) Sequence motif analysis of the ± six-amino-acid residues flanking the regulated phosphorylation site identified in cluster C compared to cluster A. (D) Kinases substrate motif enriched in cluster C. (E) Network of proteins belonging to the GO term spliceosome (cluster C; Table S4) based on STRING and visualized in Cytoscape. Proteins whose phosphorylated sites were found in the PCA analysis (Figure 3A) are represented with a pink border; transcription factors and kinases are color-coded in light blue and pink, respectively. The light blue and pink clouds surrounding two distinct groups of proteins are based on the comparison of this dataset with the analysis of splicing components shown in Papasaikas et al. (2015). The color of the text is based on the sequence motif analysis shown in (F). (F) Sequence motif analysis of the ± six-amino-acid residues flanking the 100 phosphorylation sites belonging to the GO term spliceosome enriched in cluster C. See also Figures S3 and S4 and Tables S4 and S5.


Figure 5. Phosphoproteomics Identifies CDK7-Associated Proteins (A) Lysates from three independent samples derived from each tissue (see Table S1) were subjected to immunoblotting with the indicated antibodies. Tubulin was used as loading control. The experiment was repeated three times with analogous results. (B) OVCAR3 cells treated with DMSO (control), the CDK7 inhibitor THZ1, or the MEK inhibitor U0126 were subjected to immunoblotting with the indicated antibodies. THZ1, but not U0126, inhibits POLR2A phosphorylation. Tubulin was used as loading control. Lanes 1–3 represent independent biological replicates.


with tumor xenograft models (Ricci et al., 2014), proteomics of patient-derived cells will be used to study the biology of EOC and other cancer types at an unprecedented molecular resolution to identify tumor-specific markers. Similar analyses need to be conducted on samples from tumors of different origins, thus improving our molecular understanding of tumorigenesis. EOC is most often diagnosed at a rarely curable late stage. Proteomic profiling of patient-derived samples may lead to the discovery of predictive markers that would guide the therapeutic decision-making process (Lee and Kohn, 2010). For example, PHGDH has been associated with cell proliferation (Du et al., 2010) and metabolic alterations in cancer (Luo, 2011). Here, we demonstrated that the high expression of PHGDH correlates with high-grade serous EOC (Figure 2). Thus, our data provide the rationale for testing the inhibition of PHGDH activity as a novel approach for EOC treatment (Pacold et al., 2016). Another relevant aspect of our analysis is its potential to help clarify the origin of EOC. Both the proteome and the phosphoproteome of primary cells exhibited clusters of hits in common between FTE and EOC, but no significant overlap was found between EOC and OSE (Figures 2, 3, and 4). These results, therefore, lend further support to the notion that at least a high proportion of high-grade serous EOC derives from the FTE (Bowtell et al., 2015). Our phosphoproteomics analysis pointed to the cyclin-dependent kinase CDK7 as a potential player in EOC development. In particular, we observed that the association of CDK7 phosphorylation with the activation of its target, RNA polymerase II (POLR2A), was a specific feature of patient-derived cancer samples and cancer cell lines (Figures 4, 5, and 6). Indeed, CDK7 phosphorylation was also detected in FTE cells, although to a lesser extent, but no POLR2A activation was found.
Mechanistically, it is possible that CDK7 activity in FTE was too low to allow for POLR2A phosphorylation. Alternatively, CDK7-mediated activation of POLR2A occurs in a tumor cell context-dependent manner. Regardless of the underlying mechanisms, these data suggest that the CDK7/POLR2A axis, rather than CDK7 activation alone, is involved in EOC development, and further research is warranted to elucidate how this axis influences the pathobiology of EOC. Our results might open a novel therapeutic window for the treatment of EOC, in line with recent studies reporting that blocking CDK7 with THZ1, a covalent inhibitor of CDK7 (Kwiatkowski et al., 2014), specifically killed triple-negative breast cancer cells (Wang et al., 2015). Perhaps CDK7-dependent phosphorylation of POLR2A (Figure 6) is responsible for cancer cell proliferation, in line with recent data linking POLR2A activation with colon

cancer (Liu et al., 2015). For example, the CDK7 inhibitor THZ1, which does not interfere with CDK7 phosphorylation (Kwiatkowski et al., 2014), can be combined with the blockade of transcriptional regulators (Asghar et al., 2015; Gonda and Ramsay, 2015). However, future studies should address how one can control for the dual role of CDK7 during cell cycle and activation of transcription (Fisher, 2005). The EOC-specific signature CDK7/POLR2A/spliceosome component is an attractive target for pharmacological intervention, as alternative splicing is a key element in gene expression and has been associated with diseases (Tazi et al., 2009). In the context of cutting-edge and multidisciplinary analysis of cancer signatures, determining changes in both the transcriptome and proteome will complement classical IHC studies, providing molecular biomarkers and targets for personalized treatments. Identifying cancer biomarkers by proteomics investigations, for example by quantitative phosphoproteomics of ex-vivo-cultured patient-derived primary cells, could lead to better-informed decisions about treatment, which translates into real benefits for patients.

EXPERIMENTAL PROCEDURES

Tissue Samples

All tissue samples were obtained upon informed consent from women (age 45–75 years) undergoing surgery at the Gynecology Division of the European Institute of Oncology (Milan) and collected via standardized operative procedures approved by the Institutional Ethical Board (European Institute of Oncology, Milano, Italy). Table S1 contains a list of the samples together with the patients’ diagnosis and the use of each sample in this study.

Cell Culture

To derive OSE and FTE cells, healthy ovarian cortical tissues and fimbriae were incubated with dispase and red blood cells were eliminated. EOC cells were derived either from peritoneal fluid (ascites) or from tumor biopsy specimens. All primary epithelial cells were cultured on collagen-I-coated plates for a maximum of three passages.
All cell lines were purchased from ATCC and maintained in the indicated conditions.

Cell Lysis and Assays

After the indicated treatment, cell extraction and immunoblotting or cell proliferation assay were performed as described previously (Francavilla et al., 2013).

Immunofluorescence

Primary cells were fixed with 4% paraformaldehyde (PFA) and incubated with primary antibodies for 2 hr at room temperature. All secondary antibodies were incubated for 1 hr at room temperature, and nuclei were counterstained with DAPI. Coverslips were then mounted with Mowiol. Images were acquired with an OLYMPUS BX63 microscope (20× objective) and processed by the software Fiji.

(C) Hierarchical clustering of proteins differentially phosphorylated upon treatment with the CDK7 or MEK inhibitor. The four identified clusters are highlighted on the left and separated by a black line. Protein intensity is presented on the logarithmic scale with treated/control intensity below and above the mean color-coded in red and blue, respectively. (D) Sequence motif analysis of the ± six-amino-acid residues flanking the phosphorylated site identified among the CDK7-associated (top) or MEK-associated (bottom) proteins. (E) GO term biological processes enriched in the CDK7 or MEK cluster. (F) List of the phosphorylated sites of POLR2A identified in OVCAR3 that are CDK7 regulated and/or found in patient-derived cells. CTD, C-terminal domain of POLR2A. S2 and S5 refer to the position of the phosphorylated serine in the repetitive stretch of amino acids found in the CTD of POLR2A. (G) Network based on STRING and visualized in Cytoscape of the CDK7-associated proteins (from C). Only the proteins also found in patient-derived cells are shown on the left. For a complete list, see Figure S6 and Table S6. CDK7-associated proteins enriched in the POLR2A cluster are shown on the right. See also Figures S5 and S6 and Table S6.


Figure 6. CDK7 Inhibition Affects the Proliferation of Ovarian Cancer Cells (A) Lysates from different ovarian cancer cell lines, either untreated or treated with DMSO, the PI3K inhibitor LY294002, or the CDK7 inhibitor THZ1 for 4 hr were subjected to immunoblotting with the indicated antibodies. Vinculin was used as loading control. The experiment was repeated three times with analogous results.


IHC Staining

The IHC analysis was carried out on four healthy ovaries, four healthy FTE, four primary EOC, and 14 recurrent EOC (Table S1). Immunostaining was performed on 3-μm sections from formalin-fixed, paraffin-embedded tissue samples. Dako EnVision+ System-HRP Labeled Polymer was used for detection, in combination with Dako chromogen substrate (Liquid DAB+ Substrate Chromogen System). Sections were counterstained with hematoxylin. Pictures of stained sections were acquired with the scanner Aperio ScanScope XT (20× objective). IHC staining was assessed by a trained pathologist (G.B.).

TMA

TMA analysis was carried out as previously described (Zecchini et al., 2008). Pictures of stained TMAs were acquired with the scanner Aperio ScanScope XT (20× objective). IHC scoring was performed by a trained pathologist (G.B.).

Sample Preparation for Mass Spectrometry

The pellet of primary cells or of OVCAR3 cells was dissolved in denaturation buffer, and 700 μg protein from each sample was analyzed. Proteins were digested with endoproteinase Lys-C and sequencing grade modified trypsin. Peptides were purified using reversed-phase Sep-Pak C18 cartridges and eluted with 50% acetonitrile. A small amount of the eluted peptides (1%) was taken for proteome analysis. The remaining peptides were used for the analysis of phosphoproteome as previously described (Jersie-Christensen et al., 2016).

SUPPLEMENTAL INFORMATION

Supplemental Information includes Supplemental Experimental Procedures, six figures, and seven tables and can be found with this article online at http://dx.doi.org/10.1016/j.celrep.2017.03.015.

AUTHOR CONTRIBUTIONS

M.L. provided part of the primary samples described in Table S1 and performed experiments shown in Figures S1C and S5A–S5C. K.T. performed the analysis of mass spectrometry data shown in Figures 2A, 2B, 3A–3C, 4A, 4B, S3B–S3H, S4, S5D, and S5E. A.V. provided part of the primary samples described in Table S1 and performed the experiments shown in Figures 2D, S1A, S1B, and S5B. K.K. performed experiments shown in Figures 6D and 6E. R.R.J.-C. helped with the preparation of samples of all the mass spectrometry experiments. G.B. is the trained pathologist who evaluated all the IHC (Figures S5B and S5C) and TMA data (Figure 2D). S.C. performed the statistical analysis of IHC results. S.B. edited the manuscript. L.J.J. conceived the experiment shown in Figures S3G and S3H, edited the manuscript, and supervised K.P.T. U.C. supervised A.V. and M.L. C.F. generated and analyzed the data shown in remaining figures and supervised K.K. and R.R.J.-C. C.F., U.C., and J.V.O. conceived the study, designed the experiments, critically evaluated the results, and wrote the manuscript.

ACKNOWLEDGMENTS

The authors thank all lab members for fruitful discussion. We are grateful to Prof. P.P. Di Fiore (IEO, Milan) for critically reading the manuscript and for his support. We thank all the patients who gave their informed consent and the clinical staff of the IEO Gynecology Division, the IEO Pathology Department, and the Biobank staff for providing the tissue samples. We thank G. Jodice, C. Luise, and all the members of Molecular Pathology Unit at IEO for their help in preparing the TMA and for the IHC staining. Work at The Novo Nordisk Foundation Center for Protein Research (CPR) is funded in part by a generous donation from the Novo Nordisk Foundation (grant number NNF14CC0001). The proteomics technology developments applied here are part of a project that has received funding from the European Union’s Horizon 2020 research and innovation programme (grant number 686547) and from the Danish Research Council (research career program FSS Sapere Aude to J.V.O.). We would like to thank the PRO-MS Danish Mass Spectrometry Platform for Functional Proteomics and the CPR Mass Spectrometry Platform for instrument support and assistance. J.V.O. was supported by the Danish Cancer Society (R90-A5844 KBVU) and Lundbeckfonden (R191-2015-703). C.F. was supported by Marie Curie IEF (FP7-PEOPLE-2009-IEF, project number 252594), EMBO Long-Term (ALTF 746-2009) post-doctoral fellowships, and the Wellcome Trust (107636/Z/15/Z). Work at the European Institute of Oncology was supported by Associazione Italiana Ricerca sul Cancro (AIRC; grant IG- 1462) and Association for International Cancer Research (AICR; grant 10-0091) (to U.C.), AIRC (fellowship number 12378) and Fondazione Umberto Veronesi postdoctoral fellowships (to A.V.), and Fondazione Istituto Europeo di Oncologia postdoctoral fellowships (to M.L.).

ACCESSION NUMBERS

The accession number for the mass spectrometry proteomics data reported in this paper is ProteomeXchange: PXD003531 (Vizcaino et al., 2010) (project name: Proteomics of Primary cells derived from Ovarian Cancer; reviewer account details: [email protected]; password: DdbisBPj).

Received: October 6, 2016 Revised: January 4, 2017 Accepted: March 2, 2017 Published: March 28, 2017

Mass Spectrometry Analysis

Peptide mixtures were analyzed using an EASY-nLC system (Proxeon) connected to a Q-Exactive mass spectrometer (Thermo Fisher Scientific), as described previously (Kelstrup et al., 2012).

Raw Files Analysis

Raw data were analyzed by the MaxQuant software suite (Cox and Mann, 2008), version 1.4.1.4, using the integrated Andromeda search engine (Cox et al., 2011). Only peptides with an Andromeda score >40 were included.

Data Analysis

The samples were grouped in three categories representing EOC, FTE, and OSE and we used the LIMMA package of Bioconductor in R (Wettenhall and Smyth, 2004) to detect significant changes in abundance among the three groups.

Statistics

All experiments were performed at least three times. The mass spectrometry data were normalized before further analysis. p values were calculated by Student’s two-tailed t test, Wilcoxon test, or Fisher’s exact test, as indicated. A statistically significant difference was concluded when p < 0.05 or p < 0.001 as reported in the figure legends.
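Two of the steps named above, the Andromeda score cutoff and Student's t test, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the MaxQuant or LIMMA workflow used in the study: the peptide records are hypothetical dictionaries with assumed `sequence` and `score` keys, the numbers are invented, and only the t statistic is computed (converting it to a two-tailed p value would additionally require the t distribution with n1 + n2 - 2 degrees of freedom).

```python
import math
from statistics import mean, variance

ANDROMEDA_SCORE_CUTOFF = 40  # threshold stated in the text


def filter_by_score(peptides, cutoff=ANDROMEDA_SCORE_CUTOFF):
    """Keep peptide records whose score is strictly greater than the cutoff."""
    return [p for p in peptides if p["score"] > cutoff]


def student_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Assumes equal variances in both groups and at least two values per group.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


# Hypothetical records; the "sequence" and "score" keys are assumptions.
EXAMPLE_PEPTIDES = [
    {"sequence": "PEPTIDEA", "score": 40.0},   # dropped: not above the cutoff
    {"sequence": "PEPTIDEB", "score": 41.0},   # kept
    {"sequence": "PEPTIDEC", "score": 12.5},   # dropped
]

if __name__ == "__main__":
    kept = filter_by_score(EXAMPLE_PEPTIDES)
    print([p["sequence"] for p in kept])
    # Invented log-intensity values for one site in two groups of three samples.
    print(student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```

The strict inequality mirrors the ">40" wording of the text; whether a score of exactly 40 was retained in the original analysis cannot be determined from this section.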

(B) Cell proliferation of ovarian cancer cells treated for 72 hr with the indicated inhibitors. Data represent the mean ± SEM of three experiments. *p value < 0.05 compared to untreated cells or cells treated with DMSO. Black line represents untreated cells at time 0. (C) Cell proliferation of cancer cells of different origin treated for 72 hr with THZ1 or DMSO as indicated. Data represent the mean ± SEM of three experiments. Black line represents untreated cells at time 0. (D) Lysates from OVCAR3, COV318, or COV362 cells, either not transfected or transfected with two different siRNA sequences against CDK7 or with a control siRNA, were subjected to immunoblotting with the indicated antibodies. (E) Cell proliferation of EOC cell lines treated as in (D) for 72 hr. Data represent the mean ± SEM of three experiments. *p < 0.05 compared to untreated cells or cells treated with control siRNA. Black line represents untreated cells at time 0.


REFERENCES

Aebersold, R., and Mann, M. (2016). Mass-spectrometric exploration of proteome structure and function. Nature 537, 347–355.

Guo, T., Kouvonen, P., Koh, C.C., Gillet, L.C., Wolski, W.E., Röst, H.L., Rosenberger, G., Collins, B.C., Blum, L.C., Gillessen, S., et al. (2015). Rapid mass spectrometric conversion of tissue biopsy samples into permanent quantitative digital proteome maps. Nat. Med. 21, 407–413.

Asghar, U., Witkiewicz, A.K., Turner, N.C., and Knudsen, E.S. (2015). The history and future of targeting cyclin-dependent kinases in cancer therapy. Nat. Rev. Drug Discov. 14, 130–146.

Gurung, A., Hung, T., Morin, J., and Gilks, C.B. (2013). Molecular abnormalities in ovarian carcinoma: clinical, morphological and therapeutic correlates. Histopathology 62, 59–70.

Batth, T.S., and Olsen, J.V. (2016). Offline high pH reversed-phase peptide fractionation for deep phosphoproteome coverage. Methods Mol. Biol. 1355, 179–192.

Jersie-Christensen, R.R., Sultan, A., and Olsen, J.V. (2016). Simple and reproducible sample preparation for single-shot phosphoproteomics with high sensitivity. Methods Mol. Biol. 1355, 251–260.

Beck, M., Claassen, M., and Aebersold, R. (2011). Comprehensive proteomics. Curr. Opin. Biotechnol. 22, 3–8.

Jones, P.M., and Drapkin, R. (2013). Modeling high-grade serous carcinoma: how converging insights into pathogenesis and genetics are driving better experimental platforms. Front. Oncol. 3, 217.

Bowtell, D.D., Böhm, S., Ahmed, A.A., Aspuria, P.J., Bast, R.C., Jr., Beral, V., Berek, J.S., Birrer, M.J., Blagden, S., Bookman, M.A., et al. (2015). Rethinking ovarian cancer II: reducing mortality from high-grade serous ovarian cancer. Nat. Rev. Cancer 15, 668–679.

Cancer Genome Atlas Research Network (2011). Integrated genomic analyses of ovarian carcinoma. Nature 474, 609–615.

Cao, K., and Shilatifard, A. (2014). Inhibit globally, act locally: CDK7 inhibitors in cancer therapy. Cancer Cell 26, 158–159.

Cathomas, F., Stegen, M., Sigrist, H., Schmid, L., Seifritz, E., Gassmann, M., Bettler, B., and Pryce, C.R. (2015). Altered emotionality and neuronal excitability in mice lacking KCTD12, an auxiliary subunit of GABAB receptors associated with mood disorders. Transl. Psychiatry 5, e510.

Kelstrup, C.D., Young, C., Lavallee, R., Nielsen, M.L., and Olsen, J.V. (2012). Optimized fast and sensitive acquisition methods for shotgun proteomics on a quadrupole orbitrap mass spectrometer. J. Proteome Res. 11, 3487–3497.

Kim, H., Wu, R., Cho, K.R., Thomas, D.G., Gossner, G., Liu, J.R., Giordano, T.J., Shedden, K.A., Misek, D.E., and Lubman, D.M. (2008). Comparative proteomic analysis of low stage and high stage endometrioid ovarian adenocarcinomas. Proteomics Clin. Appl. 2, 571–584.

Kim, M.S., Pinto, S.M., Getnet, D., Nirujogi, R.S., Manda, S.S., Chaerkady, R., Madugundu, A.K., Kelkar, D.S., Isserlin, R., Jain, S., et al. (2014). A draft map of the human proteome. Nature 509, 575–581.

Christensen, C.L., Kwiatkowski, N., Abraham, B.J., Carretero, J., Al-Shahrour, F., Zhang, T., Chipumuro, E., Herter-Sprie, G.S., Akbay, E.A., Altabef, A., et al. (2014). Targeting transcriptional addictions in small cell lung cancer with a covalent CDK7 inhibitor. Cancer Cell 26, 909–922.

Kobuke, K., Furukawa, Y., Sugai, M., Tanigaki, K., Ohashi, N., Matsumori, A., Sasayama, S., Honjo, T., and Tashiro, K. (2001). ESDN, a novel neuropilin-like membrane protein cloned from vascular cells with the longest secretory signal sequence among eukaryotes, is up-regulated after vascular injury. J. Biol. Chem. 276, 34105–34114.

Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat. Biotechnol. 26, 1367–1372.

Kulak, N.A., Pichler, G., Paron, I., Nagaraj, N., and Mann, M. (2014). Minimal, encapsulated proteomic-sample processing applied to copy-number estimation in eukaryotic cells. Nat. Methods 11, 319–324.

Cox, J., Neuhauser, N., Michalski, A., Scheltema, R.A., Olsen, J.V., and Mann, M. (2011). Andromeda: a peptide search engine integrated into the MaxQuant environment. J. Proteome Res. 10, 1794–1805.

Kurman, R.J., and Shih, IeM. (2016). The dualistic model of ovarian carcinogenesis: revisited, revised, and expanded. Am. J. Pathol. 186, 733–747.

Dias, M.H., Kitano, E.S., Zelanis, A., and Iwai, L.K. (2015). Proteomics and drug discovery in cancer. Drug Discov. Today 21, 264–277.

Domcke, S., Sinha, R., Levine, D.A., Sander, C., and Schultz, N. (2013). Evaluating cell lines as tumour models by comparison of genomic profiles. Nat. Commun. 4, 2126.

Du, H., Vitiello, D., Sarno, J.L., and Taylor, H.S. (2010). 3-Phosphoglycerate dehydrogenase expression is regulated by HOXA10 in murine endometrium and human endometrial cells. Reproduction 139, 237–245.

Duncia, J.V., Santella, J.B., 3rd, Higley, C.A., Pitts, W.J., Wityak, J., Frietze, W.E., Rankin, F.W., Sun, J.H., Earl, R.A., Tabaka, A.C., et al. (1998). MEK inhibitors: the chemistry and biological activity of U0126, its analogs, and cyclization products. Bioorg. Med. Chem. Lett. 8, 2839–2844.

Elschenbroich, S., Ignatchenko, V., Clarke, B., Kalloger, S.E., Boutros, P.C., Gramolini, A.O., Shaw, P., Jurisica, I., and Kislinger, T. (2011). In-depth proteomics of ovarian cancer ascites: combining shotgun proteomics and selected reaction monitoring mass spectrometry. J. Proteome Res. 10, 2286–2299.

Fisher, R.P. (2005). Secrets of a double agent: CDK7 in cell-cycle control and transcription. J. Cell Sci. 118, 5171–5180.

Francavilla, C., Rigbolt, K.T., Emdal, K.B., Carraro, G., Vernet, E., Bekker-Jensen, D.B., Streicher, W., Wikström, M., Sundström, M., Bellusci, S., et al. (2013). Functional proteomics defines the molecular switch underlying FGF receptor trafficking and cellular outputs. Mol. Cell 51, 707–722.

Geiger, T., Madden, S.F., Gallagher, W.M., Cox, J., and Mann, M. (2012). Proteomic portrait of human breast cancer progression identifies novel prognostic markers. Cancer Res. 72, 2428–2439.

Gonda, T.J., and Ramsay, R.G. (2015). Directly targeting transcriptional dysregulation in cancer. Nat. Rev. Cancer 15, 686–694.

Kwiatkowski, N., Zhang, T., Rahl, P.B., Abraham, B.J., Reddy, J., Ficarro, S.B., Dastur, A., Amzallag, A., Ramaswamy, S., Tesar, B., et al. (2014). Targeting transcription regulation in cancer with a covalent CDK7 inhibitor. Nature 511, 616–620.

Lee, J.M., and Kohn, E.C. (2010). Proteomics as a guiding tool for more effective personalized therapy. Ann. Oncol. 21 (Suppl 7), vii205–vii210.

Levanon, K., Ng, V., Piao, H.Y., Zhang, Y., Chang, M.C., Roh, M.H., Kindelberger, D.W., Hirsch, M.S., Crum, C.P., Marto, J.A., and Drapkin, R. (2010). Primary ex vivo cultures of human fallopian tube epithelium as a model for serous ovarian carcinogenesis. Oncogene 29, 1103–1113.

Liu, Y., Zhang, X., Han, C., Wan, G., Huang, X., Ivan, C., Jiang, D., Rodriguez-Aguayo, C., Lopez-Berestein, G., Rao, P.H., et al. (2015). TP53 loss creates therapeutic vulnerability in colorectal cancer. Nature 520, 697–701.

Lundby, A., Andersen, M.N., Steffensen, A.B., Horn, H., Kelstrup, C.D., Francavilla, C., Jensen, L.J., Schmitt, N., Thomsen, M.B., and Olsen, J.V. (2013). In vivo phosphoproteomics analysis reveals the cardiac targets of β-adrenergic receptor signaling. Sci. Signal. 6, rs11.

Luo, J. (2011). Cancer’s sweet tooth for serine. Breast Cancer Res. 13, 317.

Mertins, P., Mani, D.R., Ruggles, K.V., Gillette, M.A., Clauser, K.R., Wang, P., Wang, X., Qiao, J.W., Cao, S., Petralia, F., et al.; NCI CPTAC (2016). Proteogenomics connects somatic mutations to signalling in breast cancer. Nature 534, 55–62.

Mitchell, C.J., Getnet, D., Kim, M.S., Manda, S.S., Kumar, P., Huang, T.C., Pinto, S.M., Nirujogi, R.S., Iwasaki, M., Shaw, P.G., et al. (2015). A multiomic analysis of human naïve CD4+ T cells. BMC Syst. Biol. 9, 75.

Muñoz, M.J., de la Mata, M., and Kornblihtt, A.R. (2010). The carboxy terminal domain of RNA polymerase II and alternative splicing. Trends Biochem. Sci. 35, 497–504.


Neunteufel, W., and Breitenecker, G. (1989). Tissue expression of CA 125 in benign and malignant lesions of ovary and fallopian tube: a comparison with CA 19-9 and CEA. Gynecol. Oncol. 32, 297–302.

Nilson, K.A., Guo, J., Turek, M.E., Brogie, J.E., Delaney, E., Luse, D.S., and Price, D.H. (2015). THZ1 reveals roles for Cdk7 in co-transcriptional capping and pausing. Mol. Cell 59, 576–587.

Ntai, I., LeDuc, R.D., Fellers, R.T., Erdmann-Gilmore, P., Davies, S.R., Rumsey, J., Early, B.P., Thomas, P.M., Li, S., Compton, P.D., et al. (2016). Integrated bottom-up and top-down proteomics of patient-derived breast tumor xenografts. Mol. Cell. Proteomics 15, 45–56.

Olsen, J.V., and Mann, M. (2013). Status of large-scale analysis of post-translational modifications by mass spectrometry. Mol. Cell. Proteomics 12, 3444–3452.

Pacold, M.E., Brimacombe, K.R., Chan, S.H., Rohde, J.M., Lewis, C.A., Swier, L.J., Possemato, R., Chen, W.W., Sullivan, L.B., Fiske, B.P., et al. (2016). A PHGDH inhibitor reveals coordination of serine synthesis and one-carbon unit fate. Nat. Chem. Biol. 12, 452–458.

Papasaikas, P., Rao, A., Huggins, P., Valcarcel, J., and Lopez, A. (2015). Reconstruction of composite regulator-target splicing networks from high-throughput transcriptome data. BMC Genomics 16 (Suppl 10), S7.

Pozniak, Y., Balint-Lahat, N., Rudolph, J.D., Lindskog, C., Katzir, R., Avivi, C., Pontén, F., Ruppin, E., Barshack, I., and Geiger, T. (2016). System-wide clinical proteomics of breast cancer reveals global remodeling of tissue homeostasis. Cell Syst. 2, 172–184.

Ricci, F., Bizzaro, F., Cesca, M., Guffanti, F., Ganzinelli, M., Decio, A., Ghilardi, C., Perego, P., Fruscio, R., Buda, A., et al. (2014). Patient-derived ovarian tumor xenografts recapitulate human clinicopathology and genetic alterations. Cancer Res. 74, 6980–6990.

Roskoski, R., Jr. (2012). ERK1/2 MAP kinases: structure, function, and regulation. Pharmacol. Res. 66, 105–143.

Symeonides, S., and Gourley, C. (2015). Ovarian cancer molecular stratification and tumor heterogeneity: a necessity and a challenge. Front. Oncol. 5, 229.


Tazi, J., Bakkour, N., and Stamm, S. (2009). Alternative splicing and disease. Biochim. Biophys. Acta 1792, 14–26.

van den Biggelaar, M., Hernández-Fernaud, J.R., van den Eshof, B.L., Neilson, L.J., Meijer, A.B., Mertens, K., and Zanivan, S. (2014). Quantitative phosphoproteomics unveils temporal dynamics of thrombin signaling in human endothelial cells. Blood 123, e22–e36.

Vizcaino, J.A., Reisinger, F., Cote, R., and Martens, L. (2010). PRIDE: data submission and analysis. Curr. Protoc. Protein Sci. Chapter 25, Unit 25.4.

von Stechow, L., Francavilla, C., and Olsen, J.V. (2015). Recent findings and technological advances in phosphoproteomics for cells and tissues. Expert Rev. Proteomics 12, 469–487.

Waldemarson, S., Krogh, M., Alaiya, A., Kirik, U., Schedvins, K., Auer, G., Hansson, K.M., Ossola, R., Aebersold, R., Lee, H., et al. (2012). Protein expression changes in ovarian cancer during the transition from benign to malignant. J. Proteome Res. 11, 2876–2889.

Wang, Y., Zhang, T., Kwiatkowski, N., Abraham, B.J., Lee, T.I., Xie, S., Yuzugullu, H., Von, T., Li, H., Lin, Z., et al. (2015). CDK7-dependent transcriptional addiction in triple-negative breast cancer. Cell 163, 174–186.

Wettenhall, J.M., and Smyth, G.K. (2004). limmaGUI: a graphical user interface for linear modeling of microarray data. Bioinformatics 20, 3705–3706.

Wilhelm, M., Schlegl, J., Hahne, H., Gholami, A.M., Lieberenz, M., Savitski, M.M., Ziegler, E., Butzmann, L., Gessulat, S., Marx, H., et al. (2014). Mass-spectrometry-based draft of the human proteome. Nature 509, 582–587.

Zecchini, S., Bianchi, M., Colombo, N., Fasani, R., Goisis, G., Casadio, C., Viale, G., Liu, J., Herlyn, M., Godwin, A.K., et al. (2008). The differential role of L1 in ovarian carcinoma and normal ovarian surface epithelium. Cancer Res. 68, 1110–1118.

Zhang, H., Liu, T., Zhang, Z., Payne, S.H., Zhang, B., McDermott, J.E., Zhou, J.Y., Petyuk, V.A., Chen, L., Ray, D., et al.; CPTAC Investigators (2016). Integrated proteogenomic characterization of human high-grade serous ovarian cancer. Cell 166, 755–765.

Cell Reports, Volume 18

Supplemental Information

Phosphoproteomics of Primary Cells Reveals Druggable Kinase Signatures in Ovarian Cancer

Chiara Francavilla, Michela Lupia, Kalliopi Tsafou, Alessandra Villa, Katarzyna Kowalczyk, Rosa Rakownikow Jersie-Christensen, Giovanni Bertalot, Stefano Confalonieri, Søren Brunak, Lars J. Jensen, Ugo Cavallaro, and Jesper V. Olsen

To be considered as a Resource Article in Cell Reports

SUPPLEMENTAL INFORMATION

Supplemental Information contains:

- Six Supplemental Figures and Figure Legends.
- Supplemental Table Legends.
- Supplemental Experimental Procedures.
- Supplemental References.


SUPPLEMENTAL FIGURES AND TABLES LEGENDS


Figure S1. Related to Figure 1. Patient-derived cells are of epithelial origin. (A) Representative images of patient-derived cells in culture show morphology characteristic of epithelial cells (FTE, OSE). The spindle-shaped EOC cells (right panel) suggest an epithelial-to-mesenchymal transition (Davidson et al., 2012). Scale bar, 50 μm. (B) Patient-derived cells from the indicated tissue were stained with the endothelial marker CD31, the lymphocyte marker CD45, and the fibroblast marker FAP in combination with the epithelial marker cytokeratin 5 (CK5) to show the purity of the epithelial cell preparation. Fresh human leukocytes were used as positive control for the CD31 (top panel on the right) or CD45 (middle panel on the right) markers. Rare fibroblasts (less than 0.1% of the whole cell preparation) were used as the positive control for FAP staining (bottom panel on the right). Representative images are shown. Scale bar, 50 μm. (C) Patient-derived primary cells from the indicated tissue were stained with the following specific markers to confirm their origin: calretinin is an OSE marker; PAX8 marks the secretory cells of FTE; and WT1 is a marker of both OSE and FTE. As previously reported (Davidson and Trope, 2014), EOC cells were positive for both PAX8 and WT1. Cytokeratin 7 (CK7), an established marker of the female gynecological tract epithelium, was abundant in all primary cell types.


Figure S2. Related to Figure 1. MS data quality control shows robustness and reproducibility. The distributions of the peptide mass error (A) and of the score of the phosphorylated peptides (B) show that most of the peptides were identified with high mass accuracy (mass error less than 2 p.p.m.) and a high Andromeda score. (C) Distribution of serine (Ser), threonine (Thr) and tyrosine (Tyr) phosphorylated sites identified in this study. (D) Distribution of phosphopeptides with one, two, or more phosphorylated sites. (E, F) The number of overlapping proteins (E) or phosphorylated sites (F) within technical replicates in EOC shows high reproducibility. (G) The intensity-based density plots of the proteome (left) and phosphoproteome (right) between all technical replicates show high reproducibility in EOC. (H) Unsupervised clustering of the proteome dataset shows separation between tumor and healthy samples. (I) Heatmap of the Pearson's correlation (R²) of the proteome data shows good overall reproducibility among samples deriving from the same tissue (highlighted by black boxes). Numbers and letters indicate individual samples or technical replicates, respectively, according to Table S1. (J) The intensity-based density plots show poor correlation between the abundance of phosphorylated peptides and that of their corresponding proteins in all three cell types.
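
The pairwise correlation summarized in panel (I) can be illustrated with a short sketch. This is not the code used for the figure: the sample names and log10 intensities below are hypothetical, and the sketch assumes that each sample is represented by a vector of intensities for the same set of proteins, with no missing values.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length intensity vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def r_squared_matrix(samples):
    """Pairwise R^2 for samples given as {sample name: [log10 intensities]}."""
    names = sorted(samples)
    return {(a, b): pearson_r(samples[a], samples[b]) ** 2
            for a in names for b in names}


# Hypothetical technical replicates (illustrative values only).
replicates = {
    "EOC_1a": [6.1, 7.3, 5.8, 8.0],
    "EOC_1b": [6.0, 7.4, 5.9, 7.9],
}
r2 = r_squared_matrix(replicates)
```

In a heatmap such as the one in panel (I), each cell would correspond to one entry of `r2`, with values close to 1 for replicates of the same sample.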


Figure S3. Related to Figure 4. CDK7 and splicing events are regulated in EOC. (A) Number of proteins and phosphorylated sites in Clusters A-C (see Figure 4A). (B) Venn diagram showing the overlap between the phosphorylated proteins identified in this study and in Zhang et al. (2016). (C) Network of proteins belonging to cluster A (see Figure 4A) based on STRING and visualized in Cytoscape. (D) List of the phosphorylated sites up-regulated in EOC and FTE (cluster B in Figure 4A). CDK7, *: the doubly phosphorylated peptide is reported (see Tables S3 and S4). (E) Representative MS/MS spectrum of the CDK7 doubly phosphorylated peptide. (F) Representative MS/MS spectrum of one of the phosphorylated peptides of POLR2A. (G) Pipeline of the analysis of splicing variants in the dataset. (H) Number of deregulated peptides found in proteins with alternative variants by comparing EOC and healthy tissues.


Figure S4. Related to Figure 4. The GO terms spliceosome and signaling are enriched in EOC. Network of proteins whose phosphorylated sites were significantly enriched in EOC compared to healthy tissues, based on STRING and Dapple and visualized in Cytoscape. Colored clouds show the top-regulated biological processes. Triangles represent genes whose mutations have been found in the COSMIC database (ovarian and breast cancers only). This supported the depth of our analysis. Among the highly interconnected groups of proteins, we found proteins involved in splicing and translation to be up-regulated in EOC, whereas sites on signaling proteins were down-regulated in EOC. One-third of all EOC-specific sites in cluster C were associated with the GO term spliceosome (Table S4). Phosphorylation sites on proteins involved in chromosome organization and cell cycle were up-regulated in both FTE and EOC. These findings highlight these important cellular functions as targets of differential phosphorylation between EOC and healthy cells.


Figure S5. Related to Figure 5. CDK7 and POLR2A are phosphorylated in human tissues and ovarian cancer cell lines and are interconnected. (A) Lysates from different ovarian cancer (EOC) cell lines were subjected to immunoblotting with the indicated antibodies. Vinculin was used as loading control. The experiment was repeated twice with analogous results. Both CDK7 and POLR2A were phosphorylated in all the tested EOC cell lines, suggesting that this is a general hallmark of these cells. (B) Validation of the phosphorylation of CDK7 by immunohistochemistry (IHC) on independent formalin-fixed, paraffin-embedded tissue biopsies (see Table S1). Representative serial images are shown on the left. Scale bar, 100 μm. Enlarged sections from the rectangular area are shown on the right of the respective staining. Tissue sections incubated with the secondary antibodies alone showed no staining and are not shown. Arrows indicate the OSE in normal ovary, the FTE in normal fimbriae, and the tumor tissue in EOC. Right panel: summary table of the IHC signal intensities (arbitrary units) of each target protein (columns) in each tissue (rows). Active CDK7 was strongly represented in the nuclei of EOC and FTE, whereas it was heterogeneous in OSE and detected in groups of cells. (C) Recurrent EOC retains the expression of active and total CDK7. Two independent samples of recurrent high-grade serous ovarian carcinoma from patients who had been previously treated with primary debulking surgery followed by carboplatin/paclitaxel chemotherapy (Table S1) were stained for active (p-CDK7) or total CDK7. Representative images are shown. Control staining was performed omitting the primary antibodies (no primary Ab). Scale bar, 100 μm. Enlarged sections from the rectangular area are shown on the right of the respective staining. The images refer to two representative cases out of the fourteen recurrent EOC samples that were analysed and found to have active CDK7.
(D) Network of known CDK7-associated proteins based on STRING and visualized in Cytoscape with an emphasis on their role in cell cycle regulation or RNA splicing. The significance in ovarian (dark pink) or other cancer types (light pink) is based on Lawrence et al. (2014). Border colors indicate the relation of the
genes to other diseases according to the database DISEASES (Pletscher-Frankild et al., 2015). The color of the labels indicates the regulation of sites in patient-derived cells based on the comparison between EOC and healthy tissues. (E) Network of the POLR2A-associated proteins identified in the phosphoproteome of patient-derived cells with specific biological functions and belonging to Clusters A-C (see Figure 4A) based on STRING and visualized in Cytoscape.


Figure S6. Related to Figure 5. OVCAR3 phosphoproteomics reveals novel CDK7-associated proteins. (A) Workflow of the OVCAR3 phosphoproteomics experiment performed in triplicate. (B) Heatmaps of the Pearson's correlation (R²) of the OVCAR3 phosphoproteome show good overall reproducibility among biological replicates (highlighted by black boxes). See also Table S6. The analysis of the proteome from the same cells did not reveal any significant difference upon four-hour treatment (data not shown); therefore, we compared only the phosphoproteome. The distributions of the peptide mass error (C) and of the score of the phosphorylated peptides (D) show that most of the peptides were identified with high mass accuracy (mass error less than 2 p.p.m.) and a high Andromeda score. (E) Distribution of serine (Ser), threonine (Thr) and tyrosine (Tyr) phosphorylated sites identified in OVCAR3 cells. (F) Distribution of phosphopeptides with one, two, or more phosphorylated sites. (G) Network of the CDK7-associated proteins from Figure 5C based on STRING and visualized in Cytoscape.
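
For readers unfamiliar with the unit, the mass error quoted in panels (C) and (D) is a relative deviation expressed in parts per million. The sketch below shows the standard calculation; the function names and the m/z values are hypothetical, and the 2 p.p.m. default simply mirrors the threshold mentioned in the legend.

```python
def mass_error_ppm(observed_mz, theoretical_mz):
    """Relative deviation of an observed m/z from its theoretical value, in ppm."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_tolerance(observed_mz, theoretical_mz, tol_ppm=2.0):
    """True if the absolute mass error does not exceed tol_ppm."""
    return abs(mass_error_ppm(observed_mz, theoretical_mz)) <= tol_ppm


# Hypothetical precursor: a 0.0005 deviation at m/z 500 is about 1 ppm.
example_error = mass_error_ppm(500.0005, 500.0)
```

Because the error is relative, the same absolute deviation counts for more at low m/z than at high m/z.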

Table S1. Related to Figures 1, 2, 4, 5, and to Experimental Procedures. List of patient-derived samples used in this study. N: arbitrary and sequential numbers. Samples 1, 2, and 4 have been separated into two technical replicates. Sample: samples derived from normal FTE (FTE), normal ovarian surface epithelium (OSE) or from ascites of patients with high-grade serous ovarian cancer (EOC). Samples 43 to 56 are independent samples of recurrent high-grade serous ovarian carcinoma from patients who had been previously treated with primary debulking surgery followed by carboplatin/paclitaxel chemotherapy. Diagnosis: diagnosis of the patient from whom the sample was derived. Technique: MS analysis, WB, IHC, IF indicate whether the sample has been used for the proteomic, western blot, immunohistochemical, or imaging analysis, respectively. In the case of IHC we did not use patient-derived epithelial cells but part of the entire biopsy. Please note that we have
used different sample sets for different analyses depending on the amount of cells or proteins available from each primary culture. Note: additional information on a particular sample.

Table S2. Related to Figures 1 and 2. List of all the identified proteins in patient-derived cells. Of the 5561 identified proteins, we filtered out those identified with fewer than two peptides and with poor sequence coverage (less than 5%) before further analysis (4457 proteins remaining). MUC16/CA125 and PRSS8 are identified by an asterisk. Proteins in red were differentially expressed in the three tissues based on intensity. The log10 intensity is reported for all the samples as well as the normalized mean. We also report the significant p value after the analysis with the LIMMA Bioconductor package in R and the cluster.
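
The filter described in this legend can be sketched in a few lines. The snippet is a minimal illustration and not the authors' pipeline: the `ProteinHit` record, its field names, and the example accessions are hypothetical, and it assumes that failing either criterion (fewer than two peptides, or sequence coverage below 5%) is enough to remove a protein.

```python
from dataclasses import dataclass


@dataclass
class ProteinHit:
    accession: str       # hypothetical identifier
    n_peptides: int      # number of peptides identifying the protein
    coverage_pct: float  # sequence coverage in percent


def passes_filter(hit, min_peptides=2, min_coverage_pct=5.0):
    """Keep proteins with at least two peptides and at least 5% coverage."""
    return hit.n_peptides >= min_peptides and hit.coverage_pct >= min_coverage_pct


# Illustrative records only; they do not come from Table S2.
hits = [
    ProteinHit("PROT_A", 12, 34.0),
    ProteinHit("PROT_B", 1, 20.0),   # removed: a single peptide
    ProteinHit("PROT_C", 3, 2.5),    # removed: coverage below 5%
]
kept = [hit.accession for hit in hits if passes_filter(hit)]
```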

Table S3. Related to Figures 1 and 3-5. List of the identified phosphorylated sites in patient-derived cells. Sequence window, phosphorylated amino acid residue, the log10 intensity of all phosphorylated sites with localization probabilities higher than 0.75 (class I), the normalized mean, the GO terms, the significant p value after the analysis with the LIMMA Bioconductor package in R, and the significance in the PCA analysis or after the comparison with Lawrence et al. (2014) are reported for all the samples. CDK7 is shown in red. For the 792 differentially expressed phosphorylated sites in patient-derived cells, the p-values < 0.05 and p-values >= 0.05 are color-coded in shades of red and of blue, respectively. The phosphorylated sites whose proteins have also been identified in the proteome dataset are marked with a YES. Sites whose p-value remained significant after the normalization of the intensity of the site over the intensity of its protein are marked with a '+' (175 out of 665 sites). Peptides found in proteins with splicing variants are marked with a '+'. CDK7 and POLR2A are shown in red. CDK7: the doubly phosphorylated peptide is reported.
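
The normalization of a site's intensity over the intensity of its protein can be illustrated as follows. This is a sketch under stated assumptions rather than the procedure actually used: it assumes both intensities are available on the log10 scale (as reported in Tables S2 and S3), so that the ratio becomes a subtraction, and all identifiers and values are hypothetical.

```python
def normalize_sites_to_protein(site_log10, site_to_protein, protein_log10):
    """Return log10(site / protein) for every site whose protein was quantified.

    Dividing raw intensities is equivalent to subtracting log10 intensities.
    Sites whose protein is absent from the proteome data are skipped.
    """
    normalized = {}
    for site, value in site_log10.items():
        protein = site_to_protein[site]
        if protein in protein_log10:
            normalized[site] = value - protein_log10[protein]
    return normalized


# Hypothetical identifiers and values, for illustration only.
site_log10 = {"PROT1_S10": 7.0, "PROT2_T5": 6.5}
site_to_protein = {"PROT1_S10": "PROT1", "PROT2_T5": "PROT2"}
protein_log10 = {"PROT1": 8.5}  # PROT2 was not found in the proteome
ratios = normalize_sites_to_protein(site_log10, site_to_protein, protein_log10)
```

Skipping sites without a quantified protein mirrors the fact that only sites marked with a YES in the table can be normalized in this way.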


Table S4. Related to Figures 3-4. List of the phosphorylated proteins belonging to the GO term spliceosome. Sequence window, phosphorylated amino acid residue, the biological function, the significance in the PCA analysis or after comparison with the proteome analysis, the kinase motif, and the function according to Papasaikas et al. (2015) are reported. POLR2A is shown in red.

Table S5. Related to Figure 4. List of known CDK7-associated proteins. The list of known POLR2A-associated proteins identified in patient-derived cells is highlighted in blue.

Table S6. Related to Figure 5. List of the identified phosphorylated sites in OVCAR3. Sequence window, phosphorylated amino acid residue, the log10 intensity of all phosphorylated sites with localization probabilities higher than 0.75 (class I), motif analysis, the GO terms, biological functions, the presence in patient-derived cells, the name of the cluster are reported. POLR2A is shown in red. References for the column “Domains for splicing?”: (a) PMID: 11121472; (b) PMID: 24795046; (c) PMID: 12105215; (d) PMID: 9671816; (e) PMID: 10827081.

Table S7. Related to Figure 4. Perl script used to generate Figure S3 G-H.

SUPPLEMENTAL EXPERIMENTAL PROCEDURES

Reagents

The following commercial reagents were used: DMSO (Sigma-Aldrich, St. Louis, MO); the EGFR inhibitor AG1478, the PI3K inhibitor LY294002, and the MEK inhibitor U0126 (Cell Signaling Technology, Danvers, MA); the CDK7 inhibitor THZ1 (ApexBio, Houston, US).


Antibodies: mouse anti-vinculin and mouse anti-tubulin (V9264 and T8203; Sigma-Aldrich); mouse anti-phospho-ERK1/2 (E10) and rabbit anti-ERK1/2 (137F5) (9106 and 4695; Cell Signaling Technology); rabbit anti-phospho S473 AKT and anti-AKT (4060 and 4691; Cell Signaling Technology); rat anti-phospho S5 POLR2A (04-1572; Millipore; Kwiatkowski et al., 2014); rabbit anti-POLR2A and anti-CDK7 (sc-17798 and sc-7344; Santa Cruz Biotechnology; Kwiatkowski et al., 2014); mouse anti-CA125/MUC16 (CA1004; Millipore); rabbit anti-PHGDH, anti-KCTD12 and anti-phospho T170 CDK7 (PA5-27578, PA5-26281 and PA5-12573; ThermoFisher; Kwiatkowski et al., 2014); rabbit anti-DCBLD2 (NBP1-85582; Novus Biologicals); mouse anti-CD31 (M0823; Dako); mouse anti-CD45 (MAB1430; R&D Systems); rabbit anti-CK5 (PA5-29670; Invitrogen); rat anti-FAP (MABS1001; Vitatex); mouse anti-WT1 (M3561; Dako); rabbit anti-PAX8 (10336-1-AP; ProteinTech); rabbit anti-calretinin (ab92341; Abcam); rabbit anti-CK7 (ab68459; Abcam).

Primary cells culture

Fresh tissue samples were obtained from patients undergoing surgery at the Gynaecology Division of the European Institute of Oncology (Milan). All tissues were collected via standardized operative procedures approved by the Institutional Ethical Board (European Institute of Oncology), and informed consent was obtained for all tissue samples linked with clinical data. To derive primary ovarian surface epithelial (OSE) cells and fallopian tube epithelial (FTE) cells, healthy ovarian cortical tissues and fimbriae were incubated with dispase (5 U/mL, Stemcell Technologies) for 30 min at 37°C, 5% CO2; the surface epithelium was then scraped off with a surgical scalpel in order to obtain an organoid preparation. Red blood cells were eliminated by a 2 min treatment with ACK (ACK Cell Lysing Buffer, Lonza) at room temperature. High-grade serous epithelial ovarian cancer (EOC) cells were derived either from peritoneal fluid (ascites) or from tumor biopsies. Immediately after paracentesis, ascitic fluid was centrifuged to obtain a pellet containing organoids and single
cells. Red blood cells were eliminated by two 5 min treatments. Solid tumor biopsies were minced into small fragments with scissors, followed by enzymatic treatment with 100 U/mL hyaluronidase and 200 U/mL collagenase at 37°C in a humidified incubator with 5% CO2 for two hours. Enzymes were diluted in Ham's F12/DMEM 1:1 (Microtech and Lonza, respectively), supplemented with 2 mM L-glutamine, 100 U/mL penicillin and 100 μg/mL streptomycin. Tumor dissociation was monitored every half hour. Once a suspension of small tumor cell clusters and single cells was obtained, red blood cells were eliminated by ACK treatment. All primary epithelial cells were cultured on collagen I-coated plates (Collagen Cellware, Biocoat, Corning) in Ham's F12/DMEM 1:1, supplemented with 1% fetal bovine serum, 2 mM L-glutamine, 100 U/mL penicillin, 100 μg/mL streptomycin, 100 μg/mL gentamicin, 0.5 μg/mL amphotericin B, 10 μg/mL human transferrin, 1 μg/mL human insulin, 1 μg/mL hydrocortisone, 10 mM HEPES, 10 μM L-ascorbic acid, 15 nM sodium selenite, 0.1 mM ethanolamine, 50 ng/mL cholera toxin, 10 nM epidermal growth factor (EGF), 35 μg/mL bovine pituitary extract, 10 nM β-estradiol, and 10 nM triiodothyronine, in a humidified incubator with 5% CO2. All primary cell batches were cultured for a maximum of three passages after tissue digestion in order to preserve the biological features and the heterogeneity of the original tissue.

Cell lines

The human ovarian carcinoma (EOC) cell lines OVCAR3, IGROV1, HeyA8, and SKOV3, the human epithelial cervix carcinoma HeLa cells, the breast adenocarcinoma MCF7 cells, and the lung adenocarcinoma A549 cells were purchased from ATCC. COV362 and COV318 were purchased from Sigma. All cells were tested for mycoplasma with a PCR-based method every third week. Cells were maintained in a humidified incubator with 5% CO2.
IGROV-1 cells were cultured in RPMI 1640 (Lonza) supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/mL penicillin and 100 μg/mL streptomycin; SKOV3 cells were cultured in McCoy's 5A (Life Technologies) supplemented with 10% fetal bovine serum, 100 U/mL penicillin and 100 μg/mL streptomycin.
HeyA8, COV362, and COV318 cells were cultured in DMEM supplemented with 10% fetal bovine serum, 2 mM L-glutamine, 100 U/mL penicillin and 100 μg/mL streptomycin. OVCAR3 cells were cultured in RPMI 1640 medium supplemented with 20% fetal bovine serum, 10 μg/mL insulin, 100 U/mL penicillin and 100 μg/mL streptomycin. HeLa, A549, and MCF-7 cells were cultured in DMEM (Gibco, Invitrogen), supplemented with 10% fetal bovine serum, 100 U/mL penicillin (Invitrogen), and 100 μg/mL streptomycin (Invitrogen). All experiments were performed at 80% confluence.

Cell Lysis and Western blots

EOC cell lines were cultured in Petri dishes in complete medium and treated for 4 hours with chemical inhibitors at the indicated concentrations. Control cells were pre-incubated with DMSO alone. After treatment, cell extraction and immunoblotting were performed as described (Francavilla et al., 2013). Each experiment was repeated three times. The pellet of primary cells not used for mass spectrometry analysis was directly used for immunoblotting.

Transfection and RNA interference

OVCAR3 cells were transfected using Lipofectamine (Invitrogen), according to the manufacturer's instructions, and all the assays were performed 36 hours after transfection. Double-stranded, validated Stealth siRNA oligonucleotides targeting human CDK7 (sequences: 5′-UUAAGGUUUCCACUGGACAGUUUGG-3′; 5′-UCACACAUCAAAGCCUACAUGUUGA-3′) were purchased from Invitrogen. Cells were transfected with either of the two sequences in order to test for off-target effects. A Negative Control Med GC siRNA duplex was used as a negative control (Invitrogen, Cat. Num. 12935300). Silencing of gene expression was monitored by immunoblotting of cell lysates with antibodies against CDK7.

Cell Proliferation Assay


EOC cell lines were seeded in quadruplicate on 24-well plates at 2 × 10⁴ cells/well and treated for three days with chemical inhibitors replenished every 24 hours. Viable cells were counted using the Trypan blue exclusion method and the ratio to unstimulated cells at time 0 was determined for each time point, as previously described (Francavilla et al., 2013). Values represent the mean ± s.e.m. from four independent experiments.

Immunofluorescence

Primary cells (see Table S1) were grown on glass coverslips coated with collagen I from calf skin (Sigma), and then fixed with 4% PFA for 10 min at room temperature. When needed, cells were permeabilized for 3 minutes at 4°C with 0.5% Triton X-100 in PBS. After treatment for 30 min at room temperature with blocking solution (1% donkey serum and 0.2% BSA in PBS, 0.05% Tween20), primary antibodies diluted in blocking solution were added for 2 hours at room temperature. Anti-CD31, anti-CD45 and anti-WT1 primary antibodies were detected with a Cy3-conjugated donkey anti-mouse antibody; anti-CK5, anti-calretinin, anti-PAX8 and anti-CK7 were detected with a Cy3-conjugated donkey anti-rabbit antibody; and anti-FAP was detected with an Alexa Fluor 488-conjugated anti-rat antibody. All secondary antibodies were incubated for 1 hour at room temperature in blocking solution, and nuclei were counterstained with DAPI. Coverslips were then mounted with Mowiol. Images were acquired with an Olympus BX63 microscope, 20x objective, and processed with the software Fiji. The EVOS fl Digital Inverted Fluorescence Microscope was used for the bright-field images of primary cells shown in Figure S1A. The experiments were repeated at least twice and representative images are shown.

Immunohistochemistry (IHC) staining

The immunohistochemical analysis of CDK7 and phospho-CDK7 expression was carried out on 4 normal ovaries, 4 normal FTE, and 4 EOC (Table S1). Immunostaining was performed on 3-μm sections from formalin-fixed, paraffin-embedded tissue samples. Tissue samples were
deparaffinized, rehydrated, and pretreated for antigen retrieval by incubating the slides in the appropriate buffer at 95°C for 50 min. EDTA pH 8 was used for CDK7 and phospho-CDK7. This was followed by the inactivation of endogenous peroxidases with 3% H2O2 for 5 min at room temperature. For the CDK7 staining, endogenous peroxidases were inactivated by 0.03% H2O2 in methanol for 20 min at room temperature. Primary antibodies were added for 2 hours at room temperature or, for anti-CDK7, overnight at 4°C. Dako EnVision+ System-HRP Labelled Polymer was used for detection, in combination with Dako chromogen substrate (Liquid DAB+ Substrate Chromogen System). Sections were counterstained with hematoxylin. Pictures of stained sections were acquired with the scanner Aperio ScanScope XT, 20x objective. The antigen expression in whole tissue sections was analyzed and scored by a trained pathologist (GB).

Tissue Microarray (TMA)

Tissue microarray (TMA) analysis was carried out as previously described (Zecchini et al., 2008). Clinicopathological and follow-up data of ovarian cancer patients operated on at the European Institute of Oncology (Milan, Italy) from 1995 to 2004 were used to select the cases that were included in the study. Inclusion criteria were: a) first surgery at the European Institute of Oncology; b) no neoadjuvant treatment; c) diagnosis of serous ovarian carcinoma. We selected 15 OSE, 18 fimbriae, 84 serous adenocarcinomas and 34 metastases. The TMA was constructed by using the tissue arrayer MiniCore 3 (Alphelys, Plaisir, France). IHC on the TMA was carried out with the anti-PHGDH and anti-KCTD12 antibodies. TMAs were deparaffinized, rehydrated, and pretreated in Tris-EDTA buffer pH 9 at 95°C for 50 min for antigen retrieval, followed by incubation with the primary antibodies at room temperature for 2 hours. As detection system, the Dako EnVision+ System-HRP Labelled Polymer was used in combination with Dako chromogen substrate (Liquid DAB+ Substrate Chromogen System).
Automated immunohistochemistry of the TMA was performed with a Leica BOND-MAX. Any detectable level of PHGDH or KCTD12 signal was scored as positive
according to an arbitrary scale from 1 to 3. Pictures of stained TMAs were acquired with the scanner Aperio ScanScope XT, 20x objective. The antigen expression in TMAs was analyzed and scored by a trained pathologist (GB).

Sample preparation for Mass-Spectrometry

Primary Cells

We analyzed five FTE, four OSE and four EOC from individual patients (see Table S1). Three of the four EOC were separated into two distinct samples and analyzed as technical replicates (see Figure S2). The pellet of primary cells was dissolved in denaturation buffer (6 M urea, 2 M thiourea in 10 mM HEPES pH 8) and the protein concentration was determined using the Bradford reagent (Bio-Rad, Hercules, CA). We obtained about 700 μg of proteins from each sample. Cysteines were reduced with 1 mM dithiothreitol (DTT) and alkylated with 5.5 mM chloroacetamide (CAA). Proteins were digested with endoproteinase Lys-C (Wako, Osaka, Japan) and sequencing-grade modified trypsin (Sigma) after four-fold dilution in deionized water. Protease activity was quenched by the addition of trifluoroacetic acid (TFA) to a final concentration of 1%. Precipitates were removed by centrifugation for 10 min at 3,000 g. Peptides were purified using reversed-phase Sep-Pak C18 cartridges (Waters, Milford, MA) and eluted with 50% acetonitrile (ACN). A small amount of the eluted peptides (1%) was taken for proteome analysis: after evaporation in a speed vacuum, 40 μL of 0.1% TFA, 5% ACN were added, followed by MS analysis. The remaining peptides were used for the analysis of the phosphoproteome as previously described (Jersie-Christensen et al., 2016). Briefly, 6 mL of 12% TFA in ACN was added to the eluted peptides, which were subsequently enriched with TiO2 beads (5 μm, GL Sciences Inc., Tokyo, Japan). The beads were suspended in 20 mg/mL 2,5-dihydroxybenzoic acid (DHB), 80% ACN, and 6% TFA, and the samples were incubated at a sample-to-bead ratio of 1:2 (w/w) in batch mode for 15 min with rotation.
After 5 min of centrifugation, the supernatants were collected and incubated a second
time with a two-fold dilution of the previous bead suspension. The beads were washed with 10% ACN, 6% TFA followed by 40% ACN, 6% TFA, collected on C8 STAGE-tips, and finally washed with 80% ACN, 6% TFA. Phosphorylated peptides were eluted with 20 μL of 5% NH3 followed by 20 μL of 10% NH3 in 25% ACN, and the eluates were evaporated to a final volume of 5 μL in a speed vacuum. The concentrated phosphorylated peptides were acidified by the addition of 20 μL of 0.1% TFA, 5% ACN and loaded on C18 STAGE-tips. Peptides were eluted from the STAGE-tips with 20 μL of 40% ACN followed by 10 μL of 60% ACN, reduced to 5 μL in a speed vacuum, and supplemented with 5 μL of 0.1% TFA, 5% ACN.

EOC cell line OVCAR3

We analyzed triplicates of label-free lysates from OVCAR3 cells treated with DMSO as control, with 300 nM THZ1, or with 20 nM U0126. We followed the same procedure described above to investigate both the proteome (not shown) and the phosphoproteome (Table S6) of OVCAR3. We analyzed 10 mg of proteins from each sample.

Mass-Spectrometry analysis

Primary Cells

Peptide mixtures were analyzed using an EASY-nLC system (Proxeon, Odense, Denmark) connected to a Q Exactive mass spectrometer (Thermo Fisher Scientific, Bremen, Germany) through a nanoelectrospray ion source. Peptides were separated in a 15 cm analytical column (75 µm inner diameter) packed in-house with 1.9 µm reversed-phase C18 beads (Reprosil-Pur AQ, Dr. Maisch, Ammerbuch, Germany) with a 90 min gradient from 6% to 80% ACN in 0.5% acetic acid at a flow rate of 250 nl/min. Standard mass spectrometric parameters were as follows: spray voltage, 2 kV; no sheath and auxiliary gas flow; heated capillary temperature, 275°C; S-lens RF level, 50%. The Q Exactive was operated in data-dependent acquisition mode using the “sensitive scanning method”, as described (Kelstrup et al., 2012). Full-scan MS spectra (m/z 300-1,750,
resolution 70,000 at m/z 200) were detected in the Orbitrap analyzer after accumulation of ions at a 1e6 target value based on predictive AGC from the previous scan. For every full scan, the 10 most intense ions were isolated and fragmented (collision energy: 25%) by higher-energy collisional dissociation (HCD) with a fixed injection/fill time of 120 ms and a resolution of 35,000. Finally, the dynamic exclusion was set to 30 seconds.

OVCAR3

Samples enriched for phosphorylated peptides were separated on a 50 cm capillary column packed in-house with 1.9 μm Reprosil-Pur C18 beads using an EASY-nLC 1000 system (Thermo Scientific, Odense, Denmark). The column temperature was maintained at 50°C using an integrated column oven (PRSO-V1, Sonation GmbH, Biberach, Germany). The flow rate of the gradient was 250 nl/min. The gradient started at 5% buffer B (80% ACN, 0.1% formic acid) and went to 25% buffer B in 220 min, followed by a 30 min step to 60% buffer B, a 5 min wash, a return to 5% in 5 min, and re-equilibration for 5 min. The Q Exactive HF instrument (Thermo Scientific, Bremen, Germany) was run in a data-dependent top 10 mode with the following settings. The spray voltage was set to 2 kV, the S-lens RF level to 50, and the heated capillary to 275°C. The full-scan resolution was set to 120,000 at m/z 200 and the scan target was 3 × 10⁶ with a maximum fill time of 20 ms. The mass range was set to 300–1750 and the dynamic exclusion to 20 s. The target value for fragment scans was set to 2 × 10⁵ with a resolution of 60,000, a maximum fill time of 108 ms, and an underfill ratio of 4%. The normalized collision energy was set to 28. For proteome samples, the mass spectrometer was run with a top 12 method with a resolution for fragment scans of 30,000 and a maximum fill time of 45 ms.

Raw files analysis

Raw data were analyzed with the MaxQuant software suite (Cox and Mann, 2008), version 1.4.1.4, using the integrated Andromeda search engine (Cox et al., 2011).
Proteins were identified by searching the HCD-MS/MS peak lists against a target/decoy version of the human UniProt database that consisted of the complete proteome sets and isoforms (v 3.37), supplemented with commonly observed contaminants such as porcine trypsin and bovine serum proteins. Tandem mass spectra were initially matched with a mass tolerance of 7 ppm on precursor masses and 0.02 Da or 20 ppm for fragment ions. Cysteine carbamidomethylation was searched as a fixed modification. Protein N-acetylation, N-pyro-glutamine, oxidized methionine, and phosphorylation of serine, threonine, and tyrosine were searched as variable modifications. Protein N-acetylation, oxidized methionine, and deamidation of asparagine and glutamine were searched as variable modifications for the proteome experiments. Label-free parameters were used as described (Cox et al., 2014). False discovery rate was set to 0.01 for peptides, proteins, and modification sites. Minimal peptide length was six amino acids. Site localization probabilities were calculated by MaxQuant using the PTM scoring algorithm (Olsen et al., 2006). The datasets were filtered by posterior error probability to achieve a false discovery rate below 1% for peptides, proteins, and modification sites. Only peptides with an Andromeda score >40 were included.

Data analysis
The samples were grouped in three categories representing EOC, FTE, and OSE. For the analysis of phosphorylated peptides, only peptides with a site localization probability of at least 0.75 (class I, shown in Tables S3 and S8) (Olsen et al., 2006) were included in the bioinformatics analyses. For both the proteome and the phosphoproteome analysis of primary cells, we considered 8 out of 13 values, which were log transformed and normalized using the function "normalizeQuantiles" included in the LIMMA package of Bioconductor in R.
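As an illustration of the filtering and normalization steps described above, the following is a minimal Python sketch, not the analysis code used in this study (which relied on "normalizeQuantiles" from LIMMA in R). It assumes that "8 out of 13 values" means a feature is kept when at least 8 of its 13 samples have a measured intensity. The function names are hypothetical, and the quantile normalization shown is the textbook version for a complete matrix; it does not reproduce how LIMMA treats missing values or ties.

```python
import math


def filter_min_valid(matrix, min_valid):
    """Keep rows (features) with at least `min_valid` non-missing values.

    Missing values are represented as None (an assumption of this sketch).
    """
    return [row for row in matrix if sum(v is not None for v in row) >= min_valid]


def log2_transform(matrix):
    """Log2-transform intensities; missing or non-positive values become None."""
    return [
        [math.log2(v) if v is not None and v > 0 else None for v in row]
        for row in matrix
    ]


def quantile_normalize(matrix):
    """Quantile-normalize the columns (samples) of a complete matrix.

    Each value is replaced by the mean, across samples, of the values that
    share its rank. Rows are features and columns are samples.
    """
    if any(v is None for row in matrix for v in row):
        raise ValueError("this sketch only handles complete matrices")
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]
    sorted_cols = [sorted(col) for col in columns]
    # Mean of the i-th smallest value across all samples.
    rank_means = [sum(sc[i] for sc in sorted_cols) / n_cols for i in range(n_rows)]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = rank_means[rank]
    return result


if __name__ == "__main__":
    # Toy log-scale values for 4 features in 3 samples (illustrative only).
    toy = [[5.0, 4.0, 3.0], [2.0, 1.0, 4.0], [3.0, 3.0, 6.0], [4.0, 2.0, 8.0]]
    for row in quantile_normalize(toy):
        print(row)
```

After normalization every sample has the same distribution of values, which is what makes intensities comparable across samples before differential testing.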
The LIMMA package from Bioconductor (Wettenhall and Smyth, 2004; http://www.ncbi.nlm.nih.gov/pubmed/16646809) was chosen to detect significant changes in abundance between the three groups because it accounts for different variability between samples and also between proteins. All values were centered to the mean. The samples were analyzed as pairs and we considered changes in abundance significant if observed in at least two of the three pairs of samples based on a false discovery rate of < 0.05. We represented the modified peptides/proteins whose values differed from the mean of all samples. For the analysis of the OVCAR3 phosphoproteome, we calculated the difference in phosphorylated peptide intensity between treated and untreated controls and considered a difference of more than 2 or less than -2 significant after log transformation of the data.

Hierarchical and unsupervised clustering and the analysis of kinase motifs were performed using the Perseus software (Tyanova et al., 2016). For 3D-PCA, performed using the R package "pca.3d", missing values were not considered. For identification of potential kinase motifs, the sequence windows of the regulated phosphorylation sites were compared to those of the non-regulated sites using the iceLogo web resource with default parameters (Colaert et al., 2009). KEGG and GO term analyses were performed using DAVID (Huang da et al., 2009). The default parameters were used, with a Benjamini-Hochberg p-value adjustment threshold of 0.05. Significantly over-represented GO terms within the data were represented in bar plots.

All the protein interaction networks were obtained using the STRING v9.1 protein interaction database (Franceschini et al., 2013). To ensure high confidence, interactions derived from the Experiments and Databases evidence channels were retrieved and a confidence score above 0.8 was required. To identify hyperconnected proteins in Figure S7, the DAPPLE (Disease Association Protein-Protein Link Evaluator) tool was used (Rossin et al., 2011). The DAPPLE algorithm builds interaction networks among the proteins of interest and computes the statistical significance of their connectivity. Data visualization was performed using the software Cytoscape (version 3.1 or 3.2.0). Genes reported to be significantly mutated in cancer were retrieved from Lawrence et al. (2014).
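The Benjamini-Hochberg adjustment applied at a threshold of 0.05 in the KEGG and GO term analyses above can be illustrated with a short sketch. This is the standard step-up procedure written in plain Python, not the code used by DAVID; the function name and the example p-values are hypothetical and serve only to show the calculation.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    The adjusted value for the p-value of rank r (1 = smallest) among m tests
    is min over all ranks k >= r of p_(k) * m / k, capped at 1.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Made-up p-values for four enrichment terms (illustrative only).
    raw = [0.01, 0.04, 0.03, 0.5]
    adjusted = benjamini_hochberg(raw)
    significant = [q < 0.05 for q in adjusted]
    print(adjusted)
    print(significant)
```

In this toy case only the first term passes the 0.05 threshold after adjustment, although three of the four raw p-values are below 0.05.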
The analysis of splicing components was done as previously reported (Papasaikas et al., 2015). The relation of genes to other diseases was based on the database DISEASES (Pletscher-Frankild et al., 2015) (Figure S5). Genes mutated in cancer were taken from the COSMIC database and we considered only ovarian and breast cancers (http://cancer.sanger.ac.uk/cosmic). For the analysis of the proteins with splicing variants, the amino acid sequence FASTA file of the human UniProt database that consisted of the complete proteome sets and isoforms (v 3.37) was used as a reference database and the enzymatic digestion of all proteins with alternative variants was calculated (see Perl Script in Table S7). The produced peptides found to be protein specific were further annotated as variant specific or variant non-specific. Next, a Fisher's exact test was performed to determine statistical significance between peptides belonging to proteins with splicing variants whose phosphorylated sites were regulated in EOC compared to healthy tissues.

Statistics
All experiments were performed at least in triplicate. All the MS experiments were performed with n > 3 to ensure adequate power for the analysis. The data were normalized before further analysis. P values were calculated by Student's two-tailed t-test, Wilcoxon test, or Fisher's exact test, as indicated. A difference was considered statistically significant when the p value was < 0.05 or < 0.001, as reported in the figure legends.
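For readers unfamiliar with Fisher's exact test, which is used above for the splicing-variant analysis, the following Python sketch computes a two-sided p-value for a 2x2 contingency table from the hypergeometric distribution. It is an illustration only and not the code used in this study. The table layout (variant-specific versus variant non-specific peptides, regulated versus non-regulated sites) is an assumption made for the example, and the counts are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    With row and column totals fixed, the count in the top-left cell follows
    a hypergeometric distribution. The p-value sums the probabilities of all
    tables that are no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denominator = comb(total, col1)

    def prob(x):
        # Probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denominator

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


if __name__ == "__main__":
    # Invented counts (illustrative only):
    # rows: variant-specific / variant non-specific peptides
    # columns: regulated / non-regulated phosphorylation sites
    print(fisher_exact_two_sided(8, 2, 1, 5))
```

The exact test is preferred over a chi-square approximation when some cells of the table contain small counts, as is common when only a few peptides are variant specific.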

SUPPLEMENTAL REFERENCES
Colaert, N., Helsens, K., Martens, L., Vandekerckhove, J., and Gevaert, K. (2009). Improved visualization of protein consensus sequences by iceLogo. Nature methods 6, 786-787.
Cox, J., Hein, M.Y., Luber, C.A., Paron, I., Nagaraj, N., and Mann, M. (2014). Accurate proteome-wide label-free quantification by delayed normalization and maximal peptide ratio extraction, termed MaxLFQ. Molecular & cellular proteomics : MCP 13, 2513-2526.

Cox, J., and Mann, M. (2008). MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nature biotechnology 26, 1367-1372.
Cox, J., Neuhauser, N., Michalski, A., Scheltema, R.A., Olsen, J.V., and Mann, M. (2011). Andromeda: a peptide search engine integrated into the MaxQuant environment. Journal of proteome research 10, 1794-1805.
Davidson, B., and Trope, C.G. (2014). Ovarian cancer: diagnostic, biological and prognostic aspects. Womens Health (Lond) 10, 519-533.
Davidson, B., Trope, C.G., and Reich, R. (2012). Epithelial-mesenchymal transition in ovarian carcinoma. Frontiers in oncology 2, 33.
Francavilla, C., Rigbolt, K.T., Emdal, K.B., Carraro, G., Vernet, E., Bekker-Jensen, D.B., Streicher, W., Wikstrom, M., Sundstrom, M., Bellusci, S., et al. (2013). Functional proteomics defines the molecular switch underlying FGF receptor trafficking and cellular outputs. Molecular cell 51, 707-722.
Franceschini, A., Szklarczyk, D., Frankild, S., Kuhn, M., Simonovic, M., Roth, A., Lin, J., Minguez, P., Bork, P., von Mering, C., et al. (2013). STRING v9.1: protein-protein interaction networks, with increased coverage and integration. Nucleic acids research 41, D808-815.
Huang da, W., Sherman, B.T., and Lempicki, R.A. (2009). Systematic and integrative analysis of large gene lists using DAVID bioinformatics resources. Nature protocols 4, 44-57.
Jersie-Christensen, R.R., Sultan, A., and Olsen, J.V. (2016). Simple and Reproducible Sample Preparation for Single-Shot Phosphoproteomics with High Sensitivity. Methods Mol Biol 1355, 251-260.

Kelstrup, C.D., Young, C., Lavallee, R., Nielsen, M.L., and Olsen, J.V. (2012). Optimized fast and sensitive acquisition methods for shotgun proteomics on a quadrupole orbitrap mass spectrometer. Journal of proteome research 11, 3487-3497.
Kwiatkowski, N., Zhang, T., Rahl, P.B., Abraham, B.J., Reddy, J., Ficarro, S.B., Dastur, A., Amzallag, A., Ramaswamy, S., Tesar, B., et al. (2014). Targeting transcription regulation in cancer with a covalent CDK7 inhibitor. Nature 511, 616-620.
Lawrence, M.S., Stojanov, P., Mermel, C.H., Robinson, J.T., Garraway, L.A., Golub, T.R., Meyerson, M., Gabriel, S.B., Lander, E.S., and Getz, G. (2014). Discovery and saturation analysis of cancer genes across 21 tumour types. Nature 505, 495-501.
Olsen, J.V., Blagoev, B., Gnad, F., Macek, B., Kumar, C., Mortensen, P., and Mann, M. (2006). Global, in vivo, and site-specific phosphorylation dynamics in signaling networks. Cell 127, 635-648.
Papasaikas, P., Rao, A., Huggins, P., Valcarcel, J., and Lopez, A. (2015). Reconstruction of composite regulator-target splicing networks from high-throughput transcriptome data. BMC genomics 16 Suppl 10, S7.
Pletscher-Frankild, S., Palleja, A., Tsafou, K., Binder, J.X., and Jensen, L.J. (2015). DISEASES: text mining and data integration of disease-gene associations. Methods 74, 83-89.
Rossin, E.J., Lage, K., Raychaudhuri, S., Xavier, R.J., Tatar, D., Benita, Y., Cotsapas, C., and Daly, M.J. (2011). Proteins encoded in genomic regions associated with immune-mediated disease physically interact and suggest underlying biology. PLoS genetics 7, e1001273.
Tyanova, S., Temu, T., Sinitcyn, P., Carlson, A., Hein, M.Y., Geiger, T., Mann, M., and Cox, J. (2016). The Perseus computational platform for comprehensive analysis of (prote)omics data. Nature methods.

Wettenhall, J.M., and Smyth, G.K. (2004). limmaGUI: a graphical user interface for linear modeling of microarray data. Bioinformatics 20, 3705-3706.
Zecchini, S., Bianchi, M., Colombo, N., Fasani, R., Goisis, G., Casadio, C., Viale, G., Liu, J., Herlyn, M., Godwin, A.K., et al. (2008). The differential role of L1 in ovarian carcinoma and normal ovarian surface epithelium. Cancer research 68, 1110-1118.
Zhang, H., Liu, T., Zhang, Z., Payne, S.H., Zhang, B., McDermott, J.E., Zhou, J.Y., Petyuk, V.A., Chen, L., Ray, D., et al. (2016). Integrated Proteogenomic Characterization of Human High-Grade Serous Ovarian Cancer. Cell.
