PNAS PLUS

Bioinformatic identification of genes suppressing genome instability

Christopher D. Putnam (a,b), Stephanie R. Allen-Soltero (a,c), Sandra L. Martinez (a), Jason E. Chan (a,b), Tikvah K. Hayes (a,1), and Richard D. Kolodner (a,b,c,d,e,2)

(a) Ludwig Institute for Cancer Research, Departments of (b) Medicine and (c) Cellular and Molecular Medicine, (d) Moores-University of California at San Diego Cancer Center, and (e) Institute of Genomic Medicine, University of California School of Medicine at San Diego, La Jolla, CA 92093

Unbiased forward genetic screens for mutations causing increased gross chromosomal rearrangement (GCR) rates in Saccharomyces cerevisiae are hampered by the difficulty in reliably using qualitative GCR assays to detect mutants with small but significantly increased GCR rates. We therefore developed a bioinformatic procedure using genome-wide functional genomics screens to identify and prioritize candidate GCR-suppressing genes on the basis of drug sensitivity profiles and genetic interactions shared with known GCR suppressors. The number of known suppressors was increased from 75 to 110 by testing 87 predicted genes, which identified unanticipated pathways in this process. This analysis explicitly dealt with the lack of concordance among high-throughput datasets to increase the reliability of phenotypic predictions. Additionally, shared phenotypes in one assay were imperfect predictors for shared phenotypes in other assays, indicating that although genome-wide datasets can be useful in aggregate, caution and validation methods are required when deciphering biological functions via surrogate measures, including growth-based genetic interactions.

DNA damage | DNA repair | systems biology

Genetic instability is a characteristic of most cancers (1) that may play a critical role in driving the accumulation of genetic changes that underlie tumorigenesis (2). A number of observations are consistent with this view, including the following: a number of cancer predisposition syndromes have been identified that are associated with inherited defects in genes involved in suppressing genome instability, and inactivation of some of these genes has been observed in sporadic cancers (3, 4); p53, which promotes cell cycle arrest or apoptosis in response to DNA damage, is inactivated in roughly 50% of human cancers, and p53 defects allow cells to tolerate the accumulation of genome rearrangements (5); and genomic instability has been observed to precede the transition to the carcinogenic state or to be associated with the development of cancers in mouse model systems (6). The investigation of model systems in the study of genome instability has the potential to identify and understand novel genes and pathways relevant to human cancer.

A genetic assay developed in the yeast Saccharomyces cerevisiae has been used to identify genes and pathways that suppress gross chromosomal rearrangements (GCRs) mediated by single-copy DNA sequences (7). In this assay, selection against two genetic markers, CAN1 and URA3, placed on a nonessential end of the left arm of chromosome V selects for the loss of these two genes that results as a consequence of the formation of GCRs that delete the left arm of chromosome V. The types of GCRs that have been observed with this assay include terminal deletions healed by de novo telomere addition, translocations, isoduplications and other types of dicentric translocation chromosomes, interstitial deletions, circular chromosomes, and complex GCRs resulting from multiple cycles of rearrangement, usually as a result of the formation of unstable dicentric translocations (8–11).
Using this assay, oxidative defense pathways, the replication machinery, DNA repair pathways, cell cycle checkpoint pathways, telomere maintenance pathways, and chromatin modification and assembly pathways have been shown to function

www.pnas.org/cgi/doi/10.1073/pnas.1216733109

in concert to prevent genome rearrangements (reviewed in 12). Modifications of the original GCR assay demonstrated that suppression of GCRs mediated by segmental duplications and Ty elements involves additional genes and pathways that do not suppress single-copy sequence-mediated GCRs (13–15). Interestingly, homologs of some GCR-suppressing genes and pathways suppress the development of cancer in mammals (16). Most of the genes that suppress GCRs have been identified through a candidate gene approach. Some studies have screened collections of arrayed S. cerevisiae mutants for mutations that cause increased GCR rates and have identified additional genes of interest (17–20), although the mutations identified in each screen only had a small overlap with each other. Consequently, it is probable that not all the genes and pathways that suppress GCRs have been identified.

The promise of genome-wide protein–protein interaction, genetic interaction, and drug sensitivity datasets developed using S. cerevisiae is that these data can be used for predicting gene and gene product functions (e.g., ref. 21). Despite the fact that these datasets contain useful information, high-throughput methods are prone to both false-positive and false-negative errors. Consequently, different datasets generated using similar approaches to screen the same mutant collection show a substantial lack of concordance (22). Here, we show that combining these types of data identified additional genes involved in suppressing genome instability based on the hypothesis that these additional genes will share aspects of their phenotypes with known genes. Using these data, we have generated a set of 1,041 gene deletion mutations that have genetic interactions and drug sensitivity profiles matching those mutations known to affect the rate of accumulating GCRs; 787 of them are characterized by dense genetic interactions, and the remaining 254 have limited genetic interactions.
To validate this approach, we investigated a subset of the predicted genes and found that deletions of 35 of the 87 genes selected for analysis from clusters containing known GCR-suppressing genes increased the rate of accumulating GCRs, which represents a 200-fold higher efficiency for identifying new GCR-suppressing genes compared with that seen in genome-wide screens. This experimental validation identified genes that had not been previously implicated in suppressing GCRs and demonstrated that components of the nuclear pore, the proteasome, and the morphogenesis and septin checkpoint, as well as proper control of the anaphase-promoting complex/cyclosome (APC/C), play roles in suppressing GCRs. Thus, the resulting gene lists are enriched for

Author contributions: C.D.P. and R.D.K. designed research; C.D.P., S.R.A.-S., S.L.M., J.E.C., and T.K.H. performed research; C.D.P. contributed new reagents/analytic tools; C.D.P. and R.D.K. analyzed data; and C.D.P. and R.D.K. wrote the paper.

The authors declare no conflict of interest.

1 Present address: Curriculum in Genetics and Molecular Biology and Lineberger Comprehensive Cancer Center, University of North Carolina, Chapel Hill, NC 27599.

2 To whom correspondence should be addressed. E-mail: [email protected].

See Author Summary on page 19055 (volume 109, number 47). This article contains supporting information online at www.pnas.org/lookup/suppl/doi:10.1073/pnas.1216733109/-/DCSupplemental.

PNAS | Published online November 5, 2012 | E3251–E3259

GENETICS

Contributed by Richard D. Kolodner, September 28, 2012 (sent for review August 25, 2011)

genes that function to suppress genome instability. Importantly, our results indicate that analysis of DNA damaging agent sensitivity and growth-based genetic interaction patterns was an imperfect predictor of genes that suppress GCRs, which has important implications for attempts to reconstruct pathways by computationally combining data from systematic genetic and physical interaction studies.

Results

Bioinformatic Identification of Candidate Genome Stability Genes.

Genes identified as suppressing genome rearrangements. To identify candidate genes that suppress GCRs (Fig. 1), we first analyzed over 700 published GCR rates of strains with single or multiple mutations. This identified 75 mutations that increased GCR rates by fivefold or more as single mutants and/or caused synergistic increases in rate in combination with other mutations (SI Appendix, Table S1 and Dataset S1) and 40 mutations that did not increase GCR rates (SI Appendix, Table S2). The analysis considered the effect of all pair-wise interactions; for example, the GCR rate of the mre11 lig4 tlc1 triple-mutant strain was compared with the GCR rates of the pairs of strains mre11 and lig4 tlc1, lig4 and mre11 tlc1, and tlc1 and mre11 lig4. Interestingly, many of the mutations that increased the GCR rates also increased GCR rates synergistically in combination with other mutations, whereas many of the mutations tested that did not increase GCR rates suppressed the increased GCR rates caused by other mutations.

Genes identified as suppressing sensitivity to DNA damaging agents.

Mutations in many of the 75 known GCR-suppressing genes caused increased sensitivity to DNA damaging agents (SI Appendix, Table S1). Therefore, we analyzed the results of 155 screens of the S. cerevisiae gene deletion collection against DNA damaging agents (SI Appendix, Table S3). Combined, 4,414 mutations (affecting over 90% of the nonessential genes in the S. cerevisiae genome) were reported to cause some level of increased sensitivity in at least one screen; this number was reduced to 4,143 mutations by treating deletions of dubious ORFs that overlapped verified genes as alleles of the verified genes (Fig. 2A and Dataset S2). The large number of reported mutations causing sensitivity to DNA damaging agents reflected the low reproducibility of different screens of the same damaging agents (Fig. 2E); hierarchical agglomerative clustering analysis grouped screens by laboratory rather than by DNA damaging agent, indicative of “batch effects” in these high-throughput datasets (22). Regardless, the most commonly identified mutations affected genes known to be involved in DNA repair (Fig. 2B). For example, mms4Δ, rad5Δ, mus81Δ, rad59Δ, and rad10Δ were identified in 119, 118, 106, 96, and 91 screens, respectively (Dataset S2). Over 60% of all mutations were identified in 5 or fewer screens, and 16% were observed in only 1 screen. Using random computer simulations (Materials and Methods), we calculated pnhit P values, the statistical significance of identifying a gene n times, and found that mutations identified eight times (n = 8) were significant (pnhit < 0.01).

We also analyzed mutations identified a statistically significant number of times (pnhit < 0.01) that caused sensitivity to specific DNA damaging treatments using the program GOstat (23) to identify statistically significant gene ontology terms (24). This analysis primarily identified terms related to DNA repair, DNA damage signaling, chromatin, and chromosome organization and biogenesis (Dataset S3). Some unexpected pathways were also identified: ubiquitin-dependent protein catabolism of the multivesicular body pathway [2-dimethylaminoethyl chloride (DMAEC), hydroxyurea (HU), mitomycin c, and tirapazamine]; osmotic stress (cisplatin and mitomycin c); vesicle-mediated transport (bleomycin, HU, and oxaliplatin); peroxisome function [methylmethane sulfonate (MMS)]; and secretory pathways, membrane invagination, and glycoprotein biosynthesis (bleomycin). In contrast, genes associated with UV light and ionizing radiation (IR) resistance were predominantly associated with DNA repair, damage signaling, and chromatin remodeling. One implication of these results is that some pathways involved in resistance to chronic drug treatments but not UV or IR treatment might function by means of drug detoxification, drug export, or amelioration of damage to cellular components other than DNA. Because we were interested in common DNA damage responses, we developed a statistical test to identify mutations with biased distributions to screens of specific treatments and specific laboratories (Materials and Methods).
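The random-simulation idea behind pnhit can be pictured with a small Monte Carlo sketch. This is not the authors' procedure, which is described in Materials and Methods and is not reproduced in this section. It assumes, purely for illustration, that under the null hypothesis every screen draws its reported number of hits uniformly at random from the deletion collection. The function name `simulate_pnhit` and the toy screen sizes are hypothetical.

```python
import random
from collections import Counter


def simulate_pnhit(screen_sizes, n_genes, n_hits, trials=200, seed=0):
    """Estimate P(a gene is hit >= n_hits times) when every screen picks its hits at random.

    Assumption (not from the paper): each screen samples `size` distinct genes
    uniformly from a collection of `n_genes` mutants.
    """
    if n_hits < 1:
        raise ValueError("n_hits must be at least 1")
    rng = random.Random(seed)
    at_least = 0
    for _ in range(trials):
        counts = Counter()
        for size in screen_sizes:
            # One simulated screen: `size` distinct mutants chosen at random.
            counts.update(rng.sample(range(n_genes), size))
        at_least += sum(1 for c in counts.values() if c >= n_hits)
    # Fraction of (gene, trial) pairs that reached the hit count by chance.
    return at_least / (trials * n_genes)


# Toy example (hypothetical numbers): 20 screens, each reporting 50 of 1,000 mutants.
p_once = simulate_pnhit([50] * 20, n_genes=1000, n_hits=1)
p_five = simulate_pnhit([50] * 20, n_genes=1000, n_hits=5)
print(p_once, p_five)
```

In this toy setting a gene is hit about once on average, so five or more hits are rare by chance. The smallest n for which the estimate falls below 0.01 plays the role of the significance threshold.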
We applied this test to all 4,143 mutations, which reduced the number to 1,446 mutations. Most mutations eliminated were observed in four or fewer screens (Fig. 2C). In addition, the test eliminated frequently observed mutations that were specific to a particular laboratory, such as yll032cΔ, rpl15bΔ, gal1Δ, and tma46Δ, which were observed 52, 50, 49, and 49 times, respectively, in a single laboratory, or were specific to a particular damaging agent, including hxk2Δ, ybr242wΔ, ald6Δ, atg12Δ, and ylr064wΔ, which were observed 10, 9, 9, 8, and 8 times, respectively, almost exclusively in cisplatin sensitivity screens. Although the eliminated mutations had no obvious role in the DNA damage response, we tested 45 of these mutations, including the laboratory-specific examples cited above, for their effect on chronic exposure to HU, MMS, 4-nitroquinoline 1-oxide (4NQO), and/or

Fig. 1. Schematic of the bioinformatic scheme to enrich for genome stability genes. (A) The number of genes identified at each step is indicated. Venn diagrams contain gene counts and indicate merging steps. (B) Breakdown of genes that suppress GCRs and genes that have no effect on GCRs, according to whether these genes were present in the list of genes suppressing GCRs, the list of genes suppressing sensitivity to DNA damaging agents, or both, or were from the list of 10 related genes. Dark bars indicate genes whose roles in GCRs were tested here, and white bars indicate genes whose GCR status was previously known.
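The merging steps in the schematic follow inclusion-exclusion, so the list sizes reported in the Results text can be checked against one another. The short sketch below does only that arithmetic; the helper name `merged_size` is hypothetical, and the counts are those stated in the text rather than read from the figure.

```python
def merged_size(size_a, size_b, shared):
    """Size of the union of two gene lists that have `shared` genes in common."""
    return size_a + size_b - shared


# Counts as reported in the Results text.
gcr_merged = merged_size(75, 227, 44)        # starting GCR genes + GCR-congruent genes
drug_merged = merged_size(928, 148, 114)     # sensitivity genes + drug-congruent genes
combined = merged_size(gcr_merged, drug_merged, 189)
final = combined + 10                        # the 10 related genes added separately

print(gcr_merged, drug_merged, combined, final)  # 258 962 1031 1041

# The three regions of the final Venn diagram add up to the same total.
assert 69 + 189 + 773 == combined
```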

[Fig. 2 graphic not reproduced. Panels A–D are histograms of the number of mutations versus the number of screens: all mutations (n = 4,143), DNA repair mutations, common mutations (n = 1,743), and DNA damaging agent mutations (n = 928). Panel E is the summary table below.]

Fig. 2E. Number of screens per treatment and number of mutations found in any screen, in a statistically significant number of screens (p < 0.01), and in all screens.

Treatment          Screens   In any   p < 0.01   In all
Cisplatin               33     1743        708        0
Mechlorethamine         22     1398        370        2
MMS                     12     2053        274        0
HU                      11     1350        265        0
Mitomycin c             11     2206        671        9
Camptothecin             8      485        100        0
Psoralen                 7      718         61        0
Bleomycin                6      736        446       23
Carboplatin              6      570         62        0
Angelicin                5      570        113        1
Oxaliplatin              5      656        235        6
4NQO                     4      954        184       30
Melphalan                4      434         39        0
Streptozotocin           4      425        115       11
UVC                      3      421         41        8
DMAEC                    2      307        132        0
Doxorubicin              2      229         37       37
IR                       2      165        165       19
other                    8      590       n.a.        0
All                    155     4143        928        0

camptothecin. Forty-four of the 45 mutations caused no drug sensitivity (P < 0.0001, hypergeometric test), whereas yll032cΔ caused weak MMS sensitivity. Retaining only those mutations identified in a significant number of screens (pnhit < 0.01) from the 1,446 mutations resulted in 928 mutations, which included 44 of 75 mutations increasing GCR rates (Fig. 2D and Dataset S1).

Genes identified by genetic congruence. To find genes with related functions that had been missed, we scored all the genes in the genome on the basis of their genetic similarity or “congruence” (25) with previously identified genes using reported growth-based genetic interactions. The growth-based genetic data also have imperfect concordance; the mean overlap for a reported subset of genetic interactions in S. cerevisiae by different groups has been estimated at less than 50% (26), potentially due to errors in scoring growth phenotypes, escape of diploids during haploid selection (27, 28), additional mutations present in strains in the deletion collection (29, 30), and/or the presence of an incorrect mutation due to cross-contamination, which we have tested for and corrected in our copy of the genome deletion collection. Therefore, our strategy to improve the robustness of this step was to score the genetic congruence of each candidate mutation using the combined genetic signature of the interactions of the 75 mutations causing increased GCR rates and the 928 mutations causing DNA damaging agent sensitivity with each gene in the rest of the genome (∼6,000 genes; SI Appendix, SI Materials and Methods). Congruence scores could range from 0 (no congruence) to 1 (complete congruence), and random simulations were performed to identify statistically significant congruence score cutoffs. Using the 75 GCR genes, the maximum congruence score was 0.115 for SRS2 (Dataset S4), and the cutoff of 0.040 (P < 0.01, random simulation) selected 227 genes, which included 44 starting genes and 183 new genes.
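To make the idea of a congruence score and a simulation-based cutoff concrete, here is a minimal sketch. The score actually used is defined in the SI Appendix and ref. 25 and is not reproduced here; as a stand-in, this sketch uses the Jaccard index between a candidate's interaction partners and the combined partners of the reference genes, which also ranges from 0 to 1. All function names and toy gene labels are hypothetical.

```python
import random


def congruence(partners, signature):
    """Jaccard overlap (0 to 1) between a gene's partners and a reference signature.

    Stand-in for the paper's score, which is defined in its SI Appendix.
    """
    union = partners | signature
    return len(partners & signature) / len(union) if union else 0.0


def combined_signature(interactions, reference_genes, exclude=None):
    """Union of the interaction partners of every reference gene except `exclude`."""
    signature = set()
    for gene in reference_genes:
        if gene != exclude:
            signature |= interactions.get(gene, set())
    return signature


def random_cutoff(n_partners, signature, all_genes, alpha=0.01, trials=2000, seed=0):
    """Score random partner sets of the same size; return the (1 - alpha) quantile."""
    rng = random.Random(seed)
    pool = sorted(all_genes)
    scores = sorted(
        congruence(set(rng.sample(pool, n_partners)), signature) for _ in range(trials)
    )
    return scores[min(trials - 1, int((1 - alpha) * trials))]


# Toy data (hypothetical): two reference genes and one candidate.
interactions = {
    "refA": {"g1", "g2", "g3"},
    "refB": {"g2", "g3", "g4"},
    "cand": {"g2", "g3", "g4", "g9"},
}
signature = combined_signature(interactions, ["refA", "refB"])
score = congruence(interactions["cand"], signature)   # 3 shared of 5 total = 0.6
cutoff = random_cutoff(4, signature, {f"g{i}" for i in range(1, 101)})
print(score, cutoff, score > cutoff)
```

The `exclude` argument mirrors the leave-one-out check described in the text, in which a starting gene is removed from the reference list before being rescored.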
Forty-two of the 44 recovered starting genes were reidentified by congruence selection even when the gene was removed from the initial list. Of the 227 GCR congruent

genes, 71 suppressed GCR formation (61% of those tested) and 46 did not when including the experimental results described below, suggesting an enrichment for GCR-suppressing genes (P = 1 × 10^−102 if the 110 GCR-suppressing genes identified previously and below are the only ones that exist and P = 2 × 10^−20 if all 1,041 candidates identified here suppress GCRs, hypergeometric test). Merging the 75 starting genes and 227 congruent genes produced a merged GCR list of 258 genes (Fig. 1). Using the 928 DNA damaging agent genes, the maximum congruence score was 0.063 for SWR1 (Dataset S4) and the cutoff of 0.046 (P < 0.01, random simulation) selected 148 genes, which included 114 starting genes and 34 new genes. One hundred five of the 114 starting genes were identified even when removed from the initial list. Thirty-two of the 34 new genes were nonessential, 31 were previously identified in at least one screen, and deletion of 20 of these 32 nonessential new genes caused at least some sensitivity when tested against chronic exposure to HU, MMS, 4NQO, and/or camptothecin (P < 1 × 10^−7, hypergeometric test; Dataset S5). Because most newly identified genes suppressed drug sensitivity, we generated a merged list of 962 genes from the starting genes and the genetically congruent genes (Fig. 1). Merging the 258 GCR genes and 962 DNA damaging agent genes implicated in this study generated a merged list of 1,031 genes (Fig. 1 and Dataset S1). One hundred eighty-nine genes were shared between the merged GCR gene and merged DNA damaging agent lists: 69 were unique to the merged GCR gene list, and 773 were unique to the merged DNA damaging agent list. Additionally, we noted 10 genes (RTT105, IRC15, IRC3, DOT1, DPB3, MLH1, NAS6, PAP2, UMP1, and VAC7) that fell below statistical cutoffs in our analysis but were related to and clustered with bona fide GCR-suppressing genes (see below), and we added them to the final list, resulting in a total of 1,041 genes.

Robustness of the method. A computational test of the robustness of our method was performed by determining if the method could

Fig. 2. Analysis of DNA damaging agent treatments. (A) Histogram of the number of DNA damaging agent sensitivity mutations as a function of the number of screens in which each mutation was identified. (B) View of the histogram in A plotting all mutations above the axis and only those mutations known to affect the DNA damage response below the axis. (C) View of the histogram in A after filtering out agent- and laboratory-specific mutations. (D) View of the histogram in C after filtering out nonsignificant genes. (E) Summary table of treatments that have been screened multiple times and the number of mutations found in any screen, in a statistically significant number of screens (pnhit < 0.01), and in all screens. UVC, ultraviolet light in band C.
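Several of the P values in this part of the Results come from hypergeometric tests. As a hedged illustration of how such a tail probability is computed, the sketch below evaluates P(X ≥ k) exactly with integer arithmetic. The function name `hypergeom_tail` is hypothetical. The example treats the genome as 6,000 genes (the text gives ∼6,000) and the candidate list as 1,041 genes, which are assumptions about the test's inputs, so the result is illustrative and need not equal any P value reported in the paper.

```python
from fractions import Fraction
from math import comb


def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) when `draws` genes are sampled without replacement from
    `population` genes, of which `successes` are on the list of interest."""
    total = comb(population, draws)
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    # Exact ratio of big integers, converted to float only at the end.
    return float(Fraction(favourable, total))


# Illustration: chance that at least 7 of 8 screen hits land on a 1,041-gene
# list drawn from an assumed 6,000-gene genome.
p_recover = hypergeom_tail(7, population=6000, successes=1041, draws=8)
print(p_recover)
```

Using exact binomial coefficients avoids the underflow that can occur with floating-point factorials when the tail probability is very small.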

identify genes found in three different systematic screens using modified GCR assays to identify genes that suppress GCRs (17–19) when those genes were removed from the original list of GCR-suppressing genes that anchored the analysis. This analysis recovered 7 of 8 (P < 0.0001, hypergeometric test), 8 of 11 (P < 0.0004), and 13 of 16 (P < 7 × 10^−7) of the genes reported in these screens, respectively (SI Appendix, Table S4), although it should be noted that some of these genes that were not identified by our analysis only played small roles in suppressing GCRs and that many of the genes experimentally verified here were not identified by these screens (see below). Despite these differences, the robustness with which these genes from these screens were identified suggests that the final list is enriched in genes involved in preventing genome instability.

Computational Analysis and Prioritization of Candidate Genome Stability Genes. The list of 1,041 genes implicated by this analysis was large enough to be problematic for gene-by-gene validation. In addition, the identification of potential drug detoxification mechanisms suggested that not all these genes directly suppress genome instability. Thus, to prioritize the final list of 1,041 genes for subsequent experimental analysis, the genes were subjected to agglomerative hierarchical clustering analysis (Fig. 3 and Dataset S6) using congruence scores calculated from reported growth-based genetic interactions (SI Appendix, SI Materials and Methods). This analysis divided the list into 74 clusters (comprising 787 genes) with an additional “unclustered” group (comprising 254 genes) that contained those genes that did not cluster with other genes due to lack of shared genetic interactions (Fig. 4). Many clusters were enriched in genes involved in specific cellular functions. For example, cluster 1 was enriched in polarity determination and vesicle-mediated transport; cluster 2 was enriched in mitotic nuclear and chromosome migration; cluster 3 was enriched in chromatin modification and transcription; and cluster 4 was enriched in the DNA damage response, particularly those genes involved in double-strand break (DSB) repair (Dataset S6). Within each cluster, genes encoding protein complexes or belonging to known pathways tended to group together and to have few interactions with each other, consistent with these genes belonging to single epistasis groups. Furthermore, genetic interactions between genes within an individual cluster (Fig. 4A) were consistent with the presence of multiple epistasis groups. Together, these observations indicated that the clustering captured important aspects of at least some of the functions of these genes.

Some biological functions were divided between multiple clusters. DNA damage response genes were divided between cluster 4 (Fig. 3) and cluster 32 (SI Appendix, Fig. S1), as well as the smaller clusters 53, 55, 59, and 60 (Dataset S6).
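The clustering step can be sketched with a small average-linkage agglomerative procedure that works on pairwise congruence scores. The linkage rule, distance definition (1 minus congruence), and stopping threshold used here are assumptions for illustration; the paper's settings are given in its SI Materials and Methods. The function name `average_linkage` and the toy gene labels are hypothetical.

```python
def average_linkage(distance, items, max_distance):
    """Repeatedly merge the two closest clusters until none are closer than max_distance."""
    clusters = [[item] for item in items]

    def cluster_distance(a, b):
        # Average of all pairwise distances between members of the two clusters.
        return sum(distance(x, y) for x in a for y in b) / (len(a) * len(b))

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = cluster_distance(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        d, i, j = best
        if d > max_distance:
            break  # remaining clusters are too dissimilar to merge
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy pairwise congruence scores (hypothetical); unlisted pairs score 0.
scores = {
    frozenset(("geneA", "geneB")): 0.9,
    frozenset(("geneC", "geneD")): 0.8,
}


def distance(x, y):
    return 1.0 - scores.get(frozenset((x, y)), 0.0)


clusters = average_linkage(
    distance, ["geneA", "geneB", "geneC", "geneD", "geneE"], max_distance=0.5
)
print(clusters)  # [['geneA', 'geneB'], ['geneC', 'geneD'], ['geneE']]
```

In this toy run `geneE` shares no interactions with the others and stays on its own, which is analogous to the "unclustered" group described in the text.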
Clusters 4 and 32 have high GCR congruence scores (Fig. 4B) and moderate DNA damaging agent congruence scores (Fig. 4C), and they contain many of the genes implicated in suppressing sensitivity to many different DNA damaging agents (Fig. 4D) and in playing important roles in suppressing GCRs (Fig. 4E). Examination of the interactions of these two clusters suggests that the major reason the clustering algorithm split these genes into two clusters was that cluster 32 had fewer interactions with cluster 3 (chromatin modification) than cluster 4 did. In contrast, genes biased toward interactions with cluster 32 but not with cluster 4 did not define clear pathways or groups. Remarkably, genes involved in a number of well-characterized pathways, such as base-excision repair, nucleotide-excision repair, and mismatch repair, tended not to be present in either cluster 4 or 32 (Fig. 3 and SI Appendix, Fig. S1), and genes from these pathways were frequently divided between multiple clusters. The lack of clustering of these genes is consistent with their general paucity of genetic interactions relative to DSB repair genes in the absence of DNA damaging agents (Fig. 4 A and B and Dataset S5). However, the importance of these genes in the presence of DNA damaging agents is emphasized by the number of screens in which these genes were identified (Datasets S2 and S6) and is consistent with the known roles of these gene products. We note that the inability of unperturbed growth to capture the roles of these types of genes can be anticipated from decades of classic genetic studies as well as a recent report of changes in genetic interactions measured in a high-throughput manner due to the presence of MMS (31).

Experimental Validation of the Enrichment of Genome Stability Genes. We selected a subset of 87 genes from the final list of 1,041

genes to analyze their potential roles in suppressing GCRs. Given that some clusters might be more important for drug detoxification or export than genome stability per se, these genes were primarily selected from clusters that contained known GCR-suppressing genes. None of these genes had been tested for a role in suppressing GCRs at the time this analysis was initiated, although, subsequently, the results of studies of some of the selected genes have been reported by others. We also surveyed genes from a number of other clusters. Overall, we tested the effects of 87 different single-gene deletion mutations in our standard GCR assay that measures GCRs mediated by single-copy sequences (Table 1 and SI Appendix, Table S5) and found that 35 (40%) caused at least a modest but significant threefold or higher increase in the spontaneous GCR rate, which suggests a substantial enrichment for genes that suppress genome instability in the higher scoring clusters generated by the bioinformatic analysis. The presence of a newly tested gene in clusters 4, 32, 53, 55, 59, and 60 did not have a statistically significant bias for suppressing GCRs (P = 0.2, Fisher exact probability), and the presence of new genes in these clusters did not correlate with a higher GCR rate (P = 0.3, Mann–Whitney U test). The newly identified genes that function in the suppression of genome instability could be divided into three classes. The first class of genes encoded subunits of complexes or components of pathways already known to be involved in maintaining genome stability. These included RMI1, which encodes a subunit of the Sgs1/Rmi1/Top3 complex (34); SAE2, which encodes a factor that acts in conjunction with the Mre11/Rad50/Xrs2 complex (35); RTT109, which is involved in the ASF1-dependent acetylation of K56 of histone H3 (36); and HST3, ARP8, RSC2, and HTA1, which function in chromatin assembly and remodeling pathways, processes known to prevent genome instability (37). 
The second class of genes has been implicated by other analyses as suppressing genome instability but had never been analyzed in the GCR assay at the time this analysis was initiated. This group of genes included CDC73, CLB2, CLB5, CSM3, DOT1, ESC2, MMS1, MRC1, MPH1, NUP84, NUP133, NUP60, REV7, RTT107 (ESC4), SLX5, SLX8, and TOF1. The third class of genes lacks known functions or has not previously been known to play a role in maintaining genome stability. These genes include CDH1, CTF4, DST1, HSL1, IRC3, IRC15, LRS4, PIN4, RTT105, RML2, and RPN10. Unlike the case of RPN10, which encodes a proteasome subunit (38), deletion of the nonessential proteasome-related genes NAS6, RPN4, and UMP1 did not increase the GCR rate (SI Appendix, Table S5).

In contrast, 52 identified genes did not increase rates in our standard GCR assay when mutated (SI Appendix, Table S5). A number of these genes were from clusters 4 and 32, which contained many genes that caused substantially increased GCR rates when mutated (Fig. 4E). Defects in some of these genes were previously reported to show significant numbers of genetic interactions with defects in DNA repair, including the genes encoding the Ard1-Nat1 N-terminal acetyltransferase complex (39) and the Get1–Get2 complex involved in transporting tail-anchored proteins to the endoplasmic reticulum (40). Moreover, a number of other genes that were previously implicated as functioning in DNA repair and DNA damage responses, including CCR4, WSS1, PPH3, DOA1, and CSM1, did not appear to act in suppressing GCRs. Although it is possible that these genes play no role in maintaining genome stability, it is also possible that they suppress GCRs not detected by our standard GCR assay. Multiple genes,

[Fig. 3 graphic not reproduced. Columns: GCR Rate; Inclusion (GCR Rate, GCR Similar, Drug, Drug Similar); IRC; TL; Ty; CST; LOH; and sensitivity to each DNA damaging agent (cisplatin, carboplatin, oxaliplatin, mechlorethamine, streptozotocin, MMS, UV, 4NQO, DMAEC, psoralen, angelicin, mitomycin c, HU, camptothecin, IR, bleomycin, and other). Rows (genes): CSM3, MRC1, TOF1, CTF8, CTF18, DCC1, CTF4, POL32, RAD27, ASF1, MMS22, DIA2, RAD50, XRS2, MRE11, RAD52, RAD51, RAD54, RAD55, RAD57, SGS1, SRS2, RTT107, MMS1, RTT101, ELG1, MUS81, RAD53, RAD18, RAD5, TSA1, DUN1, CCS1, SOD1, SWI6, SLX5, SLX8, RTT109, NAT1, ARD1, POP2, CCR4, CDC20, YNG2, EPL1, ARP4, ESA1, EAF1, SIC1.]

Fig. 3. Annotated genes from cluster 4. The GCR rate column identifies mutations tested in the GCR assay: Circles were previously tested, squares were tested in this study, crosses were essential genes, solid symbols increased GCR rates as single mutants, half filled-in symbols only synergistically increased GCR rates in combination with other mutants, and open symbols did not increase GCR rates. “Inclusion” indicates if a gene was identified in the GCR rate (GCR Rate), genetic congruence to GCR genes (GCR Similar), DNA damaging agent (Drug), or genetic congruence to DNA damaging agent genes (Drug Similar) stage of the bioinformatics analysis. “IRC” indicates those genes causing increased recombination centers (48). “TL” indicates mutations identified in two telomere-length screens by Askree et al. (60) and Gatbonton et al. (61), with decreased (A−, G−) or increased (A+, G+) telomere lengths. “Ty” indicates mutations causing decreased (Ty1−, Ty3−) or increased (Ty1+, Ty3+) transposition (49, 62, 63). “CST” indicates mutations identified as affecting chromosome stability by several assays (64, 65). “LOH” indicates mutations increasing loss of heterozygosity by several assays (66). Sensitivity to each DNA damaging agent is indicated by vertical bars, with different treatments having alternate colors.

[Fig. 4 graphic not reproduced. Panels A–E are plotted against gene number (1 to 1,041), with clusters 4, 32, 53, 55, 59, and 60 and the unclustered group marked. Vertical axes: (B) GCR congruence (×10^2), (C) drug congruence (×10^2), (D) number of drug screens, and (E) GCR rate (10^−10 to 10^−7).]

Fig. 4. Overview of the clustering of the bioinformatically identified genes. (A) Binary interaction map showing the presence (black) or absence (white) of genetic interactions (Materials and Methods) between all 1,041 genes in the 74 clusters and the nonclustered group (horizontal) and 787 genes in the 74 clusters (vertical). (B) Genetic congruence score for each of the 1,041 genes with the GCR genes. Boundaries for clusters 4 and 32 are shown as vertical lines. (C) Genetic congruence score with the genes suppressing sensitivity to DNA damaging agents. (D) Number of DNA damaging agent screens in which different deletions of the 1,041 genes were identified. (E) GCR rates of single-gene deletion mutants. Genes with rates listed as “Low” in SI Appendix, Table S5 were arbitrarily assigned the WT GCR rate (3.5 × 10^−10) for display purposes.

including RAD6, do not suppress GCRs in the standard assay but do in other GCR assays (13, 14). Other genes, such as MRC1 and TEL1, play redundant roles in suppressing GCRs, and these roles can only be observed when mutations in these genes are combined with other mutations (41) (Table 1), whereas mutations in other genes do not increase GCR rates because the genes are required for producing GCRs.

Reiterating the Analysis with Newly Identified GCR Suppressors. The 35 newly validated GCR-suppressing genes from this analysis (see below) were combined with the initial list of 75 GCR-suppressing genes to generate a starting list of 110 GCR-suppressing genes. This expanded set of starting genes was then reanalyzed by our bioinformatics pipeline. Two hundred twenty-three genes, rather than the 227 genes from the original analysis, were identified as having statistically significant genetic congruence scores (score >0.046; P < 0.01), which included 67 of the 110 starting genes. Compared with the original analysis, 18 genes were omitted (CDC6, MSH2, PBY1, PEP3, PPM1, RDH54, RFA1, RNH201, RPN6, SAE2, SGO1, SLK19, SPT4, SUM1, THP2, UBP14, ULP1, and VPS36), and 14 genes were included (ASC1, CSE2, EAF5, HOS2, MNN10, NUP188, PHO23, RTT103, SEC22, SIF1, SIN3, SNF4, SRC1, and YKE2). Mutations in PBY1, SGO1, and SPT4, which were eliminated from the list, do not increase the GCR rate (SI Appendix, Table S5). Mutations in MSH2, RDH54, RFA1, and SAE2, which were also eliminated, cause only modest increases in the GCR rate as single mutations (msh2Δ, rdh54Δ, and sae2Δ; Table 1 and SI Appendix, Table S1), are complicated by their causing increased rates of point mutations in addition to GCRs (msh2Δ) (32), or are complicated by the existence of different hypomorphic alleles that cause different phenotypes (rfa1) (33). However, all these mutations were retained in this second analysis through their presence in the initial GCR list and/or by effects on sensitivity to DNA damaging agents. Thus, these results suggest that adding more data will further refine the results.

Discussion

Systematic genetics using the S. cerevisiae deletion and hypomorphic allele collections has been well established. However, screening these mutants for complex phenotypes, or for phenotypes requiring involved quantitative assays such as GCR assays, can be difficult and subject to significant error. Thus, we designed a bioinformatic protocol for identifying unanticipated genes involved in suppressing GCRs, which involved handling numerous genome-wide datasets affected by both false-positive and false-negative errors. This analysis identified a set of genes that was enriched for genes involved in genome stability, as evidenced by independent identification of most previously known genes (17–19) and by the experimental validation of 40% of 87 identified genes that were tested for a role in suppressing GCRs (Table 1 and SI Appendix, Table S5), including genes in unexpected pathways. Our analysis of these 87 genes resulted in the identification of more GCR-suppressing genes than resulted from three genome-wide screens involving the analysis of more than 14,000 mutants, which is consistent with a 200-fold enrichment in GCR-suppressing genes relative to the whole-genome screens. In addition, the true validation frequency is likely to be higher than that observed here because many mutations only cause increased GCR rates in conjunction with other mutations, such as tel1, or in segmental duplication GCR assays, such as rad6 (13, 41). A critical next step in our analysis is to test these mutations in multiple GCR assays that probe different chromosomal features and to combine these mutations with other mutations. Importantly, this approach is generally applicable; these methods allowed the analysis to be performed multiple times as more data have become available, and the nature of the starting set of well-characterized genes and genome-wide screens need not be tied to the problem of genome stability and can readily accommodate RNAi data generated in mammalian systems.

Experimental verification of the implicated genes revealed a number of interesting pathways. Genome instability was increased by deletion of genes involved in synchronizing multiple phases of the cell cycle, including CLB2, CLB5, HSL1, PIN4, and particularly CDH1, which encodes a subunit of the APC/C complex that degrades proteins during mitosis and G1; this role for CDH1 is consistent with observations in vertebrates (42). Genes encoding two different subcomplexes of the nuclear pore, the Nup84 complex (NUP84, NUP133, NUP120, NUP145, NUP85, SEH1, and SEC13) (43) and NUP60, suppressed genome instability; studies performed while this work was in progress suggest a role in suppressing accumulation of DSBs via sumoylation of DNA repair enzymes (44) and direct recruitment of DSBs to the nuclear pore (45). We also identified genes that may suppress GCRs by indirectly aiding DNA replication, including CTF4, which may link DNA synthesis to sister chromatid cohesion, and DST1, which potentially reduces collisions between RNA and DNA polymerases. In addition, we found a role for RPN10, which encodes a non-ATPase base subunit of the 19S regulatory particle of the 26S proteasome, in suppressing genome instability, suggesting that the proteasome may play roles in genome instability outside of nucleotide excision repair (38), consistent with recent reports linking the proteasome to DSB repair in S. cerevisiae (46) and vertebrates (47). Interestingly, deletion of other genes related to the proteasome, including DOA1, NAS6, UMP1, UBP6, and especially RPN4, which encodes a transcription factor that stimulates proteasome gene expression and has a similar genetic interaction profile to RPN10, did not cause increased GCR rates. Thus, the defect in rpn10Δ strains might involve a specific feature or function of the proteasome (or the regulatory particle) that is not affected by eliminating other nonessential proteasome components.

Taken together, this bioinformatics procedure has successfully identified tested components of genome stability pathways, untested components of tested genome stability pathways, untested genome stability pathways, and genes in other pathways that are beginning to be implicated in suppressing genome instability. These successes encourage further characterization of genes whose roles in suppressing genome instability might currently be less clear, including IRC3, IRC15, RML2, and RTT105 (48, 49).

Table 1. GCR rates of genome instability mutants implicated by bioinformatic analysis

| Genotype* | Systematic name | Strain | Cluster | No. of DNA damaging screens | Rate† (fold increase) |
|---|---|---|---|---|---|
| WT | — | RDKY3615 | n.a. | n.a. | 3.5 × 10^−10 (1) |
| esc2::HIS3 | ydr363w | RDKY7030 | 32 | 11 | 9.0 × 10^−8 (257) |
| rmi1::HIS3 | ypl024w | RDKY6242 | 32 | 6 | 6.0 × 10^−8 (189) |
| mrc1::TRP1, tof1::HIS3 | ycl061c, ynl273w | RDKY7032 | 4, 4 | 37, 33 | 2.6 × 10^−8 (75) |
| slx8::HIS3 | yer116c | RDKY7527 | 4 | 11 | 2.6 × 10^−8 (75) |
| slx5::HIS3 | ydl013w | RDKY7524 | 4 | 9 | 2.3 × 10^−8 (66) |
| cdh1::HIS3 | ygl003c | RDKY6485 | 21 | 4 | 2.1 × 10^−8 (58) |
| nup84::HIS3 | ydl116w | RDKY6195 | 32 | 9 | 1.6 × 10^−8 (44) |
| rtt107::HIS3 | yhr154w | RDKY7031 | 4 | 23 | 9.4 × 10^−9 (27) |
| rpn10::HIS3 | yhr200w | RDKY6216 | 3 | 16 | 9.0 × 10^−9 (26) |
| rsc2::G418 | ylr357w | RDKY6006 | 11 | 11 | 4.8 × 10^−9 (13) |
| mms1::HIS3 | ypr164w | RDKY6206 | 4 | 32 | 3.9 × 10^−9 (11) |
| nup133::HIS3 | ykr082w | RDKY6476 | 7 | 9 | 3.7 × 10^−9 (10) |
| rtt105::HIS3 | yer104w | RDKY6673 | 32 | 0 | 3.3 × 10^−9 (9.4) |
| dst1::G418 | ygl043w | RDKY7023 | 3 | 8 | 3.0 × 10^−9 (8.6) |
| arp8::HIS3 | yor141c | RDKY5949 | 32 | 21 | 2.9 × 10^−9 (8.4) |
| irc15::HIS3 | ypl017c | RDKY7024 | 29 | 0 | 2.8 × 10^−9 (8.0) |
| nup60::HIS3 | yar002w | RDKY6489 | 7 | 11 | 2.6 × 10^−9 (7.4) |
| irc3::HIS3 | ydr332w | RDKY7467 | 29 | 0 | 2.4 × 10^−9 (6.9) |
| csm3::G418 | ymr048w | RDKY5708 | 4 | 46 | 2.2 × 10^−9 (6.3) |
| clb5::G418 | ypr120c | RDKY7458 | 32 | 17 | 2.2 × 10^−9 (6.3) |
| rml2::HIS3 | yel050c | RDKY7069 | 12 | 8 | 2.2 × 10^−9 (6.3) |
| hsl1::HIS3 | ykl101w | RDKY6487 | 29 | 20 | 1.9 × 10^−9 (5.4) |
| tof1::HIS3 | ynl273w | RDKY5135 | 4 | 33 | 1.6 × 10^−9 (4.6) |
| mph1::G418 | yir002c | RDKY7026 | 34 | 60 | 1.6 × 10^−9 (4.6) |
| pin4::G418 | ybl051c | RDKY7476 | 15 | 16 | 1.6 × 10^−9 (4.6) |
| rev7::G418 | yil139c | RDKY7483 | 60 | 28 | 1.5 × 10^−9 (4.3) |
| sae2::HIS3 | ygl175c | RDKY6234 | 55 | 62 | 1.4 × 10^−9 (4.0) |
| hst3::HIS3 | yor025w | RDKY6060 | 14 | 26 | 1.4 × 10^−9 (4.0) |
| ctf4::G418 | ypr135w | RDKY6018 | 4 | 11 | 1.4 × 10^−9 (4.0) |
| dot1::HPH | ydr440w | RDKY7021 | 16 | 2 | 1.4 × 10^−9 (4.0) |
| hta1::HIS3 | ydr225w | RDKY6490 | 15 | 13 | 1.4 × 10^−9 (4.0) |
| rtt109::HIS3 | yll002w | RDKY6226 | 4 | 16 | 1.4 × 10^−9 (4.0) |
| cdc73::HIS3 | ylr418c | RDKY6410 | 3 | 9 | 1.3 × 10^−9 (3.7) |
| lrs4::G418 | ydr439w | RDKY7470 | 2 | 29 | 1.2 × 10^−9 (3.4) |
| clb2::G418 | ypr119w | RDKY7456 | 29 | 45 | 1.2 × 10^−9 (3.4) |

n.a., not applicable.
*Deletions constructed in RDKY3615 [MATα leu2Δ1 his3Δ200 trp1Δ63 ura3-52 ade2Δ1 ade8 lys2ΔBgl hom3-10 hxt13::URA3].
†Number in parentheses corresponds to fold increase in rate over the wild-type rate.
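The fold increases given in parentheses in Table 1 are the ratio of each mutant GCR rate to the wild-type rate (3.5 × 10^−10). The short Python sketch below reproduces that arithmetic for a few rows. The function and variable names are illustrative assumptions rather than part of the authors' software, and because the rates in the table are rounded to two significant figures, a recomputed ratio can differ slightly from the published fold value.

```python
# Minimal sketch (illustrative names): fold increase in GCR rate over wild type.
WT_RATE = 3.5e-10  # wild-type GCR rate reported in Table 1


def fold_increase(rate, wt_rate=WT_RATE):
    """Return the mutant rate expressed as a multiple of the wild-type rate."""
    return rate / wt_rate


# A few mutant rates copied from Table 1 (rounded values as printed there).
table1_rates = {
    "esc2": 9.0e-8,
    "rtt107": 9.4e-9,
    "rpn10": 9.0e-9,
    "clb2": 1.2e-9,
}

for mutant, rate in table1_rates.items():
    print(f"{mutant}: {fold_increase(rate):.1f}-fold over WT")
```

Working from the unrounded rates, where available, would be the safer choice when ranking mutants whose fold increases are close together.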
This bioinformatic scheme rested on three assumptions: (i) systematically generated genome-wide data are of sufficient quality to be useful, (ii) novel genes that suppress GCRs share some phenotypes with known genes that suppress GCRs, and (iii) genetic interactions reported on the basis of change in nonperturbed growth provide a reasonable surrogate for other biological processes. These assumptions held well enough that combining these independent sources of information yielded unexpected genes of interest that were validated at high frequency. The most problematic assumption, however, was that genetic interactions based on growth phenotypes were a reasonable measure of similarity for roles in suppressing genome instability. One of the stronger counterexamples is the observation that deletion of 12 of 31 tested genes in a high-scoring DNA damage cluster (cluster 4) did not cause increased GCR rates as single mutations. Sufficient genetic data exist for genes in cluster 4 to suggest that nonperturbed growth-based genetic interactions are only a crude surrogate for measuring similarity in suppressing GCRs, which is consistent with the substantial changes in synthetic lethal interactions between deletion mutations caused by DNA damaging agents (31). Additionally, because only pair-wise interactions are typically identified, other kinds of important genetic results cannot be identified, such as suppression of the lethality of srs2Δ sgs1Δ double mutants by mutations causing homologous recombination defects (50), and more complex genetic redundancies, which are particularly important in higher eukaryotes, cannot be handled. Together these factors argue that although these data can be extraordinarily useful in aggregate, as we have demonstrated here, caution is called for in any attempt to use these kinds of data exclusively to derive biological pathways de novo. This is particularly true when growth is used as a surrogate marker for measuring a specific phenotype, because growth defects may not be directly related to the phenotype of interest.

We are presently implementing an approach in which systematically generated double-mutant strains designed to query the enriched gene lists described here will be analyzed using multiple GCR assays to better define the pathways that suppress GCRs implied by the bioinformatic analysis presented here. We anticipate that human orthologs of verified GCR genes identified here will also play roles in suppressing genome instability and may be important for suppressing cancer initiation and progression.

1. Hanahan D, Weinberg RA (2000) The hallmarks of cancer. Cell 100(1):57–70.
2. Loeb LA (2001) A mutator phenotype in cancer. Cancer Res 61(8):3230–3239.
3. Hoeijmakers JH (2001) Genome maintenance mechanisms for preventing cancer. Nature 411(6835):366–374.
4. Vessey CJ, Norbury CJ, Hickson ID (1999) Genetic disorders associated with cancer predisposition and genomic instability. Prog Nucleic Acid Res Mol Biol 63:189–221.
5. Soussi T, Ishioka C, Claustres M, Béroud C (2006) Locus-specific mutation databases: Pitfalls and good practice based on the p53 experience. Nat Rev Cancer 6(1):83–90.
6. van de Wetering CI, Horne MC, Knudson CM (2007) Chromosomal instability and supernumerary centrosomes represent precursor defects in a mouse model of T-cell lymphoma. Cancer Res 67(17):8081–8088.
7. Chen C, Kolodner RD (1999) Gross chromosomal rearrangements in Saccharomyces cerevisiae replication and recombination defective mutants. Nat Genet 23(1):81–85.
8. Pennaneach V, Kolodner RD (2004) Recombination and the Tel1 and Mec1 checkpoints differentially effect genome rearrangements driven by telomere dysfunction in yeast. Nat Genet 36(6):612–617.

E3258 | www.pnas.org/cgi/doi/10.1073/pnas.1216733109

Materials and Methods

Bioinformatic Analysis. The bioinformatic analysis described here has been implemented in the integration of multiple orthogonal datasets (IMOD) program package. IMOD and associated documentation and data files are available at http://sourceforge.net/projects/imod-gene. IMOD consists of command-line programs and shell scripts and readily compiles and runs on UNIX-like operating systems. A detailed description of the methods is provided in SI Appendix, SI Materials and Methods.

Analysis of DNA damaging agent sensitivities. Mutations deemed as causing sensitivity to different DNA damaging agent treatments were included based on the recommendations of the authors of the individual studies (SI Appendix, Table S3). Deletions of genes deemed “dubious ORFs” by the Saccharomyces Genome Database that overlapped validated genes were treated as mutant alleles of the validated genes; for example, ybr099cΔ was treated as an mms4 mutation. The full list of overlaps used is available as part of the data distributed with the IMOD software. The pnhit P values for observing a mutation in n of the N DNA damaging screens were calculated using probabilities from 1,000,000 random simulations (SI Appendix, SI Materials and Methods). Whether the distribution of any particular mutation was significantly biased toward a group of screens, such as those belonging to a specific laboratory or a specific DNA damaging agent, was evaluated by a ratio test of likelihoods (SI Appendix, SI Materials and Methods).

Calculation of genetic distance and genetic congruence. Growth-based genetic interactions were measured using a modified BioGRID database derived from version 2.0.60 (51), including the interaction categories “synthetic lethality,” “synthetic growth defect,” and “haploinsufficiency,” as well as “phenotypic enhancement” data specifically derived from E-MAP studies (52–55). We also added 8,102 and 191,890 E-MAP interactions from additional studies published during the course of this analysis (56, 57). The interaction data were used to calculate genetic distances via the composite angle distance, which is similar to the Jaccard distance (58) but has a number of advantages for analysis of multiple genes (SI Appendix, SI Materials and Methods). We scored genetic congruence of each gene in the genome against the list of genes of interest using the composite angle distance and performed over 100,000 random simulations to calculate P values (SI Appendix, SI Materials and Methods).

Clustering. Genes were clustered on the basis of their genetic congruence using agglomerative hierarchical clustering (59) (SI Appendix, SI Materials and Methods).

Yeast Genetics. S. cerevisiae strains were constructed in the RDKY3615 background (MATa leu2Δ1 his3Δ200 trp1Δ63 lys2ΔBgl hom3-10 ade2Δ1 ade8 ura3-52 hxt13::URA3) using standard PCR-based mutagenesis methods. The media and protocol for strain propagation and measuring GCR rates were essentially as described previously (7).

ACKNOWLEDGMENTS. We thank Hans Hombauer, Jorritt Ensernik, Vincent Pennaneach, Ellen Kats, and Kyungjae Myung for the generous gift of S. cerevisiae strains. This work was supported by National Institutes of Health Grants GM26017 and GM085764.
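To make the congruence scoring and simulation-based P values described in Materials and Methods more concrete, the following Python sketch scores one query gene against a reference list and estimates an empirical P value from random gene lists. It is only an illustration under stated assumptions: it uses the plain Jaccard distance rather than the composite angle distance (whose definition is given in SI Appendix and is not reproduced here), it averages similarity over the reference genes as a stand-in for the published scoring scheme, and all gene names, interaction sets, and function names are hypothetical toy values rather than IMOD code or real data.

```python
import random


def jaccard_distance(a, b):
    """Jaccard distance between two sets of genetic interaction partners."""
    union = a | b
    if not union:
        return 1.0  # neither gene has interactions: treat as maximally distant
    return 1.0 - len(a & b) / len(union)


def congruence(gene, reference, partners):
    """Mean Jaccard similarity between `gene` and each reference gene (assumed scoring)."""
    others = [r for r in reference if r != gene]
    if not others:
        return 0.0
    total = sum(1.0 - jaccard_distance(partners[gene], partners[r]) for r in others)
    return total / len(others)


def empirical_p(gene, reference, partners, n_sim=2000, seed=0):
    """Fraction of random reference lists scoring at least as high as the real list."""
    rng = random.Random(seed)
    observed = congruence(gene, reference, partners)
    pool = [g for g in partners if g != gene]
    size = len([r for r in reference if r != gene])
    hits = 0
    for _ in range(n_sim):
        sample = rng.sample(pool, size)  # random list of the same size
        if congruence(gene, sample, partners) >= observed:
            hits += 1
    return (hits + 1) / (n_sim + 1)  # add-one correction avoids reporting P = 0


# Hypothetical toy interaction data: gene -> set of interaction partners.
toy_partners = {
    "query": {"x", "y", "z"},
    "ref1": {"x", "y", "z"},
    "ref2": {"x", "y"},
    "bg1": {"u"},
    "bg2": {"v"},
    "bg3": {"w"},
    "bg4": {"u", "v"},
}
toy_reference = ["ref1", "ref2"]

score = congruence("query", toy_reference, toy_partners)
p_value = empirical_p("query", toy_reference, toy_partners)
print(f"congruence = {score:.3f}, empirical P = {p_value:.3f}")
```

The number of simulations sets the smallest P value that can be resolved, which is presumably why the analysis relied on more than 100,000 simulations for a P < 0.01 threshold.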


9. Putnam CD, Pennaneach V, Kolodner RD (2004) Chromosome healing through terminal deletions generated by de novo telomere additions in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 101(36):13262–13267.
10. Putnam CD, Pennaneach V, Kolodner RD (2005) Saccharomyces cerevisiae as a model system to define the chromosomal instability phenotype. Mol Cell Biol 25(16):7226–7238.
11. Pennaneach V, Kolodner RD (2009) Stabilization of dicentric translocations through secondary rearrangements mediated by multiple mechanisms in S. cerevisiae. PLoS ONE 4(7):e6389.
12. Kolodner RD, Putnam CD, Myung K (2002) Maintenance of genome stability in Saccharomyces cerevisiae. Science 297(5581):552–557.
13. Putnam CD, Hayes TK, Kolodner RD (2009) Specific pathways prevent duplication-mediated genome rearrangements. Nature 460(7258):984–989.
14. Putnam CD, Hayes TK, Kolodner RD (2010) Post-replication repair suppresses duplication-mediated genome instability. PLoS Genet 6(5):e1000933.
15. Chan JE, Kolodner RD (2011) A genetic and structural study of genome rearrangements mediated by high copy repeat Ty1 elements. PLoS Genet 7(5):e1002089.
16. Wang Y, et al. (2005) Mutation in Rpa1 results in defective DNA double-strand break repair, chromosomal instability and cancer in mice. Nat Genet 37(7):750–755.
17. Huang ME, Rio AG, Nicolas A, Kolodner RD (2003) A genomewide screen in Saccharomyces cerevisiae for genes that suppress the accumulation of mutations. Proc Natl Acad Sci USA 100(20):11529–11534.
18. Smith S, et al. (2004) Mutator genes for suppression of gross chromosomal rearrangements identified by a genome-wide screening in Saccharomyces cerevisiae. Proc Natl Acad Sci USA 101(24):9039–9044.
19. Kanellis P, et al. (2007) A screen for suppressors of gross chromosomal rearrangements identifies a conserved role for PLP in preventing DNA lesions. PLoS Genet 3(8):e134.
20. Stirling PC, et al. (2011) The complete spectrum of yeast chromosome instability genes identifies candidate CIN cancer genes and functional roles for ASTRA complex components. PLoS Genet 7(4):e1002057.
21. Jordan PW, Klein F, Leach DR (2007) Novel roles for selected genes in meiotic DNA processing. PLoS Genet 3(12):e222.
22. Leek JT, et al. (2010) Tackling the widespread and critical impact of batch effects in high-throughput data. Nat Rev Genet 11(10):733–739.
23. Beissbarth T, Speed TP (2004) GOstat: Find statistically overrepresented Gene Ontologies within a group of genes. Bioinformatics 20(9):1464–1465.
24. Ashburner M, et al.; The Gene Ontology Consortium (2000) Gene ontology: Tool for the unification of biology. Nat Genet 25(1):25–29.
25. Ye P, et al. (2005) Gene function prediction from congruent synthetic lethal interactions in yeast. Mol Syst Biol 1:2005.0026.
26. Tischler J, Lehner B, Fraser AG (2008) Evolutionary plasticity of genetic interaction networks. Nat Genet 40(4):390–391.
27. Daniel JA, Yoo J, Bettinger BT, Amberg DC, Burke DJ (2006) Eliminating gene conversion improves high-throughput genetics in Saccharomyces cerevisiae. Genetics 172(1):709–711.
28. Singh I, Pass R, Togay SO, Rodgers JW, Hartman JL, 4th (2009) Stringent mating-type-regulated auxotrophy increases the accuracy of systematic genetic interaction screens with Saccharomyces cerevisiae mutant arrays. Genetics 181(1):289–300.
29. Lehner KR, Stone MM, Farber RA, Petes TD (2007) Ninety-six haploid yeast strains with individual disruptions of open reading frames between YOR097C and YOR192C, constructed for the Saccharomyces genome deletion project, have an additional mutation in the mismatch repair gene MSH3. Genetics 177(3):1951–1953.
30. Game JC, et al. (2003) Use of a genome-wide approach to identify new genes that control resistance of Saccharomyces cerevisiae to ionizing radiation. Radiat Res 160(1):14–24.
31. Bandyopadhyay S, et al. (2010) Rewiring of genetic networks in response to DNA damage. Science 330(6009):1385–1389.
32. Myung K, Datta A, Chen C, Kolodner RD (2001) SGS1, the Saccharomyces cerevisiae homologue of BLM and WRN, suppresses genome instability and homeologous recombination. Nat Genet 27(1):113–116.
33. Chen C, Umezu K, Kolodner RD (1998) Chromosomal rearrangements occur in S. cerevisiae rfa1 mutator mutants due to mutagenic lesions processed by double-strand-break repair. Mol Cell 2(1):9–22.
34. Chang M, et al. (2005) RMI1/NCE4, a suppressor of genome instability, encodes a member of the RecQ helicase/Topo III complex. EMBO J 24(11):2024–2033.
35. Lengsfeld BM, Rattray AJ, Bhaskara V, Ghirlando R, Paull TT (2007) Sae2 is an endonuclease that processes hairpin DNA cooperatively with the Mre11/Rad50/Xrs2 complex. Mol Cell 28(4):638–651.
36. Marmorstein R, Trievel RC (2009) Histone modifying enzymes: Structures, mechanisms, and specificities. Biochim Biophys Acta 1789(1):58–68.
37. Myung K, Pennaneach V, Kats ES, Kolodner RD (2003) Saccharomyces cerevisiae chromatin-assembly factors that act during DNA replication function in the maintenance of genome stability. Proc Natl Acad Sci USA 100(11):6640–6645.
38. Reed SH, Gillette TG (2007) Nucleotide excision repair and the ubiquitin proteasome pathway—Do all roads lead to Rome? DNA Repair (Amst) 6(2):149–156.
39. Park EC, Szostak JW (1992) ARD1 and NAT1 proteins form a complex that has N-terminal acetyltransferase activity. EMBO J 11(6):2087–2093.
40. Schuldiner M, et al. (2008) The GET complex mediates insertion of tail-anchored proteins into the ER membrane. Cell 134(4):634–645.
41. Myung K, Datta A, Kolodner RD (2001) Suppression of spontaneous chromosomal rearrangements by S phase checkpoint functions in Saccharomyces cerevisiae. Cell 104(3):397–408.
42. García-Higuera I, et al. (2008) Genomic stability and tumour suppression by the APC/C cofactor Cdh1. Nat Cell Biol 10(7):802–811.
43. Lutzmann M, Kunze R, Buerer A, Aebi U, Hurt E (2002) Modular self-assembly of a Y-shaped multiprotein complex from seven nucleoporins. EMBO J 21(3):387–397.
44. Palancade B, et al. (2007) Nucleoporins prevent DNA damage accumulation by modulating Ulp1-dependent sumoylation processes. Mol Biol Cell 18(8):2912–2923.
45. Nagai S, et al. (2008) Functional targeting of DNA damage to a nuclear pore-associated SUMO-dependent ubiquitin ligase. Science 322(5901):597–602.
46. Ben-Aroya S, et al. (2010) Proteasome nuclear activity affects chromosome stability by controlling the turnover of Mms22, a protein important for DNA repair. PLoS Genet 6(2):e1000852.
47. Motegi A, Murakawa Y, Takeda S (2009) The vital link between the ubiquitin-proteasome pathway and DNA repair: impact on cancer therapy. Cancer Lett 283(1):1–9.
48. Alvaro D, Lisby M, Rothstein R (2007) Genome-wide analysis of Rad52 foci reveals diverse mechanisms impacting recombination. PLoS Genet 3(12):e228.
49. Scholes DT, Banerjee M, Bowen B, Curcio MJ (2001) Multiple regulators of Ty1 transposition in Saccharomyces cerevisiae have conserved roles in genome maintenance. Genetics 159(4):1449–1465.
50. Gangloff S, Soustelle C, Fabre F (2000) Homologous recombination is responsible for cell death in the absence of the Sgs1 and Srs2 helicases. Nat Genet 25(2):192–194.
51. Stark C, et al. (2006) BioGRID: A general repository for interaction datasets. Nucleic Acids Res 34(Database issue):D535–D539.
52. Collins SR, et al. (2007) Functional dissection of protein complexes involved in yeast chromosome biology using a genetic interaction map. Nature 446(7137):806–810.
53. Jessulat M, et al. (2008) Interacting proteins Rtt109 and Vps75 affect the efficiency of non-homologous end-joining in Saccharomyces cerevisiae. Arch Biochem Biophys 469(2):157–164.
54. Schuldiner M, et al. (2005) Exploration of the function and organization of the yeast early secretory pathway through an epistatic miniarray profile. Cell 123(3):507–519.
55. Wilmes GM, et al. (2008) A genetic interaction map of RNA-processing factors reveals links between Sem1/Dss1-containing complexes and mRNA export and splicing. Mol Cell 32(5):735–746.
56. Fiedler D, et al. (2009) Functional organization of the S. cerevisiae phosphorylation network. Cell 136(5):952–963.
57. Costanzo M, et al. (2010) The genetic landscape of a cell. Science 327(5964):425–431.
58. Jaccard P (1901) Distribution de la flore alpine dans le bassin des Dranses et dans quelques régions voisines. Bull Soc Vaud Sci Nat, 37:241–272, French.
59. Xu R, Wunsch D, 2nd (2005) Survey of clustering algorithms. IEEE Trans Neural Netw 16(3):645–678.
60. Askree SH, et al. (2004) A genome-wide screen for Saccharomyces cerevisiae deletion mutants that affect telomere length. Proc Natl Acad Sci USA 101(23):8658–8663.
61. Gatbonton T, et al. (2006) Telomere length as a quantitative trait: Genome-wide survey and genetic mapping of telomere length-control genes in yeast. PLoS Genet 2(3):e35.
62. Griffith JL, et al. (2003) Functional genomics reveals relationships between the retrovirus-like Ty1 element and its host Saccharomyces cerevisiae. Genetics 164(3):867–879.
63. Irwin B, et al. (2005) Retroviruses and yeast retrotransposons use overlapping sets of host genes. Genome Res 15(5):641–654.
64. Ouspenski II, Elledge SJ, Brinkley BR (1999) New yeast genes important for chromosome integrity and segregation identified by dosage effects on genome stability. Nucleic Acids Res 27(15):3001–3008.
65. Yuen KW, et al. (2007) Systematic genome instability screens in yeast and their potential relevance to cancer. Proc Natl Acad Sci USA 104(10):3925–3930.
66. Andersen MP, Nelson ZW, Hetrick ED, Gottschling DE (2008) A genetic screen for increased loss of heterozygosity in Saccharomyces cerevisiae. Genetics 179(3):1179–1195.