Chapter 17

Detection and Quantitation of SUMO Chains by Mass Spectrometry

Ivan Matic and Ronald T. Hay

Abstract

The small ubiquitin-like modifiers (SUMOs) alter the function of cellular proteins by covalent attachment to lysine side chains. SUMOs can target themselves for modification, thereby generating SUMO polymers, the functions of which are beginning to be unraveled. The identification and quantitation of SUMO chains is essential for the functional investigation of SUMO polymerization. Classical techniques, such as site-directed mutagenesis and western blotting, are indirect and often inconclusive methods for the study of SUMO polymers. In contrast, direct detection is possible with mass spectrometry (MS) by the identification of the SUMO–SUMO branched peptide remnant after proteolytic digestion. In this chapter, we describe a straightforward workflow that incorporates a modified database to efficiently detect SUMO polymers from simple and complex protein samples. In combination with stable isotope labeling by amino acids in cell culture (SILAC), this proteomic strategy allows accurate relative quantitation of SUMO polymers from different biological samples.

Key words: SUMO, SUMO chains, Proteomics, Mass spectrometry, SILAC

R. Jürgen Dohmen and Martin Scheffner (eds.), Ubiquitin Family Modifiers and the Proteasome: Reviews and Protocols, Methods in Molecular Biology, vol. 832, DOI 10.1007/978-1-61779-474-2_17, © Springer Science+Business Media, LLC 2012

1. Introduction

The small ubiquitin-like modifiers (SUMOs) are members of the ubiquitin family of posttranslational modifiers and are increasingly recognized as central players in many signaling pathways. SUMO conjugation targets lysine residues, which are often found within the consensus motif ψKxD/E, where ψ is a bulky aliphatic amino acid, K the target lysine, x any residue, and D and E aspartate and glutamate, respectively (1). Like ubiquitin, all seven internal lysines of which can be ubiquitinated (2), SUMO can also polymerize. Both SUMO-2 and -3 contain a consensus sumoylation site that is required for the formation of polymers (3). The third mammalian isoform, SUMO-1, cannot form chains alone, but can modify a SUMO-2/3 chain, perhaps acting as a “chain terminator” (4).

From a functional point of view, SUMO chains are of considerable and increasing biological interest, although many aspects remain unexplored (5). The best-characterized role of SUMO polymers is the recruitment of the ubiquitin E3 ligase RNF4 (RING finger protein 4) to polysumoylated substrates, which leads to their degradation by the proteasome system (6).

Much research has focused on the identification of the cellular targets of sumoylation. However, one of the main technical challenges to SUMO substrate discovery is the low abundance of SUMO substrates and the substoichiometric level of modification (7). Consequently, the efficient investigation of sumoylated proteins with in vivo approaches has required the development of protocols for specific and stringent purification of tagged SUMOs (8, 9). These approaches have been employed to efficiently detect protein sumoylation by western blotting and, more recently, have been successfully adapted for large-scale identification of targets of sumoylation by quantitative MS-based proteomics (9). Quantitative proteomics approaches not only allow straightforward discrimination between true SUMO substrates and purification contaminants (10), but also provide systems-wide profiling of sumoylation dynamics in response to different cellular stimuli (11). However, despite their success, enrichment techniques at the protein level, rather than the peptide level, have so far not proved effective in the identification of sumoylation sites. This is largely due to the small proportion of the total peptides that are derived from the SUMO-substrate branched conjugate, even after stringent purification of SUMO target proteins (11) (see Note 1). Among the few branched peptides whose abundance is sufficient for ready MS detection are those derived from SUMO polymers (4, 11, 12).
When SUMO-2 conjugates are purified, branched peptides derived from SUMO-2/3 can be easily and reproducibly identified, while the detection of the SUMO-2/3 peptide modified by SUMO-1 requires a targeted approach (4) or a two-step purification protocol (11). A further technical difficulty in the MS-based identification of sumoylation sites is the large signature tag of SUMO that remains after tryptic digestion (19 and 32 amino acids for SUMO-1 and SUMO-2/3, respectively). The resulting large branched peptides, as well as other cross-linked peptides, produce complex fragmentation spectra that cannot be interpreted by standard database search engines. This technical challenge can be overcome by specialized software (13), mutational strategies (14, 15), or the construction of a database of linearized branched peptides. The concept of linearization of SUMO branched peptides for a straightforward interpretation of fragmentation spectra (4), combined with the modified-database strategy for the detection of cross-linked peptides in studies of protein complexes (16), has been employed to identify SUMO acceptor sites through a Web-based tool (17).

However, as described in this protocol, if the goal is the detection and quantification of SUMO polymerization, this approach can be considerably simplified. This method does not require any techniques or software other than the ones commonly used for the qualitative or quantitative proteomics investigation of sumoylated proteins. In combination with high-resolution, high-accuracy MS, this strategy has been employed to detect and quantify SUMO chains in SUMO conjugates purified from cells (11, 12) and in vitro sumoylation assays (18).

2. Materials

2.1. Peptide Identification and Quantitation

1. Mascot search engine (Matrix Science, London, UK).
2. Quantitative proteomics software package MaxQuant (19, 20) (see Note 2), available through http://www.biochem.mpg.de/en/rd/maxquant/. For the installation of MaxQuant and its hardware and software requirements, see ref. 20.
3. Human protein sequence database. The latest release of the human International Protein Index (IPI) database (21) can be freely downloaded from ftp://ftp.ebi.ac.uk/pub/databases/IPI/current/. The workflow described here uses the IPI database, although the protocol can be adapted to alternative databases, such as UniProt (22) or Ensembl (23).
4. A word processing program, such as UltraEdit (http://www.ultraedit.com/), Microsoft NotePad, or Microsoft WordPad.

3. Methods

The protocol described here is aimed at detecting SUMO chains by MS with a user-friendly strategy that can be easily implemented in any standard database search engine and quantitative data processing software. The same approach can also be applied to quantify SUMO polymerization by SILAC-based proteomics, and the principle can be adapted to detect other known sumoylation sites from target substrates. It is based on the idea that a branched peptide has the same mass as a virtual peptide produced by fusing the C terminus of the modifying branch to the N terminus of the substrate peptide (Fig. 1a). The sequence of this linearized version of the branched peptide can then be added to the protein sequences of an IPI database and analyzed by a standard search engine (24). The method presented here relies on samples derived from SUMO purification systems, whose step-by-step procedure is described in refs. 8, 9, and on metabolic labeling with SILAC (25).
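The mass equivalence that underlies the linearization can be checked numerically. The Python sketch below is illustrative only: the function names are hypothetical, the residue masses are standard monoisotopic values, and the two example sequences (the 32-residue tryptic C-terminal tag of SUMO-2/3 and the SUMO-2 tryptic peptide carrying K11) are assumptions that should be verified against the UniProt entries before use.

```python
# Monoisotopic residue masses (Da) of the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565  # monoisotopic mass of H2O


def linear_peptide_mass(sequence):
    """Neutral monoisotopic mass of an unmodified linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def branched_peptide_mass(modifier, substrate):
    """Mass of two peptides joined by one isopeptide bond.

    Forming the bond between the modifier C terminus and a substrate
    lysine side chain releases one molecule of water.
    """
    return linear_peptide_mass(modifier) + linear_peptide_mass(substrate) - WATER


def linearize(modifier, substrate):
    """Virtual peptide: modifier C terminus fused to substrate N terminus."""
    return modifier + substrate


# Assumed example sequences; check them against UniProt before use.
SUMO23_TAG = "FDGQPINETDTPAQLEMEDEDTIDVFQQQTGG"
SUMO2_K11_PEPTIDE = "EGVKTENNDHINLK"

virtual = linearize(SUMO23_TAG, SUMO2_K11_PEPTIDE)
print(round(branched_peptide_mass(SUMO23_TAG, SUMO2_K11_PEPTIDE), 4))
print(round(linear_peptide_mass(virtual), 4))
```

The two printed values agree because the branched and the linearized peptide have the same elemental composition; only the position of the linking bond differs.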

Fig. 1. Identification of SUMO–SUMO branched peptides via a modified database. (a) Linearization of cross-linked peptides. The virtual peptide consists of the C terminus of the modifying peptide joined to the N terminus of the modified substrate peptide. (b) Principle of appending linearized branched peptides to the sequences of SUMO. Optionally, depending on the purification strategy, the sequence of a tag can be added N-terminally to one of the SUMOs. (c) Modified entries of SUMO-2 and SUMO-3 in a protein sequence database. The letter J, which separates the sequence of SUMO from the sequences of linearized branched peptides, is in bold.

A comprehensive protocol for the identification of SUMO target proteins by SILAC-based proteomics, which describes in detail SILAC labeling, purification of SUMO targets, and MS analysis, can be found in ref. 10. The protocol described here therefore focuses on the construction of the database and the subsequent data analysis.

3.1. Construction of the Database

1. Obtain the sequences of SUMO-2 and SUMO-3 from UniProt (http://www.uniprot.org/uniprot/P61956 for SUMO-2 and http://www.uniprot.org/uniprot/P55854 for SUMO-3).
2. Create a file with the extension “.fasta” (see Note 3).
3. Open the file with a word processing program.
4. Copy the two sequences into the newly created file.
5. Create a header for each sequence in accordance with the parsing rule of the chosen human database (see Subheading 2 and Note 4).
6. Linearize all the SUMO branched peptides derived from SUMO polymers (see Note 5) as shown in Fig. 1b.
7. Append the SUMO-2/3-SUMO-2 and SUMO-1-SUMO-2 (see Note 6) linearized branched peptides to the C terminus of SUMO-2, and the SUMO-2/3-SUMO-3 and SUMO-1-SUMO-3 linearized branched peptides to the C terminus of SUMO-3, using J as the separator (Fig. 1b, c) (see Notes 7 and 8).
8. Open the human IPI database, add the modified SUMO-2 and SUMO-3 entries to it, and save it as a new file with a short, descriptive name.
9. Use the program Sequence Reverser, included in the MaxQuant suite of software, to reverse each entry and add contaminants.
10. Set up the resulting concatenated target-decoy database in Mascot like any other IPI database, as described in the Mascot manual.
11. Keep a copy of exactly the same database on the computer where MaxQuant is running.
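Steps 5 to 7 can also be scripted rather than performed by hand. The following Python sketch shows one possible way to assemble a modified entry; it is not part of the published workflow. The function name is hypothetical, the protein sequence is a short placeholder that must be replaced by the sequence obtained from UniProt, and the tag and peptide sequences are assumptions to be checked against UniProt. The header is the one suggested in Note 4.

```python
SEPARATOR = "J"  # code letter used to delimit appended peptides (see Note 7)


def build_modified_entry(header, protein_sequence, linearized_peptides, width=60):
    """Return a FASTA entry with linearized peptides appended after the protein.

    The protein and every appended peptide are delimited by the letter J,
    and the sequence is wrapped to `width` characters per line.
    """
    if SEPARATOR in protein_sequence:
        raise ValueError("protein sequence already contains the separator letter")
    body = SEPARATOR.join([protein_sequence] + list(linearized_peptides))
    lines = [body[i:i + width] for i in range(0, len(body), width)]
    return header + "\n" + "\n".join(lines) + "\n"


# Placeholder only: paste the real SUMO-2 sequence from UniProt here.
PLACEHOLDER_PROTEIN = "MATESTPEPTIDERGG"

# Assumed tryptic remnants and K11 peptide; verify against UniProt.
SUMO23_TAG = "FDGQPINETDTPAQLEMEDEDTIDVFQQQTGG"
SUMO1_TAG = "ELGMEEEDVIEVYQEQTGG"
SUMO2_K11_PEPTIDE = "EGVKTENNDHINLK"

HEADER = (">IPI:SUMO2|addition of K11 branched peptides "
          "modified by SUMO2 and SUMO1|XX|SUMO2.")

linearized = [
    SUMO23_TAG + SUMO2_K11_PEPTIDE,  # SUMO-2/3-SUMO-2
    SUMO1_TAG + SUMO2_K11_PEPTIDE,   # SUMO-1-SUMO-2
]

entry = build_modified_entry(HEADER, PLACEHOLDER_PROTEIN, linearized)
print(entry)
```

The same function can be called a second time with the SUMO-3 header, sequence, and K11 peptide to produce the second entry.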

3.2. Defining a New Rule for Enzyme Specificity

1. To make use of the code letter J, create a new enzyme definition with the name TrypsinMSIPI or similar in the Mascot configuration.
2. Modify the Mascot enzyme configuration through the Web interface.
3. Define the trypsin cleavage specificity as C-terminal to an arginine or lysine residue.
4. To remove the letter J from the peptide sequences after the in silico digestion, add this letter to the enzyme definition and allow a cut both C- and N-terminally of J (see http://www.matrixscience.com/help/seq_db_setup_MSIPI.html for details).
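The effect of this enzyme rule can be illustrated with a small in silico digestion. The Python sketch below only mimics the behavior described above and is not Mascot's implementation; the function name and the example sequences are assumptions. One missed cleavage is allowed in the example because the modified lysine lies inside the virtual peptide.

```python
import re


def digest(sequence, missed_cleavages=1, separator="J"):
    """Cut C-terminally of K or R, and on both sides of the separator J.

    Cutting on both sides of J and discarding it is equivalent to
    splitting the sequence on J, so no peptide can span a J boundary.
    """
    peptides = []
    for segment in sequence.split(separator):
        # Fully cleaved pieces: each ends in K or R, except possibly the last.
        pieces = re.findall(r"[^KR]*[KR]|[^KR]+$", segment)
        for start in range(len(pieces)):
            last = min(start + missed_cleavages, len(pieces) - 1)
            for end in range(start, last + 1):
                peptides.append("".join(pieces[start:end + 1]))
    return peptides


# Assumed sequences; verify against UniProt.
SUMO23_TAG = "FDGQPINETDTPAQLEMEDEDTIDVFQQQTGG"
LINEARIZED = SUMO23_TAG + "EGVKTENNDHINLK"

# Toy entry: a short stretch preceding the C-terminal peptide of SUMO,
# then J, then one linearized branched peptide.
toy_entry = "IRFR" + SUMO23_TAG + "J" + LINEARIZED

for peptide in digest(toy_entry):
    print(peptide)
```

Both the unmodified C-terminal peptide of SUMO and the complete virtual peptide appear in the output as separate peptides, and no peptide contains the letter J.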

3.3. Quantitative Data Processing with MaxQuant

1. Run the Xcalibur raw files with MaxQuant and Mascot (see ref. 20 for a detailed description of the use of the MaxQuant suite of programs).
2. Select the type of SILAC experiment (“Doublets” if the experiment was performed with double SILAC labeling and “Triplets” in the case of a triple SILAC experiment) and then select the labeled amino acids.
3. Use the default parameters in Quant.exe and Identify.exe.
4. In Identify.exe, set the protein false discovery rate (FDR) to 1% (0.01), which is the default value.

3.4. Identification of SUMO Branched Peptides

1. Start the program Viewer.exe, which is located in the MaxQuant folder on the local computer.
2. Load the Xcalibur raw files by choosing File > Load. It is not necessary to load any other file.
3. Go to the “Identifications” tab and choose the sub-tab “Protein Groups”.
4. To find the modified SUMO sequence entries, sort the “Protein IDs” column alphabetically by clicking once on the column header. Note that the ID reported is the string between “>IPI:” and the first vertical bar “|” in the sequence header: “SUMO2” and “SUMO3” (see Fig. 1).
5. To select the peptides associated with the SUMO sequences, right-click the entry and then click “Show Peptides”.
6. To view the peptides, go to the “Identifications” tab. SUMO-branched peptides will be reported if they have been identified. Heavy/light ratios, in the case of double SILAC labeling, and heavy/light and medium/light ratios, if triple labeling was employed, are shown automatically.
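The relationship between the FASTA header and the ID shown in the “Protein IDs” column (step 4) can be expressed as a simple pattern. The Python sketch below is a hypothetical helper that reproduces the rule as stated in step 4; it is not the parsing code used by MaxQuant.

```python
import re


def reported_protein_id(fasta_header):
    """Return the string between '>IPI:' and the first vertical bar."""
    match = re.match(r">IPI:([^|]+)\|", fasta_header)
    if match is None:
        raise ValueError("header does not follow the '>IPI:ID|...' layout")
    return match.group(1)


# Header suggested in Note 4 for the modified SUMO-2 entry.
header = (">IPI:SUMO2|addition of K11 branched peptides "
          "modified by SUMO2 and SUMO1|XX|SUMO2.")
print(reported_protein_id(header))
```

Applying the helper to the headers suggested in Note 4 is a quick way to confirm, before the database is set up, that the two modified entries will be listed under distinct and recognizable IDs.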

4. Notes

1. Two mutational approaches have efficiently overcome this limitation by selectively enriching SUMO-modified peptides rather than proteins (14, 15). Despite their success in identifying a number of sumoylation sites, they are arguably less biologically representative owing to their reliance upon highly mutated forms of SUMO.
2. The strategy presented here is not limited to a specific database or quantitation software and can be implemented in any search engine and software capable of quantifying SILAC data. It is important to note that MaxQuant currently supports only files produced by Thermo LTQ-FT-ICR and LTQ-Orbitrap instruments.
3. FASTA format is a simple text-based format in which each entry consists of a single-line header followed by lines of sequence data. The FASTA header line is distinguished from the sequence data by the greater-than (“>”) symbol and gives a unique accession string or identifier for the sequence. Proteomics databases are provided in FASTA format, and protein sequences are represented in the standard one-letter amino acid code.

4. For a human IPI database, the suggested headers for SUMO-2 and SUMO-3 are: “>IPI:SUMO2|addition of K11 branched peptides modified by SUMO2 and SUMO1|XX|SUMO2.” and “>IPI:SUMO3|addition of K11 branched peptides modified by SUMO2 and SUMO1|XX|SUMO3.”
5. Fragmentation spectra of SUMO branched peptides usually contain multiple high-intensity fragment ions from the long C terminus of SUMO, but only a few, predominantly low-intensity, ions derived from the substrate branch of the cross-linked peptide. Although in general this technical difficulty has hampered the identification of sumoylated peptide sequences, the existence of SUMO polymerization has been established beyond reasonable doubt. First, the identification of the branched peptides derived from SUMO polymers has been achieved by high-resolution MS/MS analysis of in vitro SUMO polymerization, a very simple peptide mixture containing branched peptides derived exclusively from SUMO chains (4, 26). Second, SUMO polymerization sites have been confidently detected in cells by higher energy collision dissociation (HCD) fragmentation of peptides derived from enriched SUMO conjugates (27). To address these concerns, manually evaluate the peptide fragmentation spectra of SUMO branched peptides and compare the assigned spectra with the published ones (4, 26, 27).
6. SUMO-2 and SUMO-3 differ in just three amino acid residues located in the N-terminal arm and share the same C terminus. Therefore, after digestion, the two paralogs leave the same adduct, and it is not possible to determine which of them is modifying a substrate peptide.
7. To prevent the false identification of nonexistent peptides, the code letter J is used as a peptide separator to delimit the branched peptides from each other and from the full-length protein sequence (24). It does not represent any amino acid, and it is theoretically possible to set a mass for J. However, setting the mass of J to a value greater than 0, especially if the value corresponds to the mass of an amino acid, could lead to the matching of spectra to spurious peptide sequences and thus to an increase in the number of false positives. Therefore, it is advised to leave the mass of J at zero, as in the default Mascot configuration.
8. Some database search engines may not support the use of the code letter J. In these cases, peptides can be separated by normal tryptic cleavages. The very C-terminal peptide of SUMO does not have a lysine or arginine as its last residue. Therefore, simply appending a linearized branched peptide to the C terminus of SUMO would prevent a search engine from identifying either of these peptides, as they would be considered one peptide. This problem can be circumvented by inserting the cross-linked peptides between the last two C-terminal tryptic peptides of SUMO. In this way, the last peptide of the original sequence would still be at the very C terminus, and the last arginine or lysine of the protein, which originally separated the last two peptides, would be followed by the linearized branched peptides. By setting the enzyme specificity to trypsin, the in silico digestion would produce the same peptides as in the letter J-based approach.
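The separator-free alternative described in Note 8 can be sketched as follows. This Python example is a minimal illustration under stated assumptions: the function name is hypothetical, and the sequences in the usage example are toy strings rather than real SUMO sequences. It requires every inserted peptide to end in lysine or arginine so that tryptic cleavage separates it from whatever follows.

```python
def insert_before_last_tryptic_peptide(protein, linearized_peptides):
    """Insert peptides after the last K or R of the protein sequence.

    The original C-terminal tryptic peptide stays at the very C terminus,
    and the inserted peptides are released by ordinary tryptic cleavage.
    """
    for peptide in linearized_peptides:
        if peptide[-1] not in "KR":
            raise ValueError("inserted peptide must end in K or R: " + peptide)
    # Position just after the last lysine or arginine of the protein.
    cut = max(protein.rfind("K"), protein.rfind("R")) + 1
    return protein[:cut] + "".join(linearized_peptides) + protein[cut:]


# Toy sequences for illustration only (not real SUMO sequences).
toy_protein = "AAKDDRCCC"
toy_linearized = ["GGGEK", "TTTDR"]

print(insert_before_last_tryptic_peptide(toy_protein, toy_linearized))
```

In the toy result, a tryptic digestion yields the peptides of the original protein (AAK, DDR, CCC) and, in addition, the two inserted peptides, which corresponds to the outcome of the J-based approach.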

Acknowledgments

The authors would like to thank Michael H. Tatham for comments on the manuscript. IM is a Sir Henry Wellcome Postdoctoral Fellow.

References

1. Rodriguez MS, Dargemont C, Hay RT (2001) SUMO-1 conjugation in vivo requires both a consensus modification motif and nuclear targeting. J Biol Chem 276:12654–12659.
2. Pickart CM, Fushman D (2004) Polyubiquitin chains: polymeric protein signals. Curr Opin Chem Biol 8:610–616.
3. Tatham MH, Jaffray E, Vaughan OA et al (2001) Polymeric chains of SUMO-2 and SUMO-3 are conjugated to protein substrates by SAE1/SAE2 and Ubc9. J Biol Chem 276:35368–35374.
4. Matic I, van Hagen M, Schimmel J et al (2008) In vivo identification of human small ubiquitin-like modifier polymerization sites by high accuracy mass spectrometry and an in vitro to in vivo strategy. Mol Cell Proteomics 7:132–144.
5. Ulrich HD (2008) The fast-growing business of SUMO chains. Mol Cell 32:301–305.
6. Tatham MH, Geoffroy MC, Shen L et al (2008) RNF4 is a poly-SUMO-specific E3 ubiquitin ligase required for arsenic-induced PML degradation. Nat Cell Biol 10:538–546.
7. Hay RT (2005) SUMO: a history of modification. Mol Cell 18:1–12.
8. Tatham MH, Rodriguez MS, Xirodimas DP et al (2009) Detection of protein SUMOylation in vivo. Nat Protoc 4:1363–1371.
9. Golebiowski F, Tatham MH, Nakamura A et al (2010) High-stringency tandem affinity purification of proteins conjugated to ubiquitin-like moieties. Nat Protoc 5:873–882.
10. Andersen JS, Matic I, Vertegaal ACO (2009) Identification of SUMO target proteins by quantitative proteomics. Methods Mol Biol 497:19–31.
11. Golebiowski F, Matic I, Tatham MH et al (2009) System-wide changes to SUMO modifications in response to heat shock. Sci Signal 2:ra24.
12. Schimmel J, Larsen KM, Matic I et al (2008) The ubiquitin-proteasome system is a key component of the SUMO-2/3 cycle. Mol Cell Proteomics 7:2107–2122.
13. Pedrioli PG, Raught B, Zhang XD et al (2006) Automated identification of SUMOylation sites using mass spectrometry and SUMmOn pattern recognition software. Nat Methods 3:533–539.
14. Blomster HA, Imanishi SY, Siimes J et al (2010) In vivo identification of sumoylation sites by a signature tag and cysteine-targeted affinity purification. J Biol Chem 285:19324–19329.
15. Matic I, Schimmel J, Hendriks IA et al (2010) Site-specific identification of SUMO-2 targets in cells reveals an inverted SUMOylation motif and a hydrophobic cluster SUMOylation motif. Mol Cell 39:641–652.
16. Maiolica A, Cittaro D, Borsotti D et al (2007) Structural analysis of multiprotein complexes by cross-linking, mass spectrometry, and database searching. Mol Cell Proteomics 6:2200–2211.
17. Hsiao HH, Meulmeester E, Frank BT et al (2009) “ChopNSpice,” a mass spectrometric approach that allows identification of endogenous small ubiquitin-like modifier-conjugated peptides. Mol Cell Proteomics 8:2664–2675.
18. Castillo-Lluva S, Tatham MH, Jones RC et al (2010) SUMOylation of the GTPase Rac1 is required for optimal cell migration. Nat Cell Biol 12:1078–1085.
19. Cox J, Mann M (2008) MaxQuant enables high peptide identification rates, individualized p.p.b.-range mass accuracies and proteome-wide protein quantification. Nat Biotechnol 26:1367–1372.
20. Cox J, Matic I, Hilger M et al (2009) A practical guide to the MaxQuant computational platform for SILAC-based quantitative proteomics. Nat Protoc 4:698–705.
21. Kersey PJ, Duarte J, Williams A et al (2004) The International Protein Index: an integrated database for proteomics experiments. Proteomics 4:1985–1988.
22. UniProt Consortium (2008) The universal protein resource (UniProt). Nucleic Acids Res 36:D190–195.
23. Flicek P, Aken BL, Beal K et al (2008) Ensembl 2008. Nucleic Acids Res 36:D707–714.
24. Schandorff S, Olsen JV, Bunkenborg J et al (2007) A mass spectrometry-friendly database for cSNP identification. Nat Methods 4:465–466.
25. Ong SE, Mann M (2006) A practical recipe for stable isotope labeling by amino acids in cell culture (SILAC). Nat Protoc 1:2650–2660.
26. Cooper HJ, Tatham MH, Jaffray E et al (2005) Fourier transform ion cyclotron resonance mass spectrometry for the analysis of small ubiquitin-like modifier (SUMO) modification: identification of lysines in RanBP2 and SUMO targeted for modification during the E3 autoSUMOylation reaction. Anal Chem 77:6310–6319.
27. Waanders LF, Almeida R, Prosser S et al (2008) A novel chromatographic method allows online reanalysis of the proteome. Mol Cell Proteomics 7:1452–1459.