Application of Proteomics in Cardiovascular Research - IngentaConnect


Current Proteomics, 2010, 7, 108-115

Application of Proteomics in Cardiovascular Research

Dick H.W. Dekkers, Karel Bezstarosti, Diederik Kuster, Adrie J.M. Verhoeven and Dipak K. Das*

COEUR Cardiovascular Research School, Erasmus Medical Centre, Dept. of Biochemistry, PO Box 3000 CA Rotterdam, Netherlands
Cardiovascular Research Center, University of Connecticut School of Medicine, Farmington, CT 06030-1110, USA

Abstract: This review focuses on the current status of proteomic techniques that can be specifically applied to the heart. Proteomics allows us to study alterations in protein expression in diseased hearts and helps us to develop new diagnostic and therapeutic parameters. The high resolving capacity of 2-DE can be used to separate proteins in the first dimension according to their charge (isoelectric point) under denaturing conditions, followed by separation according to their molecular mass by SDS-PAGE. The separated proteins are then visualized at high sensitivity with SYPRO dyes, especially SYPRO Ruby, which is the most appropriate post-electrophoretic stain because of its compatibility with subsequent MS analysis. Once a large protein dataset has been generated, it is organized using bioinformatics. Even though proteomic techniques have undergone substantial improvement, it remains a problem to identify phosphorylated proteins, which may be used for early disease detection. The proteomic analysis discussed in this review can be used for drug discovery, development of therapeutic modalities for cardiovascular diseases and the design of clinical trials. Proteins play more dynamic roles than DNA and RNA, since most biological functions are regulated by protein-protein interactions. Protein-protein interaction mapping is crucial for many degenerative diseases, and proteomics plays an important role in understanding the molecular mechanisms of cellular functions.
Although MS is a powerful and evolving technique, the cost of running a sample needs to be considered. For example, regarding the cost of labeling, iTRAQ runs about $400/sample, and as many as 30 biological samples may be required to reach statistical significance in patient samples. Extensive time is also needed on an MS instrument to run a fractionated sample, on the order of days (times the number of samples). Once large datasets are generated, a bioinformatician is required to align and analyze data from multiple treatment groups. An additional limitation is that proteins and splice variants have to be characterized before they can be identified by search engines. A number of predicted proteins may be identified, with limited commercial resources available to follow up on such targets. Finally, though there have been advances in mass spectrometry equipment, such as Fourier-transform ion cyclotron resonance MS, that provide higher sensitivity and dynamic range, there is a lack of standardization of protocols from sample collection and processing along the pipeline to data analysis. Unlike for genomic data, there is no community standard for database sharing. Although the technique has limitations, proteomics is likely to have great impact on drug discovery and clinical trial design, leading to the development of niche personalized medicine. There is a definite need for early disease detection with appropriate biomarkers, and proteomics is the tool to fulfill this requirement. For example, a routine, specific and sensitive serum proteomic pattern for cardiovascular diseases would be useful to clinicians for the early detection of disease. In this regard, a low-resolution SELDI-TOF proteomic profile could be extremely useful.
Unlike mRNAs, proteins are subject to post-translational modifications such as phosphorylation, glycosylation and cleavage, and genomics alone is therefore likely to miss the correct targets. This is of utmost importance if disease-related proteomics is to become an essential component of a personalized medicine system, which holds great promise for the improvement of disease evaluation and patient care.

Keywords: Proteomics, heart, cardiovascular, protein expression, phosphorylated proteins.

*Address correspondence to this author at the Cardiovascular Research Center, University of Connecticut School of Medicine, Farmington, CT 06030-1110, USA; Tel: (860) 679-3687; Fax: (860) 679-4606; E-mail: [email protected]
1570-1646/10 $55.00+.00 ©2010 Bentham Science Publishers Ltd.

INTRODUCTION

Most degenerative diseases, including myocardial insufficiency, appear to be due to alterations in gene and protein expression. Genomics often gives an incomplete picture, as many genes may not be converted into proteins; proteomics can provide some idea of the protein profiles, which are either upregulated or downregulated, and of how such alterations in the protein profiles affect the disease process. The mammalian heart contains several hundreds of proteins, while the mitochondria contain about fifteen hundred proteins. Proteomics can give a clear picture of the changes in the protein profiles of a pathologic heart. In the post-genomic era, proteomics has become a major focus of research. The human genome contains about 23,000 transcriptional units (genes), and is virtually constant in all cells of an organism under all conditions. The proteome is the total complement of proteins in an organism, and is more

Proteomics in Cardiovascular Research

relevant, as phenotypic differences are the result of variation in the types, abundance and post-translational modifications (PTMs) of expressed proteins. Compared to the genome, the proteome is far more complex due to alternative splicing and extensive post-translational modifications. When taking into account the splice variants and precursor/cleaved forms of each gene/protein, the proteome actually contains about 500,000 proteins, and when one considers the numerous post-translationally modified states of each protein, the number reaches into the millions. Expression of proteins differs greatly between different cells and organs, and depends, among other factors, on differentiation state and on environmental and internal conditions. The proteome is highly dynamic, with transient changes in PTMs occurring on the second-to-minute time scale. Hence, the proteome at best reflects the complement of proteins expressed in a certain cell or tissue under specified conditions. Over the last decade, proteomic research has expanded enormously, not least because of the advent of affordable mass spectrometers (MS) and the development of powerful MS techniques suitable for peptide analysis. Ultimately, most proteomic research is directed towards protein identification by MS [1]. In the early days, protein identification was the major goal, but this has now shifted towards quantification of changes in expression levels, mapping of PTMs, and analysis of protein-protein interactions. One of the major challenges in proteomic research is the detection of low-abundance proteins. Expression levels of proteins in tissues, cells or body fluids may vary by 8 orders of magnitude, and low-abundance proteins are therefore easily missed. This point is often underappreciated by those outside the field. Although the detection limit is in the zeptomole range, this is

Current Proteomics, 2010, Vol. 7, No. 2


only reached in a pure solution and not in complex lysates [2]. The complexity of the analyte must therefore be reduced, either by selection of specific cell types (microdissection, primary cultures), by subcellular fractionation, or by classical biochemical separation of proteins and protein assemblies (column fractionation, immunoprecipitation), before proteins are analyzed by MS. A comprehensive overview and flow chart of proteomic techniques is shown in Fig. (1). In the following, we focus on the application of proteomic techniques to heart tissue, and outline some future developments.

Application of Proteomics in Cardiovascular Diseases

The identification of proteins that undergo modifications is essential for the proper evaluation of cardiovascular diseases, as most proteins undergo rapid changes with changes in the contractile apparatus. Knowledge of the altered proteins allows us to characterize the molecular and cellular mechanisms of cardiovascular diseases. Diagnosis of heart failure at an early stage depends on the detection of early markers, which are likely to be identified by proteomics. The systemic nature of the cardiovascular system further implies the need to evaluate proteins that change during the early stages of heart failure. A large number of proteins are known to undergo rapid changes during cardiac complications. For example, atrial fibrillation, the most common cause of cardiac arrhythmia, causes changes in heat shock proteins (HSP) (Shapiro and Klein, 1968). Myocardial ischemia and reperfusion, cardiomyopathy and heart failure alter many proteins, including protein kinases and tyrosine kinases [3]. A recent report

Fig. (1). Flow chart of proteomic analysis. The primary goals of proteomics are identification of proteins and their post-translational modifications (PTMs), and quantification of differences in their expression under different external conditions. Protein identification including PTMs is ultimately made by MS. Various methods have been developed for quantification of proteins. Most of the quantification methods yield relative values (shaded boxes); absolute quantification for selected peptides is done by spiking protein or peptide mixtures with a known amount of appropriately tagged peptide (dark box). In order to reduce complexity of the analyte, the proteins in the sample are subdivided in different fractions before proteins are digested into peptide fragments. Alternatively, crude protein fractions are first digested into peptide fragments, which are then fractionated by multidimensional LC (MudPIT) or sequential LC (Cofradic).


evaluated tissue-specific remodeling of the mitochondrial proteome in type 1 diabetic Akita mice [4]. Another recent study compared the aged left ventricle proteome to that of left ventricles obtained from other models of disease and heart failure [5]. Unfortunately, clinically useful blood protein markers for the early detection of heart diseases are yet to be identified. At present, there is no alternative but to analyze the proteins carefully dissected from the heart muscle. We have summarized the commonly used methods for proteomic analysis of the heart.

Sample Preparation, Reducing Complexity

Whole hearts are very heterogeneous. Heart tissue can be dissected into different macroscopic regions (left and right ventricles and atria, septum), or into near-endocardium and near-epicardium regions of left ventricle tissue, which are likely to differ in protein expression and in sensitivity to mechanical, electrical and pharmacological stress [6]. In addition, we have successfully dissected arterioles from larger laboratory animals for the analysis of selective protein expression (reference?). We have also performed proteomic analysis of primary cultures of cardiomyocytes [7] and heart-derived fibroblasts obtained from neonatal rat hearts. Kislinger et al. [8, 9] describe a method for subcellular fractionation, followed by proteomic characterization, for freshly excised mouse hearts. Using rigorous statistical filtering and machine-learning methods, the subcellular localization of 3274 of the 4768 proteins identified was determined with high confidence, including 1503 previously uncharacterized factors, while tissue selectivity was evaluated by comparison to previously reported mRNA expression patterns [9]. In our hands, this protocol failed for pig hearts, illustrating that techniques have to be optimized for each species. Homogenization of heart tissue is notoriously difficult, mainly because of the high myofilament content and the strength of the extracellular matrix.
In our hands, pulverization of muscle biopsies with a microdismembrator at liquid nitrogen temperature gives the best results [10]. This has the obvious advantage of enabling the use of stored tissue samples derived from human surgery or from an experimental tissue bank, but subcellular fractionation is less meaningful because of the disruption of intra- and extracellular membranes. From these homogenates, we prepare a cytoplasmic and a myofilament fraction by sonication, using 1% (final) Triton X-100, which precipitates the filaments [10, 11]. Recently, we devised a method to prepare relatively pure nuclear extracts from pulverized frozen heart tissue, using a Dounce homogenizer (unpublished observation). The strongest reduction in sample complexity is probably achieved by capture of selected proteins, for example in pull-downs with specific antibodies. Under mild conditions, this procedure captures not only the protein of interest itself, but also its associated proteins. This method is therefore well suited for analysis of protein-protein interactions and protein interaction networks [12]. The approach is very powerful when using tagged proteins overexpressed in cells or in transgenic animals as bait. As overexpressed proteins may behave differently from endogenous proteins, we prefer capturing target proteins with specific antibodies. In their landmark study Ping et al. [13] used antibodies

Dekkers et al.

against PKC- to identify proteins involved in signal transduction via this protein kinase in mouse hearts. We find that the abundance of blood proteins (or of a few notoriously abundant proteins in the blood) is problematic even after extensive washing of cardiovascular tissue. Sigma, a division of Sigma-Aldrich Corp. (Nasdaq: SIAL), has recently introduced the ProteoPrep® 20 Plasma Immunodepletion Kit. The patent-pending ProteoPrep 20 Kit removes 99% of twenty high-abundance plasma proteins using a proprietary multivalent antibody affinity medium in a convenient spin column format, resulting in removal of 97% of the overall protein mass. The high-density antibody affinity medium enhances the ability to visualize low-abundance proteins in the plasma proteome using standard analysis technology. Additionally, the multivalent antibody resin uses novel recombinant antibodies that allow for a significantly higher density of conjugation and, hence, efficiency of depletion with minimal sample loss or dilution.

Protein Separation Prior to MS Identification

Prior to identification by MS, proteins are separated either by polyacrylamide gel electrophoresis (PAGE) or by liquid chromatography (LC). Relatively simple protein mixtures, such as those obtained after pull-down with specific antibodies, are usually separated on the basis of size only by classical denaturing 1D SDS-PAGE [12]. The proteins in the gel are subsequently stained with Coomassie, and the stained bands are cut out from the wet gel. Alternatively, the entire lane is cut into 50-60 slices. After destaining the gel slices, the reduced cysteine residues are alkylated with iodoacetate or iodoacetamide, and the proteins are digested in-gel with a site-specific protease, usually trypsin, which cleaves C-terminal to lysine and arginine residues. After digestion, the peptides are eluted from the gel for analysis by MS. 2D gel electrophoresis (2DE) using large gels (24x24 cm) can resolve up to 10,000 proteins.
Proteins are separated on the basis of their pI by isoelectric focusing (IEF) in the first dimension, followed by size separation on SDS-PAGE in the second dimension [7, 10, 11]. After denaturation of the proteins with urea and non-ionic or zwitterionic detergents, IEF is performed on gel strips with immobilized pH gradients, either broad-range (e.g., 3-10) or narrow-range (e.g., 5-6). Next, the detergents in the gel strip are exchanged for SDS, the proteins are reduced with DTT, and the cysteine residues are alkylated. The gel strip is then loaded on top of an SDS polyacrylamide slab gel, and the proteins are size-separated by SDS-PAGE. After visualization of the protein spots by Coomassie, silver or Sypro Ruby staining, the spots of interest are picked from the gel. Alternatively, spots of interest may be selected on the basis of differential display (see below), the presence of phosphorylation or glycosylation judged by Pro Q fluorescent staining, or by immunoblotting for the presence of specific proteins or protein modifications. These gel-based protein separation techniques are very laborious, and are not suited for high-throughput applications. Instead, several gel-free protein separation protocols have been developed based on high-performance liquid chromatography (LC). Proteins are separated on the basis of a single property (1D-LC) or two properties (2D-LC), using reverse-phase (hydrophobicity), ion-exchange or chromatofocussing


(charge, pI), or size exclusion (size) columns. The proteins in each fraction are digested with trypsin and the resulting peptides are then used for MS analysis. Gel-free techniques will eventually replace the gel-based separation techniques in high-throughput proteomics. While gel-based methods are relatively simple to use, LC is used for the separation of complex mixtures of proteins. LC separation is based on the inherent characteristics of a protein, including its mass, isoelectric point, hydrophobicity and biospecificity. Separation of proteins is achieved by a combination of columns with different separation modes, such as anion exchange, cation exchange, reversed phase, size exclusion and affinity. LC-MS is rapidly becoming a useful alternative to 2D gel electrophoresis in proteomics.

Differential Protein Profiling by 2DE

Functional proteomics focuses, among other things, on proteins whose expression or PTM is affected by development, disease, or hormonal or pharmacological intervention, and only differentially displayed proteins are selected for MS identification. This can be performed by 2DE in combination with Coomassie staining. Parallel 2D gels are generated for control and treated samples, and spots whose staining intensity is affected are selected for further processing. Commercial software packages (e.g. PDQuest, Bio-Rad, CA, USA) are available for the alignment of the different 2D gels and for normalization to total staining intensity. This analysis gives, for each spot in the gel, the staining intensity and the fold change in spot intensity due to the treatment. For each sample, at least five gels must be run in parallel to compensate for gel-to-gel variability and to allow for statistical analysis [7, 10, 11, 14]. Differential profiling of proteins by 2DE is greatly improved by difference gel electrophoresis (DIGE), in which the proteins are fluorescently labeled [15].
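The normalization and fold-change step described for parallel 2D gels can be sketched in a few lines of Python. This is only an illustration of the idea, not the algorithm used by PDQuest or any other package: the function names and the dictionary layout (spot label mapped to raw intensity, one dictionary per gel) are assumptions made for this example, and every gel is assumed to contain the same matched spots.

```python
from statistics import mean

def normalize_to_total(gel):
    """Express each spot as a fraction of the gel's total staining intensity."""
    total = sum(gel.values())
    return {spot: value / total for spot, value in gel.items()}

def fold_changes(control_gels, treated_gels):
    """Return treated/control ratio of the mean normalized intensity per spot.

    Each argument is a list of replicate gels; each gel is a dict that maps
    a (hypothetical) spot label to its raw staining intensity.
    """
    control = [normalize_to_total(g) for g in control_gels]
    treated = [normalize_to_total(g) for g in treated_gels]
    result = {}
    for spot in control[0]:
        control_mean = mean(g[spot] for g in control)
        treated_mean = mean(g[spot] for g in treated)
        result[spot] = treated_mean / control_mean
    return result
```

With replicate gels (the text recommends at least five per sample), the per-spot normalized values could additionally be passed to a statistical test; that step is left out of the sketch.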
The proteins in the treated and control samples are covalently modified with either a Cy3 or a Cy5 fluorophore; the two samples are mixed and then separated by conventional 2DE [16]. Differences in spot intensity are detected by fluorescence scanning. Mixing the test samples with a Cy2-labelled sample pool before 2DE further reduces the effect of gel-to-gel variation. Matching of parallel gels and quantification of spot intensities on multiple gels is performed semi-automatically with special software (e.g., DeCyder from GE Healthcare). The different CyDyes are matched for charge and mass, thus ensuring co-migration of proteins labelled with either fluorophore in 2DE. In the “minimal labelling” protocol, on average one lysine residue is labelled per protein, thereby minimally affecting its solubility. With “saturation labelling”, virtually all cysteines are labelled, thereby improving the detection of (cysteine-containing) proteins. The presence of one or more fluorophores on a protein may affect its electrophoretic mobility in 2DE, especially for smaller proteins. When spots from DIGE have to be picked for downstream MS analysis, the proteins can be mixed with an excess of unlabelled sample pool prior to 2DE, followed by post-staining with Sypro Ruby for exact spot excision [16]. Many spots on 2DE gels contain more than one protein, which makes it difficult to assign the differential display to a single protein. In addition, spots identified on 2DE gels


mostly correspond to relatively abundant proteins. The 2DE approach therefore appears to be less suited for global proteomics. However, this also holds for other global proteome techniques. Eventually, these methods have to be integrated with techniques targeted at the analysis of specific proteins or protein classes.

Mass Spectrometry of Peptide Mixtures

Different instrumental setups are used for the mass spectrometric analysis of tryptic digests. MALDI-Tof MS, nanoLC-ESI-QTOF MS/MS and nanoLC-ESI Orbitrap MS/MS generate similar data but differ in the principle of data collection [17]. MS analysis of peptide mixtures occurs in two stages (tandem MS or MS/MS). First, the mass of the individual peptides in the mixture is determined (MS1); second, peptides are selected on-line for fragmentation and mass analysis of the fragments in a second, coupled MS (MS2), thereby generating amino acid sequence information. Peptide and fragment masses (actually mass/charge or m/z ratios) are determined either by time-of-flight (Tof) or by oscillation frequency in an Orbitrap. In order to gain mobility in an electromagnetic field, the peptides have to be brought into the gas phase in an ionized form without fragmentation. This is achieved either by matrix-assisted laser desorption/ionization (MALDI) or by electrospray ionization (ESI). For MALDI, the peptide mixture is co-crystallized with the matrix alpha-cyano-4-hydroxycinnamic acid on an anchor plate. After insertion of the dried plate into the MS, the co-crystals are irradiated with a laser pulse. This results mainly in the release of singly charged positive peptide ions. When run in positive mode, these ions are attracted to the cathode and subsequently guided into the m/z analyzer. With ESI, the peptides are first concentrated on a reverse-phase (RP) column equipped with a nanospray source (nanoLC).
The peptides are eluted with acetic or formic acid in acetonitrile, and directly sprayed as an aerosol into the ESI source. The peptides are protonated by the acid into mostly multiply charged ions. When the solvent has evaporated, the bare ions in the gas phase are attracted to the cathode and introduced into the m/z analyzer. Irrespective of the mode of ionization, a spectrum is collected over time (continuous mode), thus generating a catalogue of m/z values for the peptides in the mixture (peptide mass fingerprint, PMF). In MS/MS, peptides are selected on the basis of their m/z value for sequence analysis by fragmentation (MS2). From each MS1 spectrum, the three most intense ions are selected. This is done by a quadrupole operating in data-dependent mode, which automatically switches between MS1 and MS2 acquisition. The selected peptides are fragmented by collision with gas molecules (collision-induced dissociation, CID). This results in breakage mainly of the peptide bonds, thus generating a series of N- and C-terminal fragments (b- and y-ions, respectively). These fragments are then guided into the m/z analyzer. The mass difference between consecutive b-ions, and between consecutive y-ions, corresponds to the mass of one amino acid residue, and the amino acid sequence can thus be deduced from the spectrum. A sequence of 4-5 consecutive amino acids, in combination with the peptide mass, is usually sufficient to identify a peptide definitively as belonging to a particular protein in the database. Modified amino acid residues can also be identified from this MS2 spectrum. The quadrupole can also be triggered by an additional property in the MS2 spectrum to select a peptide from the original peptide ion mixture. For example, during CID, carboxylated and phosphorylated peptides tend to fragment by losing a CO2 or phosphoric acid group, which is equivalent to a neutral loss of 44 and 98 Da, respectively (for ions with charge z = 1). MS3 acquisition can be triggered when this neutral loss occurs between the ion in the MS1 and MS2 spectra. In the MS3 spectrum, sufficient data are then collected to determine unequivocally the site of the PTM [14]. The three instrumental setups differ in sensitivity, accuracy and throughput time. This is reflected in the different peptide mass tolerance settings used while searching the protein database for matches with the MS1 data (see below). Since multiply charged ions are more readily fragmented by CID, ESI is preferred for tandem MS/MS applications. ESI is ideally suited for analyzing protein mixtures separated by LC methods, whereas MALDI is usually applied after gel-based protein separation.

Data Analysis

Powerful software for on-line or in-house analysis is available from Matrix Science, Boston, MA, USA (www.matrixscience.com). While there is no universally accepted protocol for data analysis, Sequest and Mascot are used by 80% of the community. Other freeware analysis programs such as XHunter are also available. However, the algorithms of each program are different and can produce different hit lists. Validation via more specific biochemical methods such as ELISA should be performed to determine false positive rates. Additionally, intra- and inter-sample variability and the number of samples (n) required for significance are important factors to consider when analyzing the dataset.
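The relationship between b- and y-ion ladders and the peptide sequence, described in the mass spectrometry section above, can be made concrete with a minimal Python sketch. It assumes singly charged fragments, unmodified residues and standard monoisotopic residue masses; the function names are made up for this illustration and do not correspond to any instrument or vendor software.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # H2O, added to C-terminal (y) fragments
PROTON = 1.007276   # charge carrier for z = 1

def b_ions(peptide):
    """m/z of singly charged b1..b(n-1) ions (N-terminal fragments)."""
    values, running = [], 0.0
    for residue in peptide[:-1]:
        running += MONO[residue]
        values.append(running + PROTON)
    return values

def y_ions(peptide):
    """m/z of singly charged y1..y(n-1) ions (C-terminal fragments)."""
    values, running = [], WATER
    for residue in reversed(peptide[1:]):
        running += MONO[residue]
        values.append(running + PROTON)
    return values

def neutral_loss_mz_shift(loss_da, charge):
    """Observed m/z shift for a neutral loss from an ion of the given charge."""
    return loss_da / charge
```

In this sketch the spacing between consecutive entries of `b_ions` (or of `y_ions`) equals one residue mass, which is what allows the sequence to be read from the spectrum. The nominal 98 Da loss of phosphoric acid quoted in the text appears as a 98 m/z shift for z = 1 and as a 49 m/z shift for a doubly charged ion.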
The Mascot program automatically creates peak lists from the raw MS1 and MS2 data files, and these peak values are searched against a non-redundant NCBI database consisting of the predicted masses of the in silico digested proteins encoded by the genome of the species under investigation. In silico digestion is done with the same enzyme as used for the experimental data, usually trypsin, allowing for one or two missed cleavages. Peptide mass tolerance is typically set at 100 ppm for MALDI-Tof, 2 Da for Q-Tof, and 10 ppm for the Orbitrap. For analysis of MS2 data, the fragment ion tolerance is set at 0.8 Da. At least 300 different amino acid modifications can be found in proteins and, in principle, each of these potential PTMs can be selected as a fixed or variable modification in the in silico generated peptide mass database. However, this would exponentially increase the required computing time, and substantially reduce the number of statistically significant hits. As the proteins were reduced and alkylated before separation, carbamidomethylated cysteine is set as a fixed PTM. Oxidized methionine and, depending on the particular research subject, one or two other PTMs (e.g., O-phosphorylation of serine, threonine and tyrosine) are set as variable PTMs.
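The two search parameters mentioned above, missed cleavages and peptide mass tolerance, can be illustrated with a short Python sketch. It is a simplified model and not how Mascot or Sequest are implemented: the cleavage rule is the plain "after every lysine or arginine" rule stated in the text (real search engines usually add exceptions, for example before proline), and both function names are hypothetical.

```python
def tryptic_peptides(sequence, missed_cleavages=1):
    """In silico digest: cut after every K or R, allowing missed cleavages.

    Simplified rule for illustration; no exception for K/R followed by P.
    """
    # Split the sequence into fully cleaved pieces.
    pieces, start = [], 0
    for index, residue in enumerate(sequence):
        if residue in "KR":
            pieces.append(sequence[start:index + 1])
            start = index + 1
    if start < len(sequence):
        pieces.append(sequence[start:])

    # Join up to `missed_cleavages` neighbouring pieces to model missed cuts.
    peptides = []
    for i in range(len(pieces)):
        last = min(i + missed_cleavages, len(pieces) - 1)
        for j in range(i, last + 1):
            peptides.append("".join(pieces[i:j + 1]))
    return peptides

def within_ppm(observed_mass, theoretical_mass, tolerance_ppm):
    """True if the observed mass lies within the given ppm tolerance."""
    error_ppm = abs(observed_mass - theoretical_mass) / theoretical_mass * 1e6
    return error_ppm <= tolerance_ppm
```

For a peptide of mass 1000 Da, a 10 ppm tolerance corresponds to a window of plus or minus 0.01 Da, whereas 100 ppm corresponds to plus or minus 0.1 Da, which shows why the tighter tolerance reduces the number of candidate peptides per observed mass.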


The Mascot software gives either a Mowse or a Mascot score for each match, which is a measure of the significance of the match. The chance that a match is true depends on the length of the precursor protein, and on the number and length of the fitting peptides observed in the MS spectrum. The Mowse value is a relative score assigned to each match taking these variables into account, and the cut-off is set at the Mowse score where the probability that the match is random is p=0.05. However, the larger the number of potential proteins in the database, and the more variable PTMs are included, the larger the number of false positive hits. In the more recent literature, the Mowse score has therefore been replaced by the probability-based Mascot score, -10·log10(P), in which P is the absolute probability that the observed match is a random event. The Mascot score cut-off value for a positive protein match is usually set at 40.

Quantitative MS

Several techniques have been developed for direct comparison of protein expression levels between two different samples using mass spectrometry. These methods invariably depend on the differential incorporation of an MS-sensitive tag into the proteins of one of the two biological samples [18, 19]. When studying cells in culture, treated or untreated cells can be grown to equilibrium with tagged amino acids, such as 13C- or deuterium (2H)-labelled amino acids. This results in the incorporation of these “heavy” amino acids into proteins during their synthesis. After cell lysis, the “heavy” and “light” homogenates are mixed and then subjected to MS/MS analysis. The MS spectra show double peaks corresponding to light and heavy amino acid residues, and the ratio between both signals is a measure of the effect of treatment on this particular protein. This technique, called SILAC (stable isotope labeling with amino acids in cell culture), is very powerful, but generally not applicable to studies on whole animals or diseased tissues.
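The probability-based Mascot score defined in the data analysis section, -10·log10(P), is easy to convert back and forth. The short Python sketch below only restates that formula; the function names are chosen for this example and are not part of the Mascot software.

```python
import math

def mascot_score(probability):
    """Score = -10 * log10(P), with P the probability of a random match."""
    return -10.0 * math.log10(probability)

def probability_from_score(score):
    """Invert the score: P = 10 ** (-score / 10)."""
    return 10.0 ** (-score / 10.0)
```

Under this definition, the commonly used cut-off of 40 corresponds to P = 0.0001, and every additional 10 score units lower the probability of a random match by a further factor of ten.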
Differential protein profiling by MS can also be performed by globally tagging proteins after tissue homogenization, similar to 2D-DIGE. In cICAT (cleavable isotope-coded affinity tags), cysteine residues are modified with an alkylating reagent containing either 12C or nine 13C atoms, and a biotin affinity group for enrichment of ICAT-labeled peptides on immobilized avidin. After cleavage of the biotin group, the ICAT-labelled peptides can be eluted and analyzed by MS/MS. The differently labeled peptides show up in MS spectra nine Da apart. In iTRAQ (isobaric tags for relative and absolute quantitation), free amines at the N-terminus and on lysine residues are tagged with one of four signature molecules. In MS1 these tags are indistinguishable, but upon collision-induced fragmentation they each produce a specific telltale (reporter) ion with m/z from 114 to 117. The relative peak areas of these four ions are a measure of their relative contribution in mixed samples. iTRAQ can thus be used for comparing 3-4 different protein samples in one MS/MS analysis. The iTRAQ method appears to be more sensitive than the cICAT method, whereas the latter is comparable with 2D-DIGE [20]. These post-isolation protein-labeling techniques are widely applicable, but are biased towards large and abundant proteins.
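How the four iTRAQ reporter ions translate into relative quantities can be shown with a small Python sketch. It assumes that peak areas for the m/z 114 to 117 reporter ions have already been extracted from an MS2 spectrum; the dictionary layout, the function names and the choice of channel 114 as reference are assumptions for this illustration, and no isotope-impurity correction is applied.

```python
REPORTER_CHANNELS = (114, 115, 116, 117)  # reporter ion m/z values from the text

def reporter_fractions(peak_areas):
    """Relative contribution of each labelled sample to one peptide signal."""
    total = sum(peak_areas[channel] for channel in REPORTER_CHANNELS)
    return {channel: peak_areas[channel] / total for channel in REPORTER_CHANNELS}

def ratios_to_reference(peak_areas, reference=114):
    """Peak-area ratio of every channel relative to a chosen reference channel."""
    reference_area = peak_areas[reference]
    return {channel: peak_areas[channel] / reference_area
            for channel in REPORTER_CHANNELS}
```

For example, peak areas of 100, 200, 50 and 50 for channels 114 to 117 give ratios of 1, 2, 0.5 and 0.5 relative to channel 114. In practice such peptide-level ratios would be combined over all peptides of a protein, a step not shown here.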


Differential labeling of peptides in two different protein pools can also be achieved after digestion with trypsin. In the presence of H₂¹⁸O, trypsin mediates ¹⁶O-to-¹⁸O exchange of the two C-terminal carboxyl oxygen atoms, resulting in a 4 Da mass difference with the corresponding ¹⁶O peptides. The advantage of this ¹⁶O-to-¹⁸O exchange over amino acid tagging methods is that it labels all peptides in the tryptic digest (except for the protein’s most C-terminal fragment) and uses simple reagents. This technique will undoubtedly gain popularity in future quantitative proteomics. Absolute quantification of specific proteins or peptides, for example for the determination of the stoichiometry in protein complexes, can be done by spiking protein samples with a known amount of appropriately tagged peptides (AQUA or iTRAQ) before MS/MS analysis [1, 12]. Absolute quantification by LC-MS depends on the relationship between MS signal response and protein concentration. With the use of an internal standard, this relationship is used to calculate the universal signal response factor, which generally remains the same for the proteins. Label-free methods used in conjunction with LC-MS are well suited for the absolute quantification of proteins. Label-free techniques for quantification have been described by Steve Carr and others (Jacob et al, 2006). One strategy is to count the spectra identified for each sample by the search engine. This technique requires a statistical comparison of multiple, repeated MS2 runs of each sample. CPAS makes handling the data from multiple runs straightforward. Given the costs of labeling, this should at least be considered as a first-pass method.

Analysis of PTMs and PTM Sites

Proteomics has a prime task in the analysis of PTMs and PTM sites, for example in signal transduction or oxidative stress, that cannot be taken over by genomics or transcriptomics.
As already mentioned, more than 300 different PTMs have been described in proteins, and it is not feasible to look for all of them simultaneously. Instead, only a few PTMs can be examined in the proteome at a time. Phosphorylated or glycosylated proteins can be selected for MS analysis after their localization on 2D gel blots by fluorescent Pro-Q staining. Alternatively, phosphorylated or glycosylated peptides can be enriched from complex samples by capture on metal oxide or lectin columns, respectively [21]. Phosphorylation sites can be mapped indirectly in MS/MS by the neutral loss of phosphoric acid upon CID, as described above, or by comparing spectra of peptide mixtures before and after alkaline phosphatase treatment [14]. Direct MS/MS-based mapping of phosphorylation and other labile PTMs becomes feasible when ion fragmentation by CID is replaced by electron transfer dissociation, which preserves these PTMs better and yields much more complete peptide sequences [17]. Some PTMs are derivatized into stable tags that can be used for their selective detection by MS. Protein carbonyl groups formed during oxidative stress are chemically derivatized into hydrazones, which enables their detection on Western blots or their enrichment on capture columns using specific antibodies [18]. With clever chemistry, the different redox states of cysteine residues in proteins can be labeled selectively with different alkylating tags [18, 19]. These are just a few examples of the approaches followed in recent literature for analysis of specific PTMs. Further tagging and capturing methods have to be devised to expand the current list of PTMs that can be globally assessed. Determining whether PTMs are present simultaneously on distant peptide fragments of a single protein, such as phosphorylation of Ser15 and Ser86 in Hsp27 [10], will be the next challenge.

Current Proteomics, 2010, Vol. 7, No. 2

Shotgun Proteomics

Recently, high-throughput proteomic studies have turned their focus to the analysis of complex peptide mixtures by HPLC-MS. Protein samples, either after pre-fractionation or as such, are digested with trypsin, thereby generating a complex mixture of peptides. These peptide mixtures are then extensively fractionated by multi-step HPLC on the basis of charge and hydrophobicity, before ionization and injection into a tandem mass spectrometer. This multidimensional protein identification technology (MudPIT) has been successfully applied to subcellular fractions of mouse tissues, including the heart [8, 9]. To obtain a maximal number of identified proteins for each fraction, 4-5 repeated MudPIT runs were required. The authors claim that quantitative information can also be derived from the number of spectra recorded for each peptide across different samples. Further studies must be performed to prove the reliability and general applicability of this approach; if it holds up, comparing spectral counts would represent a very simple relative quantification method. MudPIT is aimed at determining the total protein complement of a biological sample, but is also biased towards large and abundant proteins.

The group of Vandekerckhove in Ghent (Belgium) has devised clever technology to separate a protein’s N-terminal peptide from the internal peptides in a tryptic digest [20].
The principle of this method, named COFRADIC for combined fractional diagonal chromatography, is that peptides are selectively derivatized, thereby changing their hydrophobicity and hence their running behavior on RP columns [21]. Before tryptic digestion of biological samples, the proteins’ N-termini are blocked by acetylation. The tryptic digest is run over an RP column, and a number of fractions are collected. The peptides in each fraction are then derivatized at their free N-terminus with TNBS (trinitrobenzenesulfonic acid), and each fraction is rerun on the same RP column. Only peptides with acetylated N-termini are not derivatized with TNBS, and these elute in the same fraction in both runs. This fraction is then selected for tandem MS/MS analysis, thereby eliminating a large number of peptides derived from internal cleavages. Thus, each protein in the biological sample is represented by only one peptide in the peptide mixture, which reduces its complexity and partly alleviates the bias towards large and abundant proteins.

In contrast to other high-throughput systems, the COFRADIC technology is particularly suited for global mapping of PTMs. In principle, each type of PTM that can be chemically or enzymatically derivatized into a peptide with altered hydrophobicity can be separated from bulk peptides lacking this PTM. The resulting enrichment of peptides bearing this PTM greatly facilitates protein identification by tandem MS/MS. COFRADIC technology has already been applied successfully for global identification of O-phosphorylated, N-glycosylated and tyrosine-nitrated proteins [21]. Due to its versatility and flexibility, this technology is likely to be developed further for the global profiling of other PTMs as well.

One of the challenges in protein identification is obtaining confident identifications (Mowse score >40); in complex samples there are often many proteins with lower scores that are nevertheless very interesting biologically. The shotgun approach may be used for routine identification of proteins and, in concert with isotope labeling, allows relative quantification of thousands of peptides in a single experiment [22].

SUMMARY AND CONCLUSION

In this review, we have focused on the current status of proteomic techniques that can be applied to heart tissue, and outlined some future areas for development. Proteomics allows us to study alterations in protein expression in diseased hearts and helps us to develop new diagnostic and therapeutic parameters. The high resolution capacity of 2-DE is widely used to separate proteins in the first dimension according to their charge (isoelectric point) under denaturing conditions, followed by their separation according to their molecular mass by SDS-PAGE. The separated proteins are then visualized at high sensitivity with SYPRO dyes, especially SYPRO Ruby, which is the most appropriate post-electrophoretic stain because of its compatibility with subsequent MS analysis. Once a large protein dataset has been generated, it is organized using bioinformatics. The proteomics approaches are still undergoing modifications depending on the specific tissues, diseases and analyses involved. It remains a problem to identify phosphorylated proteins, which may be used for early disease detection. Proteomics analysis needs to be properly developed for drug discovery, development of therapeutic modalities for cardiovascular diseases and the design of clinical trials.
Proteomics data evaluation and interpretation also need improvement. Several investigators have suggested improved techniques for data evaluation. For example, there is a definite need for better handling of the bulk data generated by LC-MS and for data management solutions [23]. A new database search engine, OLAV, has been proposed and compared against the conventional search engines Sequest and Mascot [24]. Although advances in instrumentation have been made, information on the whole proteome is unlikely to be obtained even in optimized systems. In terms of generating a complete information set, genomics is much farther along than proteomics. Although reliable data to establish a mathematical model do not exist, some correlation data demonstrate a high degree of variation between the data obtained by genomics and proteomics. For example, in a recent study, a direct comparison was made between mRNA expression and protein abundance data for 181 genes in yeast [25]. The two variables showed a high degree of variation for individual data pairs, and investigators have come to different conclusions about the general correlation between them.

Although MS is a powerful and evolving technique, the cost of running a sample needs to be considered. For example, regarding the cost of labeling, iTRAQ runs about $400/sample, and as many as 30 biological samples may be required to reach statistical significance in patient samples. Extensive time is also needed on an MS instrument: running a fractionated sample takes on the order of days (times the number of samples). Once large datasets are generated, a bioinformatician is required to align and analyze data from multiple treatment groups. An additional limitation is that proteins and splice variants must already have been characterized in order to be identified by search engines. A number of predicted proteins may be identified, with limited commercial resources available to follow up on such targets. Finally, though there have been advances in mass spectrometry equipment, such as Fourier-transform ion cyclotron resonance MS, that provide higher sensitivity and dynamic range, there is a lack of standardization of protocols from sample collection and processing along the pipeline to data analysis. Unlike for genomic data, there is no community standard for database sharing.

Although there are limitations to the technique, proteomics is likely to have great impact on drug discovery and clinical trial design, leading to the development of niche personalized medicine. There is a definite need for early disease detection with appropriate biomarkers, and proteomics is the tool to fulfill this requirement. For example, a routine, specific and sensitive serum proteomic pattern for cardiovascular diseases would be useful to clinicians for the early detection of disease. In this regard, a low-resolution SELDI-TOF proteomic profile could be extremely useful. Unlike mRNAs, proteins are subject to post-translational modifications such as phosphorylation, glycosylation and cleavage, and thus genomics is likely to miss the correct targets.
This is of utmost importance if disease-related proteomics is to become an essential component of a personalized medicine system, which has great promise for the improvement of disease evaluation and patient care. Proteins play more dynamic roles than DNA and RNA, since most biological functions are regulated by protein-protein interactions. Protein-protein interaction mapping is crucial for many degenerative diseases, and proteomics plays an important role in understanding the molecular mechanisms of cellular functions.

For the benefit of readers, several useful web servers are listed here: Cell-Ploc at http://chou.med.harvard.edu/bioinf/Cell-Ploc/ to identify the subcellular locations of proteins in various organisms; Euk-Ploc or Euk-mPloc to identify the subcellular locations of eukaryotic proteins; MemType-2L at http://chou.med.harvard.edu/bioinf/MemType/ to identify membrane protein types; Signal-CF at http://chou.med.harvard.edu/bioinf/Signal-CF/ or Signal-3L at http://chou.med.harvard.edu/bioinf/Signal-3L/ to identify the signal peptides of proteins in various organisms; EzyPred at http://chou.med.harvard.edu/bioinf/EzyPred/ to identify enzyme functional class; and ProtIdent at http://www.csbio.sjtu.edu.cn/bioinf/Protease/ to identify protease type. These websites are useful for both basic research and drug development.

REFERENCES

[1] Aebersold, R. and Mann, M. Mass spectrometry-based proteomics. Nature, 2003, 422, 198-207.

[2] Anderson, N.L. and Anderson, N.G. The human plasma proteome: history, character, and diagnostic prospects. Mol. Cell Proteomics, 2002, 1, 845-867.

[3] Ping, P.; Zhang, J.; Pierce, W.M. and Bolli, R. Functional proteomic analysis of protein kinase Cε signaling complexes in the normal heart and during cardioprotection. Circ. Res., 2001, 88, 59-64.

[4] Bugger, H.; Chen, D.; Riehle, C.; Soto, J.; Theobald, H.A.; Hu, X.X.; Ganesan, B.; Weimer, B.C. and Abel, E.D. Tissue-specific remodeling of the mitochondrial proteome in Type 1 diabetic Akita mice. Diabetes, 2009, 58, 1986-1997.

[5] Grant, J.E.; Bradshaw, A.D.; Schwacke, J.H.; Balcu, C.F.; Zile, M.R. and Schey, K.L. Quantification of protein expression changes in the aging left ventricle of Rattus norvegicus. J. Proteome Res., 2009, 8, 4252-4263.

[6] Jeon, H.B.; Choi, E.S.; Yoon, J.H.; Hwang, J.H.; Chang, J.W.; Lee, E.K.; Choi, H.W.; Park, Z.Y. and Yoo, Y.J. A proteomics approach to identify the ubiquitinated proteins in mouse heart. Biochem. Biophys. Res. Commun., 2007, 357, 731-736.

[7] Agnetti, G.; Bezstarosti, K.; Dekkers, D.H.W.; Verhoeven, A.J.M.; Giordano, E.; Guarnieri, C.; Caldarera, C.M.; van Eyk, J.E. and Lamers, J.M.J. Proteomic profiling of endothelin-1-stimulated hypertrophic cardiomyocytes reveals the increase of four different desmin species and αB-crystallin. Biochim. Biophys. Acta, 2008, 1784, 1068-1076.

[8] Kislinger, T.; Gramolini, A.O.; MacLennan, D.H. and Emili, A. Multidimensional protein identification technology (MudPIT): Technical overview of a profiling method optimized for the comprehensive proteomic investigation of normal and diseased heart tissue. J. Am. Soc. Mass Spectrom., 2005, 16, 1207-1220.

[9] Kislinger, T.; Cox, B.; Kannan, A.; Chung, C.; Hu, P.; Ignatchenko, A.; Scott, M.S.; Gramolini, A.O.; Morris, Q.; Hallett, M.T.; Rossant, J.; Hughes, T.R.; Frey, B. and Emili, A. Global survey of organ and organelle protein expression in mouse: combined proteomic and transcriptomic profiling. Cell, 2006, 125, 173-186.

[10] Faber, M.J.; Dalinghaus, M.; Lankhuizen, I.M.; Bezstarosti, K.; Dekkers, D.H.W.; Duncker, D.J.; Helbing, W.A. and Lamers, J.M.J. Proteomic changes in the pressure overloaded right ventricle after 6 weeks in young rats: correlations with the degree of hypertrophy. Proteomics, 2005, 5, 2519-2530.

[11] Bezstarosti, K.; Das, S.; Lamers, J.M.J. and Das, D.K. Differential proteomic profiling to study the mechanism of cardiac pharmacological preconditioning by resveratrol. J. Cell. Mol. Med., 2006, 10, 896-907.

[12] Gingras, A.C.; Gstaiger, M.; Raught, B. and Aebersold, R. Analysis of protein complexes using mass spectrometry. Nat. Rev. Mol. Cell Biol., 2007, 8, 645-654.

[13] Ping, P.; Zhang, J.; Pierce Jr., W.M. and Bolli, R. Functional proteomic analysis of protein kinase Cε signaling complexes in the normal heart and during cardioprotection. Circ. Res., 2001, 88, 59-62.

[14] Faber, M.J.; Dalinghaus, M.; Lankhuizen, I.M.; Bezstarosti, K.; Verhoeven, A.J.M.; Duncker, D.J.; Helbing, W.A. and Lamers, J.M.J. Time dependent changes in cytoplasmic proteins of the right ventricle during prolonged pressure overload. J. Mol. Cell. Cardiol., 2007, 43, 197-209.

[15] Lilley, K. and Friedman, D.B. All about DIGE: quantification technology for differential-display 2D-gel proteomics. Expert Rev. Proteomics, 2004, 1, 401-409.

[16] Dekkers, D.H.W.; Bezstarosti, K.; Gurusamy, N.; Luijk, K.; Verhoeven, A.J.M.; Rijkers, E.J.; Demmers, J.A.; Lamers, J.M.J. and Das, D.K. Identification by a differential proteomic approach of the induced stress and redox proteins by resveratrol in the normal and diabetic rat hearts. J. Cell. Mol. Med., 2008, 12, 1677-1689.

[17] Domon, B. and Aebersold, R. Mass spectrometry and protein analysis. Science, 2006, 312, 212-217.

[18] Heck, A.J.R. and Krijgsveld, J. Mass spectrometry-based quantitative proteomics. Expert Rev. Proteomics, 2004, 1, 317-326.

[19] Eaton, P. Protein thiol oxidation in health and disease: techniques for measuring disulfides and related modifications in complex protein mixtures. Free Radic. Biol. Med., 2006, 40, 1889-1899.

[20] Wu, W.W.; Wang, G.; Baek, S.J. and Shen, R.F. Comparative study of three proteomic quantitative methods, DIGE, cICAT, and iTRAQ, using 2D Gel- or LC-MALDI TOF/TOF. J. Proteome Res., 2006, 5, 651-658.

[21] Witze, E.S.; Old, W.M.P.; Resing, K.A. and Ahn, N.G. Mapping protein post-translational modifications with mass spectrometry. Nat. Methods, 2007, 4, 798-806.

[22] Nesvizhskii, A.I. and Aebersold, R. Interpretation of shotgun data. Mol. Cell Proteomics, 2005, 4, 1419-1440.

[23] Bennett, K.P.; Bergeron, C.; Acar, E.; Klees, R.F.; Vandenberg, S.L.; Yener, B. and Plopper, G.E. Proteomics reveals multiple routes to the osteogenic phenotype in mesenchymal stem cells. BMC Genomics, 2007, 8, 380-388.

[24] Colinge, J.; Masselot, A.; Giron, M.; Dessingy, T. and Magnin, J. OLAV: towards high-throughput tandem mass spectrometry data identification. Proteomics, 2003, 3, 1454-1463.

[25] Futcher, B.; Latter, G.I.; Monardo, P.; McLaughlin, C.S. and Garrels, J.I. A sampling of the yeast proteome. Mol. Cell. Biol., 1999, 19, 7357-7368.

Received: August 31, 2010

Revised: November 23, 2009

Accepted: February 17, 2010