

LCGC VOLUME 19 NUMBER 2 FEBRUARY 2001

www.chromatographyonline.com

Multidimensional Chromatography and the Signature Peptide Approach to Proteomics

Advances in genomics during the past decade have drawn enormous attention to the ability to obtain greater amounts of information in as little time as possible. Although these advances represent great gains, they pale in comparison to the dynamic complexity of the next generation of study: the proteome. Until recently, gel electrophoresis and chemical sequencing have dominated proteomics, but the field of proteomics now has a new group of tools, which includes the identification and monitoring of up- and down-regulation of proteins. Affinity chromatography, mass spectrometry, and bioinformatics allow users to qualitatively and quantitatively identify the thousands of potential signature peptides generated by a proteome tryptic digest.

Fred Regnier, Ahmad Amini, Asish Chakraborty, Ming Geng, Junyan Ji, Larry Riggs, Cathy Sioma, Shihong Wang, and Xiang Zhang
Department of Chemistry, Purdue University, West Lafayette, Indiana 47907
Address correspondence to F. Regnier.

Protein sequencing has interested life scientists for more than half a century. The first great breakthrough in sequencing polypeptides came with Edman’s development of a chemical method that allowed single amino acids to be cleaved sequentially from the amino terminus of polypeptides (1). This process evolved into what came to be known as gas-phase sequencing (2), which permitted the sequencing of a peptide of only 10–20 amino acids in a few hours. However, sequencing a protein takes much longer. Deoxyribonucleic acid (DNA) sequencing, in contrast, is much faster. Today most DNA sequencing is achieved by the Sanger method (3). In this approach, replication of a DNA sequence with DNA polymerase is randomly terminated with dideoxy nucleotides that are labeled with four base-specific fluorescent tags (4). Capillary electrophoresis of the resulting sequence ladders in entangled polymer gels allows size-based separations of components as large as 1000 bases long in 1 h (5,6). New laser fluorescence detection methods permit several hundred capillaries to be monitored simultaneously (7), and now they also enable the building of instruments with megabase sequencing capacity. Using these high-throughput instruments, analysts will

sequence the genomes of many organisms in the next few years. Among these genomes are the roughly 100,000 genes in the human genome, genomes of most domestic animals and plants, and the DNA sequence of many pathogens. The availability of DNA sequence data from so many organisms is transforming the way we think about protein sequencing and expression. First, the gene sequence suggests the proteins that an organism is capable of expressing. This statement does not mean, however, that all the proteins in a genome are expressed at the same time. Also, both pre- and posttranslational processing produce proteins in which sequence segments suggested by the gene are deleted, recombined, or cleaved (8). Moreover, posttranslational modifications can add structural elements ranging from carbohydrates (9,10) and lipids (11) to methyl (12,13), hydroxyl (14), or phosphate (15,16) groups. Thus, establishing whether the gene truly reflects its protein structure is an important element of protein characterization. Genomics has given rise to the concept of a proteome, which is the protein equivalent of the genome (17). Yet the proteome differs from the genome in several important ways, other than the fact that proteins are a translated version of genes. One way is that the


proteome generally is smaller in higher plants and animals than the genome. Although the human genome contains 100,000 or more genes, an estimated 20,000 proteins are expressed in a particular type of cell at any time (18). All proteins need not be expressed in every cell type in an organism at all times. Another difference is that the proteome varies over time. The proteome is dynamic, because it is defined by a combination of the genome, the environment at the moment, and even the cell history. Therefore cells do not have a single, fixed proteome. During differentiation, regulation, and in response to environmental stimuli, temporal variations take place in the amount of different proteins in the cellular milieu. Sometimes these changes occur within minutes. The challenge is to develop analytical methods that recognize the small numbers of proteins in concentration flux in an enormous protein background.

Analytical Chemistry and the Proteome

How do workers deal analytically with problems of the complexity described above? This question has spawned the new field of proteomics, which is roughly defined as the simultaneous examination of large numbers of proteins (14). The initial approach to proteomics has been through two-dimensional (2-D) gel electrophoresis and mass spectrometry (MS). Proteins first are purified by 2-D gel electrophoresis and visualized by staining, and the stained spots excised from the gel, either manually or automatically by a robot when thousands of spots are being examined. Trypsin then is added individually to the thousands of excised pieces of gel, and the resulting peptide fragments are extracted after 24 h of digestion and characterized by MS (19). Proteins are identified subsequently by comparing the mass of peptide fragments with predictions from either DNA or protein databases. The fundamental difference between this approach and the procedures of a decade ago is the use of MS to rapidly identify peptides through DNA databases, instead of by the laborious chemical sequencing process noted above (20). Because analysts must examine thousands of proteins to define a proteome at any moment, throughput is a significant issue. This concern brings up the question of rate limitations in current methodology. Mass spectra of peptides can generally be obtained in micro- to millisecond time frames with modern mass spectrometers. The mass of multiple peptides can be acquired simultaneously in a single spectrum, 20–50 at a


time in the case of matrix-assisted laser desorption ionization (MALDI) MS, or roughly 10 in the case of electrospray ionization MS. Mass spectrometers theoretically can analyze hundreds to thousands of peptides per second. Thus, it is doubtful that MS will be the rate-limiting factor in proteomics in the near future. The same will probably be true of data analysis. Modern computers can search databases much faster than proteins can be separated. Hence, the issue of throughput in proteomics clearly is a separation problem, the heart of which is to provide large numbers of peptides to a mass spectrometer at a high rate in bundles of no more than 10–50 peptides at a time.

Multidimensional Chromatography and Signature Peptides

Tryptic peptides derived from proteins often can be unique in terms of their mass, separation characteristics, amino acid composition, and sequence. The fact that unique peptides are associated with a single protein allows them to be used as qualitative and quantitative signatures for the protein from which they were derived. These peptides are defined as signature peptides (21). It is common that single proteins will have many signature peptides. However, peptides can provide a generic signature for proteins as well, especially when major portions of the amino acid sequence of a series of protein variants are homologous. Glycoprotein variants that differ in degree of glycosylation but not amino acid sequence are an example. Proteins that have been modified by proteolysis are another case. Peptides that are unique to a variety of species of similar structure are defined as generic signature peptides. With proteins in which speciation occurs by posttranslational modification, it is possible that only a single signature peptide is unique to a species. For this reason, analytical techniques that select peptides on the basis of posttranslational modification are extremely valuable in protein speciation studies. Still another case is that of composite peptide signatures. When peptides unique to a particular protein are unavailable, generally the composite properties of two or more peptides will be unique to a specific protein. This composite signature approach has been used widely in the case of 2-D gel electrophoresis in which proteins generally are identified by the mass of multiple tryptic peptides. The signature peptide approach to proteomics differs from the 2-D gel strategy in several significant ways. One is that all proteins in a sample are digested tryptically first

and then separated instead of separating the parent proteins before proteolysis (21). A second major difference is that smaller numbers of peptides will be used in a protein’s identification. Separation properties, amino acid composition, and the sequence of single peptides are more important (22). The advantages of this approach are that peptides are easier to separate than proteins, the slow tryptic digestion step is performed only once, protein solubility is less of an issue, protein variants are easier to analyze, and 2-D chromatography of peptides may hold the key to delivering very large numbers of peptides to mass spectrometers quickly. The disadvantages of this strategy are that initial proteolysis increases the complexity of the sample that must be fractionated 50–100-fold and all the fragment peptides of a protein no longer will be in the same fraction at the end of the separation phase. We will examine issues dealing with the enormous complexity of cellular or blood serum tryptic digests first. Tryptic digests of cells easily could contain a million peptides based on the fact that cells have thousands of proteins and each protein will provide many tryptic fragments. A premise of the signature peptide strategy is that many more peptides are generated during proteolysis than are needed for protein identification. This assumption means that large numbers of peptides potentially could be eliminated, but analysts still would have enough for protein identification. Geng and colleagues (21) used affinity chromatography selection of peptides on the basis of low-abundance amino acid residues to decrease the number of peptides in tryptic digests. Histidine and cysteine are two low-abundance amino acids that are easily selected, even when they are parts of a polypeptide. Using DNA databases, analysts can compute that roughly 96.4% of all proteins expressed by E. 
coli yield at least one histidine-containing tryptic peptide, and 85.7% of all proteins will provide one or more cysteine-containing peptides (see Table I and ftp://ncbi.nlm.nih.gov/genbank/genomes/bacteria). The fraction of cysteine-containing peptides is even higher in humans. Surprisingly, 58.8% of all genes in E. coli code for at least one peptide that contains both histidine and cysteine. Yet, of all the tryptic peptides produced from these proteins, the database predicts that only 17.8% contain a histidine residue, and 9.6% would have a cysteine residue. Applying affinity selection strategies for either histidine- or cysteine-containing peptides would allow a rapid reduction in sample complexity in a single step and retain peptides from almost all potential proteins in the proteome. Moreover, amino acid–specific selection methods would simultaneously provide data about amino acid composition for peptide identification. Selection methods for tryptophan-, methionine-, and cysteine-containing peptides from pure proteins also are known (23,24). These methods are based on the derivatization of specific amino acid residues with an affinity tag that is then selected with a protein receptor such as an antibody or avidin. Another strategy is direct selection based on unique properties of the amino acid. Researchers have used immobilized metal-affinity chromatography (25) to select histidine-containing polypeptides (26). Metal-chelating stationary phases loaded with copper have a high affinity for histidine-containing peptides (Figure 1a) (27). Nonspecific binding of nonhistidine-containing peptides to the immobilized

metal-affinity column can be problematic, however, because of association of the primary amine at the amino terminus (N terminus) of peptides with carboxyl groups in the chelator. This problem is greatly reduced by acetylation (Figure 1b) (28). Even though copper shows high affinity for histidine-containing peptides, some peptides with a single histidine fail to be captured. According to Table I, large numbers of histidine-containing peptides are anticipated in tryptic digests of cell extracts. These samples are far too complex to expect resolution with a single immobilized metal-affinity column. Adding a second separation dimension of different selectivity, such as reversed-phase chromatography, addresses this problem. Direct transfer of analytes between these two columns can be achieved easily. Coupling these columns in series allows peptides eluted from the immobilized metal-affinity column with an imidazole gradient to be concentrated at the beginning of the reversed-phase column. After analyte transfer, the columns should be uncoupled, and the reversed-phase column should be gradient-eluted separately. Repeating this process several times allows multiple fractions from the immobilized metal-affinity column to be separated by the second column. The reversed-phase chromatogram in Figure 2 is from an E. coli digest in which all peptides captured by the immobilized metal-affinity column were transferred to the reversed-phase column as a single fraction (24). Using long columns and a shallow gradient, analysts can obtain peak capacities of 300 or more. Although absorbance at 215 nm suggests that individual peaks are being eluted, the MALDI mass spectra of the fractions almost always show multiple peptides (Figure 3). Generally, MALDI MS shows no more than 20 peptides in any E. coli fraction. However, some components of a mixture might be unnoticeable because of quenching in MALDI MS (29).

Table I: Relationship between E. coli proteins and their tryptic peptides based on an analysis of the genome

E. coli Protein                       Number of Proteins   Number of Tryptic Peptides
Total number                          4289                 135,792
Histidine-containing                  4136                 24,187
Cysteine-containing                   3674                 13,117
Histidine- and cysteine-containing    2524                 4,652

We noted the biological importance of posttranslational modification above. Researchers have a substantial need for analytical methods that differentiate between structural variants created by posttranslational modification. In the work described in this article, it was our objective to develop methods that determined whether a

Figure 1: Immobilized metal-affinity chromatography chromatograms of (a) native and (b) acetylated tryptic peptides from ovalbumin. A 1-mg/mL tryptic digest was loaded onto an immobilized metal-affinity chromatography column with a gradient of 1–20 mM imidazole.

Figure 2: The reversed-phase chromatogram of an E. coli tryptic digest fraction eluted from the immobilized metal-affinity chromatography column. A 2-mg/mL concentration of the digest was loaded onto an immobilized metal-affinity chromatography column and transferred onto a reversed-phase chromatography column with a gradient of 1–20 mM imidazole. The digest was resolved on the reversed-phase column with a gradient of buffers A (1% acetonitrile, 0.1% trifluoroacetic acid) and B (95% acetonitrile, 0.1% trifluoroacetic acid).


modification had occurred at a particular site and not the extent of modification within the added moiety. Phosphorylation and glycosylation are among the more abundant forms of posttranslational modification. Phosphorylation can occur at one or more sites in the same protein. It generally occurs on serine, threonine, or tyrosine, and it is thought to have multiple functions (30). Posewitz and Tempst (31) showed that immobilized metal-affinity columns loaded with gallium have high selectivity for phosphopeptides (Figure 4). This asset is valuable when trying to identify phosphopeptides from a single protein or from a crude extract; for

example, phosphopeptides selected from a wheat extract (Figure 5). This technique will be particularly valuable in studying the regulation of phosphorylation in mixtures and at specific sites within a single protein. Glycoproteins are common in eukaryotes, but the nature of the carbohydrate modification can vary widely between species, cell types within a species, and even cellular compartments (32). One of the simplest cases is that in which a single N-acetylglucosamine residue is coupled with either a serine or threonine residue of a protein at one or more sites. This type of glycosylation has been found almost exclusively in the nucleus and is thought to be

Figure 3: MALDI mass spectra of E. coli fractions taken from the reversed-phase chromatography column.

involved in the regulation of transcription factors (33). The BS-II lectin from Bandeiraea simplicifolia binds to O-glycosylated proteins and peptides (34). As described on a National Institutes of Health web site (ftp://ncbi.nlm.nih.gov/genbank/genomes/bacteria), researchers used affinity columns with a BS-II stationary phase to select N-acetylglucosamine-containing tryptic peptides from nuclear extracts that were then transferred to and separated by reversed-phase chromatography (Figure 6 [21]). The resulting chromatogram is relatively simple for a cellular extract. Data (not shown) from both 2-D gel electrophoresis and affinity chromatography of the native proteins suggest that these peptides arise from fewer than 50 proteins. The negative peaks in the chromatogram are unusual but repeatable. Their origin is undetermined, and they are not observed in blank gradients. Figure 7 shows the MALDI mass spectrum from one column fraction (21). The peptide at mass 2305 is suspected to arise from the protein p53 tumor suppressor based on databases that indicate this protein would produce an N-acetylglucosamine peptide of the observed mass. We anticipate that the simplicity and speed of the signature peptide approach will provide a valuable tool for examining the regulation of transcription factors by glycosylation with N-acetylglucosamine. Blood serum is another source known to be rich in glycoproteins. When serum is trypsin digested, it produces a complex mixture of glycopeptides that can be selected with a concanavalin A affinity chromatography column and further fractionated by reversed-phase chromatography (35). These glycopeptides generally are heterogeneous within the oligosaccharide portion of the molecule. When a protein is glycosylated at a single site, each glycopeptide variant is a signature peptide from a particular protein species. This result causes peaks to be broad and partially resolved in their reversed-phase chromatogram. Furthermore, the degree of heterogeneity can be under the control of environmental and genetic variables. Although oligosaccharide heterogeneity generally presents an analytical problem, Yoshitake and co-workers (24) exploited it to identify known proteins. Major problems arise in identifying unknown glycoproteins of unknown oligosaccharide structure and heterogeneity, particularly when peptides are being identified from DNA databases by mass. Because both the oligopeptide and oligosaccharide


portions of the glycoconjugate are unknown, analysts find it easiest to eliminate the oligosaccharide variable by enzymatic deglycosylation after affinity selection with a lectin (32). The peptide resulting from deglycosylation is an example of a generic signature peptide in that it is common to all the oligosaccharide variants at a particular site in a glycoprotein. Although all information about the nature of the oligosaccharide portion of the conjugate is lost in deglycosylation, the resulting peptides originally were glycosylated and, therefore, must have a sequence that allowed glycosylation. This asset is of great help in searching databases.
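The observation that a deglycosylated peptide must carry a sequence that allowed glycosylation can be turned into a simple database filter. The sketch below is illustrative only: it assumes N-linked glycosylation and the commonly cited Asn-X-Ser/Thr consensus sequon (X being any residue except proline), which the article itself does not spell out, and the candidate peptides and function names are hypothetical.

```python
import re

# Assumed rule: N-linked consensus sequon Asn-X-Ser/Thr, where X is not Pro.
# A zero-width lookahead is used so that overlapping sequons are all found.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(peptide):
    """Return 0-based positions of Asn residues that begin an N-X-S/T sequon."""
    return [match.start() for match in SEQUON.finditer(peptide)]


def filter_candidates(peptides):
    """Keep only peptides that could have carried an N-linked glycan."""
    return [p for p in peptides if sequon_positions(p)]


# Hypothetical candidates of similar mass returned by a database search.
candidates = ["GLNVTSAEK", "AAPLNPSDR", "MNNSTWK", "VLEEAGHK"]
print(filter_candidates(candidates))  # ['GLNVTSAEK', 'MNNSTWK']
```

Applied after lectin selection and deglycosylation, a filter of this kind would discard database candidates that match the observed mass but lack any plausible glycosylation site.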


Identification of Signature Peptides

The basic premise of the signature peptide strategy is that a small number of peptides can be used to identify a protein. The validity of this concept can be examined with the database from the E. coli genome (Table II). The number of cases in which tryptic peptides from different proteins have the same sequence in E. coli is 37.0%, primarily because of small peptides (36). The incidence of sequence homology drops to 4.6% in peptides with molecular weights greater than 500. If this analysis is restricted to histidine-containing peptides, the number drops to 0.6%. In larger peptides, sequence overlap occurs primarily between proteins that are related in some way; that is, the

Figure 4: The reversed-phase chromatogram of phosphopeptides eluted from the immobilized metal-affinity chromatography column. 100 mg of each phosphopeptide was loaded onto the gallium immobilized metal-affinity chromatography in 0.1% acetic acid, washed with 0.1% acetic acid–30% acetonitrile, eluted to a PepMapC18 column (Applied Biosystems, Foster City, California) with 0.075% ammonium hydroxide and resolved with a gradient of buffers A (1% acetonitrile, 0.1% trifluoroacetic acid) and B (80% acetonitrile, 0.1% trifluoroacetic acid). Peaks: 1 = Ac-Ile-Tyr(PO3H2)-Gly-Glu-Phe-NH2, 2 = Ac-Asp-Tyr(PO3H2)-Val-Pro-Met-Leu-NH2.

Figure 5: The reversed-phase chromatogram of a wheat tryptic digest eluted from a gallium-loaded immobilized metal-affinity chromatography column. 300 µg of wheat digest was loaded onto the gallium immobilized metal-affinity chromatography column following the same protocol as in Figure 4.

Figure 6: The reversed-phase chromatogram of BS-II selected tryptic peptides from a nuclear extract (21).


peptide is a generic signature peptide (37). From this analysis we can conclude that peptides with molecular weights greater than 500 are more likely to be signature peptides. Mass accuracy is another issue. Ion cyclotron resonance mass spectrometers measure mass and have an accuracy of 1 ppm or less (38). By comparison, most MALDI instruments provide mass accuracies of 10–100 ppm (39,40). Mass accuracy has a major effect on analysts’ ability to identify peptides from databases (Table III). The higher the mass accuracy, the more often a signature peptide can be defined by mass alone. Amino acid composition and partial sequence also are extremely useful pieces of information for identifying peptides. Amino acid composition can be derived from the selection method, such as the cases of histidine or cysteine selection described above. This is a great asset in database searches because all peptides that do not contain the selected amino acid can be eliminated as candidates. Furthermore, N-glycosylation and O-glycosylation occur only at particular sites within specific sequences. The same is true of phosphorylation. Lectins that capture specific types of glycosylated peptides provide information about their amino acid composition and sequence (32). Isotopic labeling techniques discussed below yield the numbers of lysine residues in a peptide and differentiate between lysine and arginine at the C terminus of a peptide (24). Carboxypeptidase digestion of peptides in conjunction with MALDI MS is very useful in that it yields the amino acid sequence at the C terminus of peptides (34). Collision-induced dissociation (CID) and postsource decay are other methods to obtain sequence information (41). Figures 8a and 8b illustrate the identification of glycopeptides from human serum by carboxypeptidase cleavage and CID, respectively (21).
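The idea of defining a signature peptide by mass alone, optionally after amino acid selection, can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the procedure used to build Tables I to III: it assumes complete tryptic cleavage after lysine or arginine except before proline, no missed cleavages or modifications, and monoisotopic residue masses; the two toy protein sequences and all function names are hypothetical.

```python
from collections import defaultdict

# Monoisotopic residue masses (Da) for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406,
    "N": 114.04293, "D": 115.02694, "Q": 128.05858, "K": 128.09496,
    "E": 129.04259, "M": 131.04049, "H": 137.05891, "F": 147.06841,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def signature_peptides_by_mass(proteins, ppm, min_mass=500.0, required_residue=None):
    """Map each peptide whose mass window matches exactly one protein to that protein."""
    owners = defaultdict(set)
    for name, sequence in proteins.items():
        for peptide in tryptic_digest(sequence):
            # Affinity selection removes peptides lacking the selected residue.
            if required_residue and required_residue not in peptide:
                continue
            owners[peptide].add(name)
    entries = [(peptide_mass(p), p) for p in owners]
    signatures = {}
    for mass, peptide in entries:
        if mass < min_mass:
            continue
        tolerance = mass * ppm * 1e-6
        matched = set()
        for other_mass, other in entries:  # quadratic scan, fine for a sketch
            if abs(other_mass - mass) <= tolerance:
                matched |= owners[other]
        if len(matched) == 1:
            signatures[peptide] = next(iter(matched))
    return signatures


# Hypothetical toy "proteome": two peptides share a composition and so a mass.
proteins = {"protA": "AHELLGTWRSSCDEFGHIK", "protB": "HAELLGTWRMNPQYVCDK"}
print(signature_peptides_by_mass(proteins, ppm=10))
print(signature_peptides_by_mass(proteins, ppm=10, required_residue="H"))
```

In this toy case the two peptides with identical composition but different sequence cannot be assigned by mass at any accuracy, which is the situation tabulated in Table II, while widening the tolerance removes further peptides from the signature set, the trend shown in Table III.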


Quantification

One of the major problems in proteomics is recognizing the small number of proteins in concentration flux in a background of thousands of other proteins that are static. Creating an internal standard for every constituent in the mixture, even for unknowns, would be one way to simplify


Table II: Number of tryptic peptides having the same amino acid composition but different sequences

Catalog or Mass Region (Da)   All Tryptic Peptides   Histidine-Containing Tryptic Peptides
<500                          19,274                 615
500–1000                      15,779                 858
1000–1500                     757                    19
1500–2000                     12                     3
2000–2500                     2                      0
>2500                         0                      0

Table III: Effect of mass accuracy on the number of signature peptides and proteins identified from signature peptides using mass alone

Mass Accuracy (ppm)   Number of Signature Peptides   Number of Identified Proteins
0.1                   47,302                         4,224
1                     35,592                         4,192
10                    6,545                          2,915
100                   598                            521
1000                  47                             45

Figure 7: MALDI mass spectrum of a fraction containing BS-II selected tryptic peptide from a nuclear extract (21).

Figure 8: MALDI mass spectrum of (a) arginine-containing and (b) lysine-containing peptides selectively deuterated by acetylation. The peptides were reacted with acetoxysuccinimide and deuterium-labeled acetoxysuccinimide, respectively (21). The same aliquot from unlabeled and deuterium-labeled sample was mixed. The fraction was resolved using buffers A (1% acetonitrile, 0.1% trifluoroacetic acid) and B (80% acetonitrile, 0.1% trifluoroacetic acid) and analyzed by MALDI–time-of-flight MS under refractive mode.


relative concentration measurements (24). The problem, however, is in creating internal standards for thousands of unidentified components in a mixture. In the signature peptide approach to proteomics, a common feature of all peptides is that they have a primary amino group at the amino terminus of the peptide. The exception to this rule would be peptides arising from the amino terminus of N-terminally blocked proteins. A strategy for creating internal standards for each tryptic peptide in a mixture of one million or more peptides is to derivatize the peptide at the primary amine by differential isotopic labeling. Acylating a control sample with trideuteroacetate and an experimental sample with acetate, and then mixing the two in equal concentrations, creates this type of internal standard mixture. The isotopically labeled form of the peptide in the control sample is the internal standard for the isotopically labeled form in the experimental sample. The two isotopically different forms of the peptide from the control and experimental samples are not resolved by the liquid chromatography columns and are analyzed together in the mass spectrometer. Peptides with a single primary amine at the N terminus will appear as doublets, separated by 3 amu, in the mass spectrum (Figure 8a). Peptides with a primary amine at the N terminus that also contain lysine with an ε-amino group are acetylated twice and appear as doublets separated by 6 amu (Figure 8b). Each additional lysine adds 3 amu to the difference in mass between the doublet peaks. Changes in concentration between the two samples are determined by isotope ratio measurements (Figure 9). Peptides in these spectra were derived from E. coli grown under different conditions. Because the relative concentrations of most of the proteins in the two samples are the same, it is likely that the CD3–CH3 ratio of more than 95% of all peptides in the mixed tryptic digest will be the same. The isotope ratio of the peptide in Figure 9a is representative of most observed peptides. Peptides that vary from this ratio will have been either up- or down-regulated. The spectrum in Figure 9b shows a peptide that was up-regulated. Apffel and colleagues (35) recently described a similar procedure for quantification in which a cysteine alkylating agent is deuterated and tagged with biotin. They treated control and experimental samples in much the same fashion as described above, except that they achieved differential labeling of the samples by cysteine alkylation and remixing before proteolysis. Subsequent to proteolysis, they selected cysteine peptides by avidin and separated them before


MS analysis. The advantage of this approach is that it eliminates variations in proteolysis between samples. The disadvantage is that it allows the examination of only cysteine-containing peptides. In many cases, posttranslationally modified peptides will contain no cysteine, so they cannot be studied.

The Future

The discussion presented in this article suggests that separation systems and the manner in which they are applied will play a major role in the evolution of proteomics. We anticipate that specific selection methods will increase in importance. Exploiting affinity selectors to dial-in specific features of the proteome for study will be very valuable. The ability to rapidly examine glycosylation and phosphorylation patterns in transcription factors, signaling systems, and receptors certainly will advance our understanding of regulation. Another area of opportunity will use the chromatographic behavior of peptides and their mass to define signature peptides. Although it is impossible to derive peptide sequences from databases using chromatographic behavior alone, differentiation between candidates of similar mass is greatly enhanced by including their chromatographic properties. Finally, the issue of throughput is important. Multidimensional chromatographic systems hold the key to high-throughput proteomics, either by differential selection or massive resolving power. If it were possible to process all the peptides from thousands of proteins in a few hours, throughput would be increased 10–100-fold. Furthermore, the future of proteomics goes beyond processing and identifying huge numbers of peptides. Quantification is critical. Only with quantitation will it be possible to monitor cellular dynamics. It will become apparent within the next few years whether the isotopic labeling strategies described in this article can address this issue.


Acknowledgments


The authors gratefully acknowledge financial support from National Institutes of Health grant GM-5996 (Bethesda, Maryland) and from PE Biosystems (now Applied Biosystems, Foster City, California).

Figure 9: Tryptic peptides from E. coli extracts that are (a) unchanged in concentration and (b) up-regulated as a result of differences in growth conditions.
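The isotope-ratio comparison illustrated in Figure 9 can be sketched as a search for peak doublets spaced by multiples of the CD3/CH3 mass difference, followed by a comparison of each doublet's ratio with the typical ratio in the spectrum. This is a rough sketch under several assumptions that go beyond the article: ions are singly charged, the light peak is the acetate-labeled experimental sample and the heavy peak the trideuteroacetate-labeled control, natural isotope envelopes are ignored, and the twofold threshold, the tolerance, and the peak list are hypothetical.

```python
from statistics import median

# Mass difference between a CD3 and a CH3 acetyl label, in Da
# (3 x the deuterium/hydrogen mass difference); the article rounds this to 3 amu.
LABEL_SHIFT = 3.0188


def find_doublets(peaks, max_labels=3, tol=0.02):
    """Pair peaks spaced by n label shifts.

    peaks: (m/z, intensity) tuples for singly charged ions (assumption).
    Returns (light_mz, heavy_mz, n_labels, light/heavy intensity ratio).
    """
    peaks = sorted(peaks)
    doublets = []
    for i, (light_mz, light_int) in enumerate(peaks):
        for heavy_mz, heavy_int in peaks[i + 1:]:
            for n in range(1, max_labels + 1):
                spacing_ok = abs((heavy_mz - light_mz) - n * LABEL_SHIFT) <= tol
                if spacing_ok and heavy_int > 0:
                    doublets.append((light_mz, heavy_mz, n, light_int / heavy_int))
    return doublets


def classify(doublets, fold=2.0):
    """Label each doublet relative to the median ratio of all doublets."""
    reference = median(d[3] for d in doublets)
    results = []
    for light_mz, heavy_mz, n_labels, ratio in doublets:
        relative = ratio / reference
        if relative >= fold:
            status = "up"          # more abundant in the experimental sample
        elif relative <= 1.0 / fold:
            status = "down"
        else:
            status = "unchanged"
        results.append((light_mz, n_labels, round(relative, 2), status))
    return results


# Hypothetical peak list: three unchanged peptides and one with two labels
# (N terminus plus one lysine) that is about threefold up-regulated.
peaks = [(1000.50, 100), (1003.52, 98), (1200.60, 80), (1203.62, 82),
         (1500.70, 90), (1506.74, 30), (1800.80, 50), (1803.82, 51)]
for row in classify(find_doublets(peaks)):
    print(row)
```

Normalizing to the median ratio reflects the article's expectation that most peptides are unchanged between samples; the number of labels inferred from the spacing also gives the lysine count mentioned in the text.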

References

(1) P. Edman and G. Begg, Eur. J. Biochem. 1, 80–91 (1967).
(2) R. Hewick, M. Hunkapiller, L. Hood, and J. Dreyer, J. Biol. Chem. 256, 7990–7997 (1981).
(3) F. Sanger, S. Nicklen, and A. Coulson, Proc. Nat. Acad. Sci. 74, 5463–5467 (1977).
(4) R. Wilson, C. Chen, N. Avdalovic, J. Burns, and L. Hood, Genomics 6, 626–634 (1990).
(5) H. Zhou, A. Miller, Z. Sosic, B. Buchholz, A. Barron, L. Kotler, and B. Karger, Anal. Chem. 72, 1045–1052 (2000).
(6) K. Yongseong and E. Yeung, J. Chromatogr. A 781, 315–325 (1997).
(7) J. Zhang, D. Chen, H. Harke, and N. Dovichi, Chromatogr. Sci. Ser. 64, 631–641 (1993).
(8) R. Krishna and F. Wold, in Proteins, R. Angeletti, Ed. (Academic Press, San Diego, California, 1998), pp. 126–132.
(9) N. Jentoft, Trends Biochem. Sci. 15, 291–294 (1990).
(10) T. Hardingham and A. Fosang, FASEB J. 6, 861–870 (1992).
(11) P. Casey and J. Buss, Meth. Enzymol. 250, 1–754 (1995).
(12) B.L. Sheid and L. Pedrinan, Biochemistry 14, 4357–4361 (1975).
(13) R. Swanson and M. Applebury, J. Biol. Chem. 258, 10,599–10,605 (1983).
(14) K. Kivirikko, R. Myllyla, and T. Philajaniemi, in Posttranslational Modification of Proteins, J. Harding and M. Crabbe, Eds. (CRC Press, Boca Raton, Florida, 1992), pp. 1–51.
(15) T. Hunter and J. Cooper, Ann. Rev. Biochem. 54, 897–930 (1985).
(16) T. Martensen, Meth. Enzymol. 107, 3–23 (1984).
(17) K. Williams, Electrophoresis 20, 678–688 (1999).
(18) E. Celis and P. Gromov, Electrophoresis 10, 16–21 (1999).
(19) J. Yates, J. Mass Spectrom. 33, 1–19 (1998).
(20) R. Van Bogelen, E. Schiller, J. Thomas, and F. Neidhard, Electrophoresis 20, 2149–2159 (1999).
(21) M. Geng, J. Ji, and F. Regnier, J. Chromatogr. A 870, 295–313 (2000).
(22) S. Paterson, D. Thomas, and R. Bradshaw, Electrophoresis 17, 877–891 (1996).
(23) G. Barnard, E. Bayer, M. Wilchek, Y. Amir-Zaltsman, and F. Kohen, Meth. Enzymol. 133, 284–288 (1986).
(24) S. Yoshitake, Y. Yamada, F. Ishikawa, and R. Masscyeff, Eur. J. Biochem. 101, 395–399 (1979).
(25) J. Porath, I. Carlsson, I. Olsson, and G. Belfrage, Nature 258, 598–599 (1976).
(26) Y. Nakagawa, T. Yip, M. Belew, and J. Porath, Anal. Biochem. 75, 168 (1981).
(27) J. Ji, A. Chakraborty, M. Geng, X. Zhang, A. Amini, M. Bina, and F. Regnier, J. Chromatogr. B 745, 197–210 (2000).
(28) P. Hensen, G. Lindeberg, and L. Anderson, J. Chromatogr. 627, 125–135 (1992).
(29) S. Dormandy, Ph.D. dissertation, Purdue University, West Lafayette, Indiana, 2000.
(30) J. Posada and J. Cooper, Mol. Biol. Cell. 3, 583–592 (1992).
(31) M.C. Posewitz and P. Tempst, Anal. Chem. 71, 2883–2892 (1999).
(32) H. Schachter, Glycobiology 1, 453–461 (1991).
(33) G. Hart, Ann. Rev. Biochem. 66, 315–335 (1997).
(34) K. Zhu, R. Bressan, P. Hasegawa, and L. Murdock, FEBS Lett. 390, 271–274 (1996).
(35) A. Apffel, J. Chakel, W. Hancock, C. Souders, T. Timkulu, and E. Pungor, J. Chromatogr. A 732, 27–42 (1996).
(36) Xiang Zhang, personal communication, 1 November 1998.
(37) M. Geng, X. Zhang, and F. Regnier, J. Chromatogr. B, in press.
(38) D. Goodlett, J. Bruce, G. Anderson, B. Rist, L. Pasa-Tolic, O. Fiehn, R. Smith, and R. Aebersold, Anal. Chem. 72, 1112–1118 (2000).
(39) K. Clauser, P. Baker, and A. Burlingame, Anal. Chem. 71, 2871–2882 (1999).
(40) L. Li, R. Garden, E. Romanova, and J. Sweedler, Anal. Chem. 71, 5451–5458 (1999).
(41) D. Patterson, G. Tarr, F. Regnier, and S. Martin, Anal. Chem. 67(21), 3971–3978 (1995).
