poster abstracts

ABRF 2007

Bioinformatics

P1-S

On the Manipulation and Comparison of Protein and Peptide Identification Results from MS Data: Walking on Eggs in the Format Jungle P. A. Binz1,2, N. Budin1, A. Niknejad1, A. Masselot1; 1GeneBio, Geneva, Switzerland, 2Swiss Institute of Bioinformatics, Geneva, Switzerland. A number of protein and peptide identification software tools based on MS data are available to proteomics researchers. They all share a common functionality: they process MS data and present in their output the peptides and proteins that best match the input data. Even when restricted to sequence search engines, one can observe heterogeneity of approaches, algorithms, input parameters, use of available sequence databases, output information (scores, confidence levels, details of interpretation, etc.), and possibilities to export results. The results obtained from different tools also vary in both content and form. It is a challenge for bioinformatics to help laboratory researchers manipulate results obtained from replicate analyses or from submissions made to multiple search engines. Here we present our approach to representing results from different MS/MS identification tools side by side. We describe the difficulty of obtaining appropriate exports from different search engines and of mapping the provided information in order to align it in a single interface. We address questions such as: which export format from each tool is the most useful for aligning results; how to align proteins and peptides coming from two different sequence databases (NCBInr and SwissProt, for instance); how to interpret protein grouping in separate queries; how to recognize that proteins are the same if the sequence is not present in the result or if the database identifiers differ, etc. As an illustrative example, we show how we convert outputs from Phenyx, Mascot, Sequest, or X!Tandem into the Phenyx result comparison feature and more. We will also show how this effort will contribute to and profit from the development of AnalysisXML, a HUPO PSI standard XML format to capture results from protein and peptide identification.

P2-M

Systematic PTM Analysis of Protein Kinases Using LC-MS/MS Data D. C. Chamrad1, S. Bailey1, A. Wattenberg1, C. Beisenherz-Huss2, R. Graeser2, D. Müller2, M. Blueggel1; 1Protagen AG, Dortmund, Germany, 2ProQinase GmbH, Freiburg, Germany. Available peptide fragmentation interpretation software is focused on sequence database-driven protein identification rather than on primary structure elucidation. Searching for posttranslational modifications (PTMs) or sequence errors currently requires time-consuming manual intervention and evaluation. Here we describe the results and performance of a novel interpretation software, which was used to characterize more than 50 recombinant serine/threonine kinases, receptor tyrosine kinases, or cytoplasmic tyrosine kinases. LC-MS/MS data were acquired after tryptic digestion of the recombinantly produced kinases. The datasets were imported into the proteome bioinformatics platform ProteinScape. The spectra were screened for a set of modifications, amino acid substitutions, unsuspected large measurement errors, enzyme non-specificity, and unknown mass shifts. The software restricts the search space by testing only sequences of interest. In widely used sequence database searches, testing all modifications and possible nonspecific cleavages is not feasible. Besides the increase in sequence coverage, caused mainly by the detection of peptides cleaved non-specifically on one side, numerous modifications were found, namely phosphorylation, methylation, pyroglutamate formation, methionine oxidation, and N-terminal acetylation. As spectra of phosphorylated peptides are almost always in the minority compared to their unmodified counterparts, their detection is a challenge, but internal significance analysis revealed a substantial amount of phosphorylation. The phenomenon of auto-phosphorylation of kinase proteins was successfully monitored. The phosphorylation sites are categorized according to their sequence motif, and their distribution is additionally compared to phosphorylation sites described in public databases. With this software triggered by the proteome database software ProteinScape, searches were performed in a highly automated manner. Manual analysis could be reduced to minutes for LC-MS/MS datasets containing more than 1000 spectra. Integrated result presentation strategies, which use clustering of spectra results on the amino acid level to annotate the protein sequence of interest, avoided the possibility of seeing excess PTM contained in the large amount of acquired spectra.
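
The screening for known mass shifts described above can be illustrated with a small sketch. This is not the authors' software: the function name, the tolerance, and the example masses are assumptions made for illustration. Only the monoisotopic mass shifts of the modifications named in the abstract are standard values.

```python
# Monoisotopic mass shifts (Da) of the modifications named in the abstract.
KNOWN_SHIFTS = {
    "phosphorylation": 79.96633,
    "methylation": 14.01565,
    "pyroglutamate formation (from Gln)": -17.02655,
    "methionine oxidation": 15.99491,
    "N-terminal acetylation": 42.01057,
}


def annotate_mass_shift(observed_mass, theoretical_mass, tolerance_da=0.02):
    """Return the known modifications whose mass shift explains the difference.

    observed_mass    -- measured neutral peptide mass (Da)
    theoretical_mass -- mass of the unmodified candidate sequence (Da)
    tolerance_da     -- hypothetical matching tolerance, not from the abstract
    """
    delta = observed_mass - theoretical_mass
    return [name for name, shift in KNOWN_SHIFTS.items()
            if abs(delta - shift) <= tolerance_da]
```

An empty result corresponds to either an unmodified peptide or what the abstract calls an unknown mass shift, which would need separate handling.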

Key to Abstract Numbering Prefixes: P, Poster; RG, Research Group; SP, Scientific Session Presenter; EP, Educational Session Presenter. Following the hyphen is the designated presentation day: S, Sunday; M, Monday; T, Tuesday.



Journal of Biomolecular Techniques, Volume 18, Issue 1, February 2007



P3-T

Quality Control of 2D-Electrophoresis Images Prior to Analysis A. Borthwick, W. Hudson, M. Lambert, D. Miller, D. Bramwell; Nonlinear Dynamics, Newcastle upon Tyne, United Kingdom. Management of your image capture process is critical for the success of 2D-electrophoresis gel analysis. Ensuring the consistent quality of scanned images is most problematic for large projects involving multiple users at multiple sites. This lack of control will impact on the subsequent detection and quantitation of the images, thus reducing the statistical power of the resulting data analysis. Most image problems are well known and their causes well understood; for example: incorrect image types, color or compressed images, low spatial resolution, varying bit depth, large and variable image file sizes, low or inconsistent use of dynamic ranges, large border variation, uncontrolled image noise, using the same image more than once, and multiple image orientations. To help the investigator reduce the occurrence of such problems, we have investigated methods of automatic detection, and propose a standardized way of assessing and rejecting images to avoid subjective decisions that lead to bias. We will show examples of how this process can be used to improve the performance of your capture workflow and thus ensure that you get the best out of your 2D image data.

P4-S

The Mouse Genome Informatics Database: An Integrated Resource for Mouse Genetics and Genomics C. J. Bult, J. Blake, J. Kadin, J. Eppig, M. Ringwald, J. Richardson, M. Group; The Jackson Laboratory, Bar Harbor, ME, United States. The Mouse Genome Informatics (MGI; http://www.informatics.jax.org/) database integrates genetic and genomic data with the primary mission of facilitating the use of the mouse as a model system for understanding human biology and disease processes. MGI is the authoritative source of official mouse genetic nomenclature, gene ontology annotations, mammalian phenotype annotations, and mouse anatomy terms. MGI staff enforce the use of standardized genetic nomenclature, ontologies, and controlled vocabularies to describe mouse sequence data, genes, strains, expression data, alleles, and phenotypes. Extensive links between gene-centric information in MGI and other informatics resources (e.g., OMIM, Ensembl, UCSC, NCBI, UniProt) are maintained and updated on a regular basis. Using the Web-based query interfaces for MGI, users can query for a mouse gene or genes according to diverse biological attributes of those genes, including phenotype associations, gene expression, functional annotation, and genome location. The MGI MouseBLAST server allows users to interrogate the MGI database using nucleotide and/or protein sequences. Functional and phenotypic data from MGI can be viewed in a broader genomic context using an interactive genome browser called Mouse GBrowse. The power of the MGI database as a research tool for biomedicine stems from the degree to which data from diverse sources are integrated. Integration, in turn, allows the data to be evaluated in new contexts. For example, integration makes possible such complex queries as “Find all genes from Chromosome 1 where the function is annotated as transcription factor and there is a knockout allele that results in eye dysmorphology.” The MGI project is supported by NHGRI HG00330, NCI CA89713, and NICHD HD33745.

P5-M

Proteinscape: Software Platform for Managing Proteomics Data D. C. Chamrad1, M. Blueggel1, G. Koerting1, J. Glandorf2, J. Vagts2, P. Hufnagel2, H. Thiele2; 1Protagen AG, Dortmund, Germany, 2Bruker Daltonik GmbH, Bremen, Germany. Proteomics inherently deals with huge amounts of data. Current mass spectrometers acquire hundreds of thousands of spectra within a single project. Thus, data management and data analysis are a challenge. We have developed a software platform (Proteinscape) that stores all relevant proteomics data efficiently and allows fast access and correlation analysis within proteomics projects. The software is based on a relational database system using Web-based server-client architecture with intra- and Internet access. Proteinscape stores relevant data from all steps of proteomics projects: study design, sample treatment, separation techniques (e.g., gel electrophoresis or liquid chromatography), protein digestion, mass spectrometry, and protein database search results. Gel spot data can be imported directly from several 2DE-gel image analysis software packages as well as spot-picking robots. Spectra (MS and MS/MS) are imported automatically during acquisition from MALDI and ESI mass spectrometers. Many algorithms for automated spectra and search result processing are integrated. PMF spectra are calibrated and filtered for contaminant and polymer peaks (Scorebooster). A single non-redundant protein list, containing only proteins that can be distinguished by the MS/MS data, can be generated from MS/MS search results (ProteinExtractor). This algorithm can combine data from different search algorithms or different experiments (MALDI/ESI, or acquisition repetitions) into a single protein list. Navigation within the database is possible either by using the hierarchy of project, sample, protein/peptide separation, spectrum, and identification results, or by using a gel viewer plug-in. Available features include zooming, annotations (protein, spot name, etc.), export of the annotated image, and links to spot, spectrum, and protein data. Proteinscape includes sophisticated query tools that allow data retrieval for typical questions in proteome projects. Here we present the benefits of the software as shown by 6 years of continuous use in over 70 proteome projects managed in house.

P6-T

Tranche: Secure Decentralized Data Storage for the Proteomics Community J. A. Falkner, P. C. Andrews; University of Michigan, Ann Arbor, MI, United States. The number, size, and format variation of proteomics data files (both raw and processed) and their annotation, as well as the challenges in designing a robust data repository, are some of the major factors inhibiting public dissemination of proteomics data. Sharing large amounts of data and software is a legitimate need in the field of proteomics and other scientific disciplines, as replication of results and the benefits of data reanalysis rely heavily on having access to the original data. Several journals have already published recommendations for providing access to data associated with proteomics manuscripts; however, researchers have been left with the challenge of how to appropriately satisfy the recommendations. Of particular concern is how potentially large datasets (gigabytes to terabytes of raw data) may be efficiently hosted in a publicly accessible fashion. Described here is Tranche, a secured peer-to-peer system (http://www.proteomecommons.org/dev/dfs/), along with a reference implementation, supported by ProteomeCommons.org, that is capable of hosting virtually unlimited amounts of data and supporting virtually unlimited users. Furthermore, Tranche solves many of the prominent concerns in data dissemination, including hosting raw data associated with a proteomics experiment and maintaining annotation. It is intended as both a reference implementation and a model system for comparison to other proteomics data dissemination efforts. Tranche currently hosts many prominent proteomics datasets and mirrors of other proteomics data resources, including almost all of the publicly available proteomics data.

P7-S

Combining Workflow-Based Project Organization with Protein-Dependent Data Retrieval for the Retrieval of Extensive Proteome Information J. Glandorf1, H. Thiele1, M. Macht1, O. Vorm2, A. Podtelejnikov2; 1Bruker Daltonik GmbH, Bremen, Germany, 2Proxeon Biosystems A/S, Odense, Denmark. In the course of a full-scale proteomics experiment, handling the data and retrieving the relevant information from the results are major challenges because of the massive amount of generated data (gel images, chromatograms, and spectra) and associated result information (sequences, literature, etc.). To obtain meaningful information from these data, one has to be able to filter the results easily. Filters can be based on GO terms or on structural features such as transmembrane domains, involvement in certain pathways, etc. In this presentation we will show how a combination of a software package with a workflow-based result organization (Bruker ProteinScape) and a protein-centered data-mining software (Proxeon ProteinCenter) can assist in the comparison of the results from large projects, such as comparison of cross-platform results from 2D PAGE/MS with shotgun LC-ESI-MS/MS. We will present differences between different technologies and show how these differences can be easily identified and how they allow us to draw conclusions on the involved technologies.

P8-M

Bioinformatic Analysis of Neural Stem Cell Differentiation L. A. Goff1, R. Hart1, R. Jornsten1, S. Keles2; 1Rutgers University, Piscataway, NJ, United States, 2University of Wisconsin, Madison, WI, United States. We analyzed mRNAs regulated during differentiation of rat neural stem cells using the ABI1700 microarray platform. This microarray, while technically advanced, suffers from the difficulty of integrating hybridization results into public databases for systems-level analysis. This is particularly true for the rat array, since many of the probes were designed for transcripts based on predicted human and mouse homologs. Using several strategies, we increased the public annotation of the 27,531 probes from 43% to over 65%. To increase the dynamic range of annotation, probes were mapped to numerous public keys from several data sources. Consensus annotation from multiple sources was determined for well-scoring alignments, and a confidence-based ranking system established for probes with less agreement across multiple data sources. Previous attempts at genomic interpretation using the Celera annotation model resulted in poor overlap with expected genomic sequences. Since the public keys are more precisely mapped to the genome, we could now analyze the relationships between predicted transcription factor binding sites and expression clusters. Results collected from a differentiation time course of two neural stem cell clones were clustered using a model-based algorithm. Transcription factor binding sites were predicted from upstream regions of mapped transcripts using position-weight matrices from either JASPAR or TRANSFAC, and the resulting scores were used to discriminate between observed expression clusters. A classification and regression tree analysis was conducted using cluster numbers as gene identifiers and TFBS scores as predictors, pruning back to obtain a tree with the lowest gene class prediction error rate. 
Results identify several transcription factors, the presence or absence of which is sufficient to differentiate clusters of mRNAs changing over time from those that are static, as well as clusters describing cell-line differences. Public annotation of the ABI1700 rat genome array will be valuable for integrating results into future systems-level analyses.
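
The position-weight matrix scoring step mentioned above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names, the pseudocount, the uniform background, and any toy count matrix passed in are assumptions, whereas real matrices would come from JASPAR or TRANSFAC as stated in the abstract. Sequences are assumed to contain only the uppercase letters A, C, G, and T.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    counts is a list with one dict per motif position, e.g. {"A": 8, "T": 2}.
    """
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_site(sequence, matrix):
    """Slide the matrix along the sequence and return (best_score, start_index).

    Returns None when the sequence is shorter than the matrix.
    """
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best
```

The best score per upstream region is the kind of value that could then serve as a predictor in the classification and regression tree analysis; that later step is not shown here.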

P9-T

ProteinExtractor: From Peptide ID to Protein ID G. Koerting1, C. Stephan2, K. Marcus2, D. C. Chamrad1, P. Hufnagel3, U. Schweiger-Hufnagel3, J. Glandorf3, H. E. Meyer2, H. Thiele3, M. Blueggel1; 1Protagen AG, Dortmund, Germany, 2Medizinisches Proteom Center, Bochum, Germany, 3Bruker Daltonik GmbH, Bremen, Germany. In proteomics workflows, proteins are often digested first, then peptides are separated and subjected to identification by mass spectrometry (e.g., 2D-LC). In this process the peptide assignment to a protein is lost and has to be rebuilt by bioinformatic methods. We present ProteinExtractor, a module of the ProteinScape Bioinformatics Platform, which uses an empirical, iterative method to derive minimal protein lists from peptide search results, which may even come from different search algorithms or different MS datasets. With composite database searches, ProteinExtractor allows the false-positive rate of the protein list to be measured. A test dataset (five recombinant proteins, 408 spectra, Bruker Ultraflex) and a real-life dataset (200410 LC/ESI-MS/MS spectra, Bruker Esquire HCTUltra, and 11619 LC/MALDI-MS/MS spectra, Bruker Ultraflex, both obtained from an analysis of proteins from the human cell line SW480) were analyzed. The most probable protein sequence entries contained in the test dataset were identified with intensive manual data interpretation by several mass spectrometry experts. Using standard search algorithms, the correct protein sequence database entries are scattered over the first 171 protein ranks. Together with application specialists, we developed a set of rules to define a minimal protein list containing only those proteins (and isoforms) that can be unequivocally distinguished on the basis of MS/MS data. Applying these rules, the correct five proteins are ranked within the top eight protein candidates.
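
The notion of a minimal protein list can be illustrated with a generic greedy selection. The actual ProteinExtractor rules are not given in the abstract, so this sketch substitutes a plain greedy set-cover heuristic; the function name and the input layout (a dict mapping protein accessions to sets of identified peptides) are assumptions.

```python
def minimal_protein_list(protein_to_peptides):
    """Greedily pick proteins until every identified peptide is explained.

    protein_to_peptides maps a protein accession to the set of peptide
    sequences assigned to it by the search engine(s).
    """
    unexplained = set()
    for peptides in protein_to_peptides.values():
        unexplained |= peptides

    selected = []
    while unexplained:
        # Choose the protein that explains the most still-unexplained peptides.
        # Iterating in sorted order makes ties resolve alphabetically.
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & unexplained))
        selected.append(best)
        unexplained -= protein_to_peptides[best]
    return selected
```

Proteins whose peptides are all shared with an already selected entry never enter the list, which mirrors the idea of keeping only proteins that the MS/MS data can distinguish. A greedy heuristic does not guarantee the smallest possible list.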
In the real-life dataset, the peptide search results of Mascot, Sequest, Phenyx, and ProteinSolver were merged using ProteinExtractor. Merging all four search algorithms, over 50% more proteins could be identified than by using Mascot alone (with a false-positive rate of less than 2.5%). Merging ESI and MALDI data together, another 25% more proteins could be identified.

P10-S

A Novel Protein Database Search Algorithm for 1D and 2D LC-MS Data G. Li, D. Golick, B. Dyson, J. C. Silva, H. Liu, J. P. C. Vissers, J. C. Gebler, M. V. Gorenstein, S. J. Geromanos; Waters Corporation, Milford, MA, United States. A novel “Ion Accounting” algorithm has been developed for protein identification using time-resolved LC-MSE data from 1D and 2D LC-MS experiments. The data from a 1D LC-MS analysis generate a series of precursor-product tables that are initially queried against a protein database using the “Ion Accounting” algorithm. Each precursor and product is thereby associated with only a single peptide identification. The database search is a hierarchical process containing three modules. With the first module, the data are matched to only correctly cleaved proteolytic peptides whose precursor and product ion mass tolerances are within 10 and 20 ppm, respectively. With the second module, precursor and product ions that have not yet been assigned are queried against a subset database of the identified proteins from the first module. The second module includes missed cleavages, in-source fragments, neutral losses, and variable modifications. With the last module, the remaining unidentified ions are considered against the complete database for additional protein identifications (including PMF) with improved selectivity and specificity from the elimination of those precursor and product ions from the first two modules. A 2D LC-MS separation of proteolytic peptides is conducted by fractionating the peptides in a first dimension and subsequently separating them in a second dimension during the LC-MSE analysis. Each fraction produces a series of precursor-product tables. From these tables, the peptides (precursor-products) that were not distributed over multiple fractions are saved to a “Combined Precursor-Product Table” (CPPT). The peptides that are split among neighboring fractions are combined by precursor mass, precursor retention time, and product ion pattern, and are appended to the CPPT. The final CPPT is submitted to the “Ion Accounting” protein database search engine in a similar fashion to the 1D LC-MS data analysis.

P11-M

Comprehensive and Reliable Proteome Analysis Using Bioinformatic Strategies for Automated Result Validation M. Macht1, C. Albers1, K. Sparbier2, A. Asperger1, J. Glandorf1, H. Thiele1; 1Bruker Daltonik GmbH, Bremen, Germany, 2Bruker Daltonik GmbH, Leipzig, Germany. Proteomic analyses typically produce massive amounts of mass spectrometric data, which are analyzed in an automated way by database search engines for retrieval of peptide sequences and subsequent inference on the corresponding protein sequences. However, this process has turned out to be error prone, producing false positives and multiple hits for the same proteins for various reasons. In this study we analyzed the human serum glycosubproteome. For this, glycoproteins were produced in five separate extractions by affinity interaction chromatography using magnetic beads coated with the lectins ConA, WGA, LCA, and AIA, as well as with boronic acid. The eluates from the beads were digested using trypsin and subsequently analyzed by LC-MALDI-MS/MS as well as LC-ESI-MS/MS. The analyses were carried out in up to triplicate. All the data were submitted to the ProteinScape database system. Within ProteinScape, the datasets were searched using Mascot and Phenyx. The resulting peptide identifications were analyzed by the ProteinExtractor tool to remove false positives from the list using a decoy strategy, and subsequently merged into a combined list of identified proteins for the respective sample preparations. This allows the combination of data from different search engines as well as from different experiments (ESI and MALDI). The use of decoy strategies as well as application of the ProteinExtractor to overcome the protein inference problem minimizes the need for manual validation (which is nevertheless easily possible using raw spectra information). In parallel, the use of a single data repository allows for easy access to the combined information from different workflows, and links to external tools complement the system for project-spanning comparisons of datasets.

P12-T

Prediction of Proteotypic Peptide Candidates for MRM Analysis K. Marcus1, R. Reinhardt2, E. Langenfeld1, H. E. Meyer1, M. Blüggel2; 1Medizinisches Proteom Center, Bochum, Germany, 2Protagen AG, Dortmund, Germany. The application of multireaction monitoring (MRM) for proteomics analysis is a quite recent development. The sensitivity of four orders of magnitude, reproducibility, and the option of quantification as well as high throughput make MRM a valuable tool for measuring specific proteins. We present a bioinformatics workflow for the determination of MRM candidates that reduces the number of nonproteotypic peptide candidates considerably, enabling the analysis of complex and highly homologous protein families. A peptide candidate has to be unique for its targeted protein with respect to the proteome of the organism. Polymorphisms of the target protein are an important issue when coverage of all alleles of the target protein is desired rather than rudimentary genotyping by MS. Both aspects can be solved by using annotated protein databases. When using absolute quantification (e.g., the AQUA technique of Kirkpatrick et al.), the feasibility of synthesizing the stable isotope-labeled peptides also has to be taken into account. The selection of suitable fragment ions can be done by evaluating previously acquired spectra or by MS/MS fragment prediction. The result has been compared to measurements of human cytochrome P450 (CYP450). CYP450 comprises families (over 200 protein members known today) with low evolutionary conservation and thus high homology. Developing antibodies specific for one protein can be a daunting task; this holds especially true for the human CYP2 family. By targeting only proteotypic peptides using mass spectrometry, antibody-related problems are avoided. As a proof of concept, the analysis of highly polymorphic human CYP2D6 is depicted by ESI-MRM/MS/MS analysis.
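
The uniqueness requirement described above can be sketched with an in-silico tryptic digest. This is an illustration under stated assumptions, not the authors' workflow: the function names, the minimum peptide length, and the simple cleavage rule (after K or R, not before P, no missed cleavages) are choices made here, and polymorphisms, synthesis feasibility, and fragment ion selection are not modeled.

```python
import re
from collections import defaultdict


def tryptic_peptides(sequence, min_length=6):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    return {p for p in pieces if len(p) >= min_length}


def proteotypic_candidates(proteome, target, min_length=6):
    """Return peptides of `target` that occur in no other protein of `proteome`.

    proteome maps protein names to amino acid sequences.
    """
    owners = defaultdict(set)
    for name, sequence in proteome.items():
        for peptide in tryptic_peptides(sequence, min_length):
            owners[peptide].add(name)
    return sorted(p for p, names in owners.items() if names == {target})
```

For a highly homologous family, most peptides are shared between members and drop out, so the candidates that remain are the ones worth evaluating further.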



P13-S

Database Protein Information Searching Engine via Internet: PIKE J. Medina-Aunon1, M. Macht2, A. Quinn3, J. Albar1, J. Glandorf2, H. Thiele2; 1ProteoRed, CNB-CSIC, Madrid, Spain, 2Bruker Daltonik GmbH, Bremen, Germany, 3European Bioinformatics Institute, Hinxton, United Kingdom. One of the main goals in proteomics is to extract and collect all the functional information available in existing databases in relation to a defined set of identified proteins. Due to the huge amount of data available, it is not possible to gather this information by hand; automatic methods are needed for this task. Protein information and knowledge extractor (PIKE) solves this problem by accessing several public information systems and databases automatically through the Internet, retrieving all functional information available in the different repositories, and then clustering this information according to pre-selected criteria. The PIKE bioinformatics tool, accessible through http://proteo.cnb.uam.es:8080/pike, uses the Java and XML languages. Starting with a selected group of identified proteins, listed as NCBI nr, UniProt, and/or IPI (http://www.ebi.ac.uk/IPI/IPIhelp.html) accession codes, PIKE retrieves all relevant information stored in databases by choosing the correct pathway and/or the best information source. Once the search is done, a typical PIKE output shows a report table with an entry for each protein containing all extracted information. The report contains a large number of meaningful protein features, such as (1) function information, (2) sub-cellular location, (3) tissue specificity, (4) links with other repositories, such as Online Mendelian Inheritance in Man (OMIM) or Kyoto Encyclopaedia of Genes and Genomes (KEGG), and (5) gene ontology tree classification. The table is exportable in CSV and text file formats and, more importantly, in PRIDE XML (http://www.ebi.ac.uk/pride/) format for integrating the results with the information stored in other applications such as ProteinScape.

P14-M

PRIME: Proteome Research Information Management Environment For High-Throughput Proteomics Laboratories P. G. Papoulias, D. Lentz, P. C. Andrews; University of Michigan, Ann Arbor, MI, United States. Proteomics laboratories and proteomics service facilities produce a large volume of data that originate from diverse sources, involve the participation of several members of the lab, and are communicated to clients and collaborators of the laboratory. Laboratory information management systems (LIMS) play a key role in collating and tracking the flow of data through the laboratory. The cost of off-the-shelf LIMS systems can be significant and compounded by further customization, development, and installation costs. Fields that employ rapidly changing technologies in support of research require information management that is stringent and secure, but also adaptable. We have developed an open-source scientific data management system for proteomics that is capable of acting as a stand-alone LIMS system, or working in conjunction with other LIMS systems. We support sample processing workflow, protocols, and datasets generated by LC-ESI, LC-MALDI, MALDI, 1D-GELS, 2D-GELS, and 2D-gel image analysis. The workflow supports input from existing data analysis and allows batch analysis of proteomics data to be queued to search engines and other tools. It supports extensive data curation, has a simple Web browser interface that allows tiered, secure access to data and functions, and also allows export of data from individual spectra up to full project files. The source code is available from Proteomecommons.org. Installation of the software can be performed through https://www.prime-sdms.org/main.htm.

P15-T

SameSpots: Validating a Novel Approach to 2D Electrophoresis Image Analysis I. Reah, M. O’Gorman, A. Borthwick, D. Miller, D. Bramwell; Nonlinear Dynamics Ltd, Newcastle upon Tyne, United Kingdom. 2D-PAGE experiments can provide a powerful means of investigating protein expression behavior in a cell or tissue across different disease states or other experimental conditions. Traditional analysis of 2D experiments typically requires a large amount of post-detection editing in order to prepare the data for statistical research. With the application of SameSpots, all protein spots are fully matched and thus no missing values exist. This enables a more robust statistical exploration of the data with a reduction in subjective editing. The SameSpots workflow has a semi-automatic gel alignment step and uses the arcsinh transform introduced by Huber et al. [1] for inter-gel calibration and variance stabilization (VSN). To validate these techniques we repeated the work of Nishihara and Champion [2] and Karp and Lilley [3]. The results of this study demonstrate that gel alignment has no adverse effect on spot volume quantitation, and we also show that VSN outperforms traditional normalization and variance stabilizing methods used in 2D gel analysis.
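
The arcsinh transform referred to above has the general form h(y) = arsinh(a + b·y), where a is an offset and b a scale factor. The sketch below only evaluates this function; in the variance stabilization approach the two parameters are estimated from the data for each gel (or array), a step not shown here. The function name and the parameter values used in any call are hypothetical.

```python
import math


def arsinh_transform(intensity, offset, scale):
    """Variance-stabilizing transform h(y) = arsinh(offset + scale * y).

    offset and scale stand for the per-gel calibration parameters; they are
    plain inputs here, whereas in practice they would be fitted to the data.
    """
    return math.asinh(offset + scale * intensity)
```

For large intensities the transform behaves like a logarithm, since arsinh(x) approaches ln(2x), while near zero it is approximately linear. This is why it stays well defined for low or background-level spot volumes where a plain log transform does not.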

P16-S

Database Search of High Mass Resolution Data R. G. Sadygov, V. Zabrouskov; ThermoFisher Scientific, San Jose, CA, United States. The work describes modifications to and application of Sequest database search algorithms to identify peptides from their high mass accuracy tandem mass spectra. We show the technical problems encountered when attempting to use high mass accuracy data with the original Sequest algorithm. To overcome the problems, modifications have been made to the algorithm. The modifications are such that the results from the unmodified and modified algorithms are the same for the unit mass accuracy data. The work presents advantages in terms of the speed and reduced memory requirements of the modified algorithm. We apply the modified algorithm to a dataset obtained from site-specific digestion of a set of known proteins. The results demonstrate the effect of high mass accuracy on such characteristics of the Sequest scoring as cross-correlation score and delta-Cn. Based on the correlations between the mass accuracy and the Sequest scores, we attempt to identify optimal conditions for the mass accuracy to be applied in the database searches. A comparative analysis with the dataset of the same proteins, but under unit mass accuracy conditions, will also be presented. We analyze results obtained from searches of the spectra against the reversed databases under the high mass accuracy conditions.

P17-M

Improving Sensitivity by Combining Results from Multiple MS/MS Search Methodologies with the Scaffold Computer Algorithm B. C. Searle, M. Turner; Proteome Software, Inc., Portland, OR, United States. Database-searching programs generally identify only a fraction of the spectra acquired in a standard LC/MS/MS study of digested proteins. Subtle variations in database-searching algorithms of MS/MS spectra have been known to provide different identification results. To leverage this variation, we developed Scaffold to probabilistically combine the results of multiple search engines, including

1. Huber W, Von Heydebreck A, Sültmann H, Poustka A, Vingron M. Variance stabilization applied to microarray data calibration and to the quantification of differential expression. Bioinformatics 2002;18 suppl. 1:S96–S104.
2. Nishihara JC, Champion KM. Quantitative evaluation of proteins in one- and two-dimensional polyacrylamide gels using a fluorescent stain. Electrophoresis 2002;23:2203–2215.
3. Karp NA, Lilley KS. Maximising sensitivity for detecting changes in protein expression: Experimental design using Minimal CyDyes. Proteomics 2005;5(12):3105.





Journal of Biomolecular Techniques, Volume 18, issue 1, february 2007


Sequest, Mascot, and X!Tandem. Here we present a “tell all” explanation of the specific methodology behind Scaffold that converts scores into search engine independent peptide probabilities. These probabilities can be readily combined across search engines using Bayesian rules and the Expectation Maximization learning algorithm. We demonstrate how we normally gain 20% to 100% more highly confident (>95%) MS/MS spectrum identifications with each additional search engine, which is primarily due to increased confidence in low-scoring matches. We also show that this method works reliably across a variety of search engines and instrumentation platforms without re-tuning. P18-T

MassSieve: A New Tool for Mass Spectrometry–Based Proteomics D. J. Slotta, M. McFarland, A. Makusky, S. Markey; NIMH/NIH, Bethesda, MD, United States. The success of peptide sequence assignment algorithms such as OMSSA and Mascot for mass spectrometry has led to the need for a tool to evaluate the results. DBParser is such a software tool, previously developed by the Laboratory of Neurotoxicology (LNT) for this purpose. Its value for parsimonious analysis of proteins associated with experiments has led to its use for analyzing larger datasets than initially anticipated (hundreds of data files with millions of spectra). MassSieve builds on this experience and is designed as open-source protein assignment software that can be scaled to apply parsimony principles to very large experiments without dataset size limitations. In addition, it allows a more interactive view of the results. P19-S
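The MassSieve abstract does not say how its parsimony analysis is implemented. As a rough illustration of the general idea only, the sketch below uses a generic greedy set-cover heuristic to pick a small set of proteins that explains every identified peptide. The function name and the protein and peptide labels are hypothetical.

```python
def parsimonious_proteins(protein_to_peptides):
    """Greedy set cover: choose a small set of proteins explaining all peptides."""
    uncovered = set().union(*protein_to_peptides.values())
    chosen = []
    while uncovered:
        # Pick the protein explaining the most still-unexplained peptides;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(sorted(protein_to_peptides),
                   key=lambda p: len(protein_to_peptides[p] & uncovered))
        chosen.append(best)
        uncovered -= protein_to_peptides[best]
    return chosen

# Hypothetical identifications: protein -> set of matched peptides.
hits = {
    "protA": {"pep1", "pep2", "pep3"},
    "protB": {"pep2", "pep3"},   # peptides already explained by protA
    "protC": {"pep4"},
}
minimal = parsimonious_proteins(hits)
```

In this toy input, protB contributes no peptide that protA does not already explain, so a parsimonious list omits it. A greedy heuristic is not guaranteed to find the smallest possible set in general.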

Managing Proteomics Data from Data Generation and Data Warehousing to Central Data Repository and Journal Reviewing Processes H. Thiele1, J. Glandorf1, G. Koerting2, K. Reidegeld3, M. Blüggel2, H. Meyer3, C. Stephan3; 1Bruker Daltonik GmbH, Bremen, Germany, 2Protagen AG, Dortmund, Germany, 3Medical Proteom-Center, Ruhr Universität Bochum, Germany. In today’s proteomics research, various techniques, instrumentation, and bioinformatics tools are necessary to manage the large amount of heterogeneous data with an automatic quality control to produce reliable and comparable results. Therefore a data-processing pipeline is mandatory for data validation and comparison in a data-warehousing system. The proteome bioinformatics platform ProteinScape has been proven to cover these needs. The reprocessing of HUPO BPP participants’ MS data was done within ProteinScape. The reprocessed information was transferred into the global data repository PRIDE. ProteinScape as a data-warehousing system covers two main aspects: archiving relevant data of the proteomics workflow and information extraction functionality (protein identification, quantification, and generation of biological knowledge). As a strategy for automatic data validation, different protein search engines are integrated. Result analysis is performed using a decoy database search strategy, which allows the measurement of the false-positive identification rate. Peptide identifications across different workflows, different MS techniques, and different search engines are merged to obtain a quality-controlled protein list. The proteomics identifications database (PRIDE), as a public data repository, is an archiving system where data are finally stored and no longer changed by further processing steps. Data submission to PRIDE is open to proteomics laboratories generating protein and peptide identifications. An export tool has been developed for transferring all relevant HUPO BPP data from ProteinScape into PRIDE using the PRIDE.xml format. The EU-funded ProDac project will coordinate the development of software tools covering international standards for the representation of proteomics data. The implementation of data submission pipelines and systematic data collection in public standards–compliant repositories will cover all aspects, from the generation of MS data in each laboratory to the conversion of all the annotating information and identifications to a standardized format. Such datasets can be used in the course of publishing in scientific journals. P20-M
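The decoy database strategy mentioned in the ProteinScape abstract can be illustrated with a short sketch. The abstract gives no formula, so the estimate below (decoy hits divided by target hits above a score threshold) is an assumption, chosen because it is one common convention. The scores and the function name are hypothetical.

```python
def decoy_fdr(matches, threshold):
    """Estimate the false-positive rate among target hits scoring >= threshold.

    `matches` is a list of (score, is_decoy) pairs. The ratio used here,
    decoy hits over target hits, is an assumed convention for this sketch.
    """
    targets = sum(1 for score, is_decoy in matches
                  if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in matches
                 if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0

# Hypothetical peptide-spectrum matches: (search score, matched a decoy sequence?)
psms = [(72, False), (65, False), (58, False), (51, True), (47, False),
        (44, False), (40, True), (33, False), (29, True), (25, True)]

fdr_at_45 = decoy_fdr(psms, 45)   # 1 decoy hit against 4 target hits
fdr_at_30 = decoy_fdr(psms, 30)   # 2 decoy hits against 6 target hits
```

Raising the score threshold lowers the estimated false-positive rate at the cost of fewer accepted identifications, which is the trade-off a quality-controlled protein list has to settle.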

NCI CGEMS Data Portal: Sharing Data for Genome-Wide Association Studies L. Yang; National Cancer Institute, Rockville, MD, United States. A new NCI initiative, Cancer Genetic Markers of Susceptibility (CGEMS), is a three-year study designed to identify common genetic variations associated with risk for prostate and breast cancer. CGEMS will analyze the entire genome for the most common type of genetic variation, the single-nucleotide polymorphism (SNP). By studying large populations of individuals with and without disease, the CGEMS research can provide powerful indicators as to which SNP variations are associated with each disease. This study design is especially valuable for unraveling the genetic origins of complex diseases such as prostate and breast cancer. A critical requirement of the CGEMS project is to share raw data and analysis results from the study with the cancer research community. The NCI Center for Bioinformatics (NCICB), in collaboration with other NCI research groups, has built the CGEMS data portal to support data sharing of the CGEMS project (https://caintegrator.nci.nih.gov/cgems/). The first whole genome scan includes approximately 1200 prostate cancer cases and 1200 controls. The datasets available through the portal include:
• Association test results for over 300,000 SNPs
• Frequency and descriptive statistics on these 300,000 SNPs
• Individual phenotypic and genotypic data for the study participants and control samples. Note that these data can be made available only to eligible investigators after a registration process.


CGEMS data portal development has leveraged the caIntegrator application framework, developed at NCICB. It shares a common set of application programming interfaces (APIs) and specification objects that support the clinical genomic analysis services. This allows fast development of Web-based query functionalities on all the data objects from the CGEMS project.
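The abstract does not specify which association test the CGEMS portal reports. As an illustration of what a single-SNP case-control test can look like, the sketch below runs a Pearson chi-square test on a 2x2 table of allele counts. The allele counts and the function name are hypothetical; only the group sizes (about 1200 cases and 1200 controls) come from the abstract.

```python
import math

def allelic_chi_square(case_minor, case_major, control_minor, control_major):
    """Pearson chi-square (1 degree of freedom) on a 2x2 allele-count table."""
    table = [[case_minor, case_major], [control_minor, control_major]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts for one SNP: 1200 cases and 1200 controls,
# two alleles per person, so 2400 alleles in each group.
chi2, p = allelic_chi_square(720, 1680, 600, 1800)
```

With roughly 300,000 SNPs tested, a Bonferroni-style threshold would be about 0.05 / 300,000, or roughly 1.7e-7, so a p-value on the order of 1e-4 like the one in this toy example would not stand out on its own.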

Biomarkers

P21-T

The Application of Multivariate Model Building to Derive Predictive ‘Signatures’ from Proteomics Data D. Bramwell1, I. Morns1, M. O’Gorman1, S. Hoving 2 , B. Wiedmann2 , H. Voshol 2; 1Nonlinear Dynamics, Newcastle upon Tyne, United Kingdom, 2Novartis Institutes for BioMedical Research, Basel, Switzerland. Objective: To apply advanced statistical model-building procedures to derive proteomic “signatures” from 2D gels and validate the approach by predicting double-blinded samples. Methods: A large experiment was used to explore the power of the predictive modeling process (340 samples, 18 groups, seven double-blinded and three unknown). The images were geometrically corrected and then analyzed at a pixel level. On completion of this procedure, areas important for obtaining good group discrimination were automatically identified. The areas were ranked and visually examined. Up to 10 per group were selected for the next stage of analysis. The 117 resultant spots were then used to build predictive models. The models were explored in the context of the experiment and also for their prediction performance. This process enables the selection of candidate spots that may be below standard univariate thresholds (such as p < 0.05, 1.5-fold change). Results: Models were successfully built that gave perfect performance on the training sets. The blind samples were successfully predicted and interesting information on the unknown samples was produced and is the subject of further experimentation. The effective “systems” dimension for the 18 group sets was estimated to be 12, which suggests we may have more groups than is supported by the data. A “minimal spot set” was calculated and showed a saturation in prediction performance at around 60 spots. A follow-on procedure was employed to choose the best spots for group discrimination and also to specify the spot number vs. performance relationship. 
Conclusion: Proteomics data provide a rich source for advanced statistical modeling techniques, and using standard double-blind procedures can add an intuitive confidence to the experimental results. The techniques are very powerful in assisting in the exploration of the complex relationships intrinsic to the data.
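For contrast with the multivariate approach, the "standard univariate thresholds" mentioned in the Methods (p < 0.05, 1.5-fold change) can be sketched for a single spot. The abstract does not name a statistical test, so the exact permutation test used here is an assumption, and the spot volumes are hypothetical.

```python
from itertools import combinations
from statistics import mean

def fold_change(group_a, group_b):
    """Ratio of group means, oriented so the result is >= 1."""
    a, b = mean(group_a), mean(group_b)
    return max(a, b) / min(a, b)

def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = total = 0
    # Enumerate every way of relabeling n_a of the pooled values as group A.
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Hypothetical normalized spot volumes for one spot in two groups.
control = [100.0, 110.0, 95.0, 105.0]
treated = [170.0, 160.0, 180.0, 165.0]

fc = fold_change(control, treated)
p = permutation_p_value(control, treated)
passes = fc >= 1.5 and p < 0.05
```

A spot with a smaller shift would fail this screen even if it contributes to group discrimination in combination with other spots, which is the situation the abstract says multivariate selection can handle.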






P22-S

LC-MALDI Top-Down Biomarker Profiling and Identification S. Brand, M. Meyer, S. Hahner, D. Suckau; Bruker Daltonik GmbH, Bremen, Germany. The search for new and validated biomarkers is of particular interest in clinical areas such as oncology, neurology, toxicology, and pharmacology. One of the challenges in finding the right technology for biomarker research is to combine a statistically reasonable throughput (hundreds of samples such as serum, plasma, cell lysate, and urine) with an in-depth proteome technology. As proteolytic events play a significant role, in particular in disease-related events, biomarker discovery approaches may benefit from a top-down profiling approach, as proteolytic isoforms remain intact during the analysis. We present the combination of sample preparation based on magnetic nanoparticle purification (wax) or other pre-separation methods with the high-resolution HPLC-MALDI analysis of the undigested peptides and proteins. Proof-of-principle experiments included 36 samples in three groups spiked with different concentrations of a marker peptide. Multivariate statistics (PCA) achieved a proper grouping of the samples and detected the spiked material correctly from the complex matrix. Subsequent MS/MS spectra allowed the identification. First experiments clearly demonstrate that this technology significantly increases the number of detectable signals received from human serum (>1500) and is very reproducible. Therefore, this approach opens the door for high-throughput, in-depth analysis of clinical samples for the detection of biomarkers. Furthermore, the marker detection does not require previous identification (i.e., uncommon structures can qualify to be markers) and the overall MS/MS workload is greatly reduced. 
Finally the reduction of protein complexity per fraction after LC-MALDI separation allows the use of a simple and fast method for the identification of biomarker candidates: in situ digestion provides for protein identification directly from LC runs collected on MALDI targets. Thus we narrowed the gap between detection of biomarker candidates and their final identification. This identification is mandatory for validation and for any further diagnostic use of biomarkers. P23-M

Limit of Quantitation for Low-Abundance Proteins in Plasma by Targeted MS M. Burgess, H. Keshishian, E. Kuhn, T. Addona, S. A. Carr; Broad Institute, Cambridge, MA, United States. Biomarker discovery results in the creation of candidate lists of potential markers that must be subsequently verified in plasma.1 The most mature methods at present require abundant protein depletion and fractionation at the protein/peptide levels in order to detect and quantitate low ng/mL concentrations of plasma proteins by stable isotope dilution mass spectrometry. Sample-processing methods with sufficient throughput, recovery, and reproducibility to enable robust detection and quantitation of candidate biomarker proteins were evaluated by adding five non-native


proteins to immunoaffinity-depleted female plasma at varying concentrations (1000, 100, 50, 25, and 10 ng/mL). Each protein was monitored by one or more representative synthetic tryptic peptides labeled with [13C6]leucine or [13C5]valine. Following reduction, carbamidomethylation, and enzymatic digestion, two separate processing paths were compared. In path 1, digested plasma was diluted 1:10 and [13C] internal standards were added just prior to direct analysis by multiple reaction monitoring with LC-MS/MS (MRM LC-MS/MS). In path 2, peptides were separated by strong cation exchange (SCX), and [13C] internal standards were added to corresponding SCX fractions prior to analysis by MRM LC-MS/MS. Detection and quantitation by MRM used the response of at least two product ions from each of the signature peptides. Using processing path 1, we achieved detection and quantitation down to 50 ng/mL in depleted plasma. However, using processing path 2, we achieved detection and quantitation of all spiked proteins, including the non-native protein at 10 ng/mL. While analysis of non-fractionated plasma achieved higher recovery of those proteins detected in both processes, SCX fractionation at the peptide level clearly improves detection and LOQs for potential biomarker proteins in plasma. reference

1. Rifai N, Gillette MA, Carr SA. Protein biomarker discovery and validation: The long and uncertain path to clinical utility. Nat Biotechnol 2006;24:971–983.
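The stable isotope dilution calculation behind the abstract above can be sketched as follows. This is a simplified illustration, not the authors' data processing: it assumes the analyte and the labeled standard are expressed in the same concentration units, ignores digestion recovery, and averages the light/heavy ratio over the monitored product ions. All peak areas and the function name are hypothetical.

```python
from statistics import mean

def sid_concentration(light_areas, heavy_areas, heavy_spike_ng_per_ml,
                      dilution_factor=1.0):
    """Concentration from light/heavy peak-area ratios of several MRM transitions."""
    ratios = [light / heavy for light, heavy in zip(light_areas, heavy_areas)]
    # Mean ratio times the known standard concentration, corrected for dilution.
    return mean(ratios) * heavy_spike_ng_per_ml * dilution_factor

# Hypothetical peak areas for two product ions of one signature peptide.
light = [4000.0, 2500.0]   # unlabeled peptide from the spiked protein
heavy = [8000.0, 5000.0]   # 13C-labeled internal standard
conc = sid_concentration(light, heavy, heavy_spike_ng_per_ml=10.0,
                         dilution_factor=10.0)
```

The `dilution_factor` argument mirrors the 1:10 dilution of path 1. Using at least two product ions per peptide, as the abstract describes, lets disagreement between transition ratios flag interference.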

P24-T

Quantitative Proteomics of Formalin-Fixed Archival Tissue M. Darfler1, B. Hood 2 , T. Guiel1, T. Conrads3, T. Veenstra 3, D. Krizman1; 1Expression Pathology Inc., Gaithersburg, MD, United States, 2SAIC-Frederick, Inc., Frederick, MD, United States, 3SAIC-Frederick, Frederick, MD, United States. Capabilities for quantitative mass spectrometry–based proteomic profiling of formalin-fixed archival tissue have been developed. Using Liquid Tissue reagents and protocols, we have effectively profiled and validated known and novel cancer biomarkers across a wide variety of fixed cancer tissue samples. Recently developed applications include quantification of protein expression in fixed archival tissue with an emphasis on oral cavity (head and neck squamous cell carcinoma—HNSCC) and breast cancer. Spectral count bioinformatics provides a non-labeling method for quantitative proteomics of formalin-fixed histologically-defined oral cavity cancer. Results indicate many proteins were differentially expressed in cells obtained by laser microdissection of normal, highly differentiated, moderately differentiated, and poorly differentiated HNSCC fixed tissue. Candidate protein biomarkers found to be differentially expressed in the process of HNSCC progression were confirmed and validated by immunohistochemistry on large panels of HNSCC tissue. In addition, we have developed and evaluated a method for direct detection and absolute quantification by selected reaction monitoring (SRM) of HER2 directly in formalin-fixed paraffin-embedded

breast cancer tissue using a stable isotope standard peptide derived from HER2. Soluble protein extracts from a collection of breast cancer tissues known to express a range of HER2 were prepared using Liquid Tissue reagents, and quantitative levels were determined. Results demonstrate the ability to quantitate HER2 expression in Liquid Tissue extracts from fixed tissue sections that correlate with standard IHC and indicate the ability to quantify HER2 in immunohistochemical-negative cells. These cumulative results demonstrate development of technologies for quantitative proteomic analysis of proteins that can be applied to the vast worldwide formalin-fixed tissue archives. P25-S

Gene Expression Profiling from Formalin-Fixed, Paraffin-Embedded Tissues Using the QuantiGene Branched DNA Assay J. Davies, B. Maqsodi, W. Yang, Y. Ma, Y. Luo, G. McMaster; Panomics, Inc., Fremont, CA, United States. Large numbers of formalin-fixed, paraffin-embedded (FFPE) human tissue specimens with known clinical outcome are archived worldwide, representing a vast resource for biomarker and gene-disease association studies. However, RNA quality in FFPE tissues is compromised by chemical modifications and extensive fragmentation caused by formalin fixation. As a result, quantifying RNA in FFPE samples can be problematic. Here we describe application of the QuantiGene branched DNA (bDNA) assay to gene expression profiling of FFPE samples. This hybridization-based, signal amplification method measures RNA directly from FFPE tissue homogenates, avoiding variability introduced by RNA extraction as well as biases inherent to reverse transcription and amplification of target sequences. We show that the QuantiGene assay is insensitive to chemical modifications introduced by formalin fixation and can efficiently capture even highly degraded RNA. Small changes in gene expression are reliably measured even in FFPE tissues stored for more than 10 years. Results from a large number of independent studies, in which the QuantiGene assay was used to quantify more than 220 RNA targets in nearly 400 FFPE samples from nine different tissues, will be summarized. Data comparing the accuracy, precision, and sensitivity of QuantiGene and real-time, quantitative PCR (RT-qPCR) assays using FFPE samples will be presented. Additionally, we discuss the usefulness of parallel QuantiGene assays for assessment of FFPE tissue homogenate quantity and RNA quality, as well as improved methods for normalization of data from FFPE samples. Finally, we demonstrate the feasibility of simultaneous, multiplexed RNA measurements from FFPE tissue homogenates using the QuantiGene Plex Reagent System. 
Thus the QuantiGene FFPE Reagent System is a comprehensive solution for gene expression analysis of FFPE tissues, allowing retrospective studies that were previously not feasible.


P26-M

Detection of Candidate Protein Biomarkers in Human Serum by Multiple Reaction Monitoring: Improved Limits of Detection and Quantification C. Doneanu, W. Chen, A. Chakraborty, J. Gebler; Waters Corporation, Milford, MA, United States. Mass spectrometry–based biomarker discovery in biofluids produces a list of candidate proteins that must be verified and quantified in a large number of samples before a candidate becomes a useful diagnostic, prognostic, or pharmacodynamic marker. Because of the high sensitivity and specificity provided by multiple reaction monitoring (MRM), this MS/MS method has recently been used for verification and quantification of potential biomarkers. However, the wide dynamic range of protein concentrations in serum prohibits direct detection of many useful biomarkers in the low nanogram/mL to picogram/mL concentration range without any sample fractionation and/or enrichment. In this presentation, we evaluate the utility of two sample enrichment techniques for improving the limit of detection and limit of quantification (LOQ) for MRM analysis of several candidate protein biomarkers. In the first sample enrichment method, we used immuno-depletion to remove either the six most abundant serum proteins (90% serum depletion) or the twenty most abundant proteins (97% serum depletion) before MRM analysis of low-abundance potential biomarkers. The second sample enrichment method that we evaluated was the glycoprotein affinity enrichment method. Several low-abundance serum proteins were quantified by the MRM method using a triple quadrupole mass spectrometer coupled to a nanoscale liquid chromatograph. The effects of immuno-depletion and affinity enrichment on the LOQ of selected candidate protein biomarkers in human plasma were compared.
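A back-of-the-envelope calculation shows why the two depletion levels above differ more than the percentages suggest. The sketch assumes that "90% serum depletion" means 90% of the total protein mass is removed, that the remaining proteins are fully recovered, and that the same total protein amount is loaded before and after depletion. None of these assumptions is stated in the abstract, and the function name is hypothetical.

```python
def max_enrichment(fraction_removed):
    """Upper bound on enrichment of remaining proteins at fixed protein load."""
    return 1.0 / (1.0 - fraction_removed)

six_protein = max_enrichment(0.90)      # about 10-fold
twenty_protein = max_enrichment(0.97)   # about 33-fold
```

Under these idealized assumptions, moving from 90% to 97% depletion roughly triples the best-case enrichment of the proteins that remain. Real gains would be lower if low-abundance proteins are lost along with the depleted ones.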




P27-T

Evaluation of Fluoromethyl-2,2-difluoro-1-(trifluoromethyl)vinyl Ether (‘Compound A’) Effects on Urine Protein Excretion in Rats Using Mass Spectrometry K. Dong1, M. S. Minkoff1, J. D. Miller1, E. D. Kharasch2; 1Applied Biosystems, Framingham, MA, United States, 2Department of Anesthesiology, Washington University, St. Louis, MO, United States. Fluoromethyl-2,2-difluoro-1-(trifluoromethyl)vinyl ether (FDVE or “compound A”), a haloalkene degradant of the volatile anesthetic sevoflurane, is nephrotoxic in rats. FDVE bioactivation mediates the toxicity, but the molecular and cellular mechanisms of toxification are unknown. FDVE caused rapid and brisk changes in kidney gene expression, providing potential insights into mechanisms of toxicity, and potential biomarkers for nephrotoxicity.1 Nevertheless, it is unknown whether gene-expression changes are reflected in protein expression, or whether such tissue changes would be reflected in excreted urine proteins. This investigation evaluated FDVE effects on urine protein excretion using mass spectrometry. After Animal Use Committee approval, male Fischer 344 rats (250–300 g) housed in individual metabolic cages received a single intraperitoneal injection of 0.25 mmol/kg FDVE, and all urine was collected daily for 1 wk, as described previously.2 The samples were labeled with iTRAQ reagents, and both the 4800 MALDI TOF/TOF Analyzer and the 4000 Q TRAP system (AB/MDS SCIEX) were used to acquire data in MS and MS/MS modes. Data were processed with MarkerView software and ProteinPilot Software (AB/MDS SCIEX). The results demonstrate that FDVE causes certain alterations in urine protein/peptide excretion. Multiple components were differentially expressed in a time-dependent manner. Excretion of several endogenously excreted proteins was rapidly decreased by FDVE. Other native peptides showed increased excretion following FDVE, and then gradually decreased to pre-dose levels. Excretion of a third set of proteins/peptides, minimally or not detectable in controls, was upregulated following FDVE. Further experiments will be conducted to identify the protein/peptide markers using LC MALDI MS/MS and other technologies to further investigate the usefulness of MS for identifying biomarkers for FDVE nephrotoxicity. Supported by NIH DK53765. references

1. Kharasch ED. Gene expression profiling of nephrotoxicity from the sevoflurane degradation product fluoromethyl-2,2-difluoro-1-(trifluoromethyl)vinyl ether (“Compound A”) in rats. Toxicol Sci 2006;90:419–31.
2. Sheffels P, Schroeder JL, Altuntas TG, Liggitt HD, Kharasch ED. Role of cytochrome P4503A in cysteine S-conjugates sulfoxidation and the nephrotoxicity of the sevoflurane degradation product fluoromethyl-2,2-difluoro-1-(trifluoromethyl)vinyl ether (“compound A”) in rats. Chem Res Toxicol 2004;17:1177–89.

P28-S

Digging Deeper and Faster into Proteome by IgY-Immunoaffinity Fractionation X. Fang1, L. Huang1, S. Sikora1, D. Hinerfeld2, S. Tam2, P. Gagné3, G. G. Poirier3, C. Kusumoto4, K. Obata4, D. Q. Yang5, W. Zhang1; 1GenWay Biotech, Inc., San Diego, CA, United States, 2University of Massachusetts Medical School, Shrewsbury, MA, United States, 3Laval University, Québec City, PQ, Canada, 4PSS Bio Instruments, Inc., Livermore, CA, United States, 5PSS Bio Instruments, Inc., Gaithersburg, MD, United States. After separating the highly abundant proteins (HAP) by IgY affinity column, the next layer of abundant protein, moderately abundant proteins (MAP), becomes an obstacle to accessing the low abundant proteins (LAP), where the majority of biologically relevant and clinically important biomarkers reside. Therefore, isolation of MAP is a new challenge for effective detection and analysis of LAPs. To tackle this challenge, we further developed the IgY-microbead system by immunizing chickens with a flow-through fraction of IgY12 column and constructing the column with affinity-purified IgY antibodies against the


flow-through proteins of IgY12 column. The column developed, called SuperMix, was applied for further partitioning of the flow-through fraction of IgY12, which resulted in a bound/eluted fraction (designated as MAP fraction) and the flow-through fraction (designated as LAP fraction). Unfractionated and serial-fractionated samples using IgY12 and SuperMix columns were analyzed by SDS-PAGE and 2DE. Our data demonstrate that SuperMix columns specifically and reproducibly remove the post-IgY12 layer of the abundant proteins. A case study using SDS-PAGE coupled with LC/MS/MS demonstrates that the SuperMix column enabled specific capturing of 207 MAP, with 77 proteins being uniquely identified in high confidence (≥95%). This novel approach enables deeper and more effective access into the population of LAPs. In addition to digging deeper with the SuperMix column, we also have progressed in the direction of digging faster. One of the present challenges of plasma biomarker discovery is sample throughput limitations. In collaboration with PSS Bio Instruments, GenWay Biotech has developed a novel multiplex automated system (SepproTip) for high-throughput plasma sample processing. It permits processing of 12 samples at a time. This SepproTip system can process plasma samples using both IgY12 and SuperMix tips. The turnaround time of 12 samples per 65 min allows a large number of samples to be processed without decrease in sample preparation quality. P29-M

Sample Preparation for Body Fluid Profiling using Magnetic Bead Technology D. Gillooly1, J. Knol2, C. E. Teunissen3, M. Simmelink2,3, C. R. Jimenez2; 1Invitrogen Dynal AS, Oslo, Norway, 2OncoProteomics Laboratory, VUmc Cancer Center, Amsterdam, The Netherlands, 3Dept. of Molecular Cell Biology and Immunology, VU University Medical Center, Amsterdam, The Netherlands. Peptide profiling of biological samples using MALDI-TOF mass spectrometry (MS) is an increasingly popular approach used for the discovery of biomarkers and as a means to detect and diagnose disease and allow the assessment of disease severity, progression, and the effectiveness of treatments. Before mass spectrometry can be used to generate body fluid peptide profiles, reproducible, preferably automated, sample preparation procedures need to be used in which peptides are enriched and substances that interfere with MS analysis are removed. Dynabeads are uniform superparamagnetic monodisperse beads with a specific and defined surface for the adsorption/desorption and coupling of bioreactive molecules. We have developed ion exchange and reversed-phase magnetic beads for peptide isolation from body fluids. These beads enable large numbers of peptides to be processed and analyzed simultaneously. The use of multiple bead surfaces for the capture of peptides from the same sample increases the number of peptides detected in a given sample and may increase the likelihood of discovering novel biomarkers. Sample preparation procedures have been successfully automated on robotic platforms. Using peptide capture automated on the KingFisher96 coupled to MALDI-TOF-MS read-out, we investigated pre-analytical variables and have identified peptides sensitive to differences in clotting time. Furthermore, we have optimized the protocol for capture of native CSF peptides.

P30-T

Mass Spectrometric Analysis of Metals for Biomarker Identification and Confirmation in Biological Model Systems W. Johnson, Jr., L. Dimico, B. Buckley; EOHSI, Piscataway, NJ, United States. Metals are potentially good biomarkers because they cannot be created or destroyed the way organic analytes can. They are frequently an integral part of many biological systems, such as protein binding. One property of metals is their ability to act as a label for small-molecule metabolism. Specifically, many small organometallic molecules can be measured in a biological compartment using the metal as the label. Mancozeb (MZ), a fungicide comprised of an organic backbone and a metal moiety in a 9:1 ratio of Mn to Zn, is neurotoxic. Its metabolite, ethylene thiourea (ETU), is a known teratogen, carcinogen, and anti-thyroid compound. It is one molecule that has both an organic moiety and a metal analyte. Measuring the organic backbone is often more difficult than measuring the metal because the organic moiety appears to be unstable. Measurement of MZ in biological matrixes and demonstration of its uptake by neuronal cells proved difficult by conventional LC/MS. Quantification by liquid chromatography/mass spectrometry of the organic analyte inside the neuronal cells showed that about 8% of the compound crossed the cell membrane. This was significant because no one had measured MZ inside the cell, even though its neurodegenerative properties were known. The results were not completely conclusive because the precursor ion could not be measured inside the cell, only product ions. ICP/MS was used to measure the Mn content inside the cell and discover that about 8% of the MZ had entered the cell. The agreement between the two methods was significant and demonstrated the utility of the metal as a confirmatory analysis. It also confirmed that the Mn had entered the cell and not just the organic backbone. 
This presentation will focus on the use of metals as labels in biological systems, including analytical methods and stable isotope labeling. P31-S

Identification of Protein Biomarkers Associated with Hypoxia in Human Malignant Glioma Cell Lines Using Proteomics Technologies T. Kempf1, S. Rahn1, U. Warnken1, M. Schnoelzer1, W. Wick2; 1German Cancer Research Center, Heidelberg, Germany, 2University of Tuebingen Medical School, Tuebingen, Germany. Glioblastomas are the most common type of primary malignant brain tumor. These highly aggressive, rapidly growing tumors are exposed to hypoxia, which occurs as a consequence of inadequate blood supply. Hypoxia exerts a


variety of influences on tumor cell biology. Among these are activation of signal transduction pathways, adapting hypoxic tumor cells to an anaerobic environment. As a complementary approach to gene expression profiling, we aimed to investigate changes in the overall protein pattern of malignant glioma cell lines after hypoxia treatment compared to normoxic controls utilizing various proteomics techniques. The human malignant glioma cell line LNT-229 was initially used in our studies. Three individual samples representing three cell cultures grown under identical conditions were taken at four different time-points of hypoxia treatment (control, 8, 24, and 72 h). Proteins from total cell lysates were separated by high-resolution 2D gel electrophoresis and visualized by silver staining. Computer-assisted image analysis of the gels allowed the detection of differentially expressed proteins. Protein spots that were recognized to be up- or downregulated were identified by mass spectrometry in preparative gels. We have found 14 proteins upregulated and 6 proteins downregulated in hypoxia compared to normoxia based on the image analysis of the corresponding 2D gels. These results are currently being verified by Western blot analysis of samples from three different glioma cell lines: LN-18, U87, and LNT-229. Furthermore, all candidate proteins will be evaluated on primary cell cultures from human gliomas, and immunohistochemically on glioma tissue microarrays. P32-M

HT Proteomics LC/MS for Analysis of Low-Abundance Proteins, PTMs, and Biomarkers K. Nugent, P. Kent, L. Upton; Michrom Bioresources, Auburn, CA, United States. The introduction of electrospray ionization (ESI) in the 1980s revolutionized the analysis of biomolecules, and LC-ESI/MS (50–5000 µL/min) offers robust, high-throughput analyses for many biological applications. The introduction of nanospray (nESI) in the 1990s has made nLC-nESI/MS (10–1000 nL/min) a valuable tool for proteomics research, where high resolution and high sensitivity are maximized at the cost of sample throughput (60–240+ min per analysis) and robustness. Although protein biomarker analysis requires the high resolution and high sensitivity provided by nanoLC/MS, it also requires robustness and high throughput if it is to be useful in pharmaceutical and clinical applications. This study introduces a new axial desolvation, vacuum-assisted nano-capillary electrospray (ADVANCE) source for LC/MS, which has the robustness of LC-ESI/MS and the sensitivity of nanoLC-nanoESI/MS, but operates at flow rates of 0.2–100 µL/min. For identification of low-abundance proteins, PTMs, and biomarkers, an HT capLC-ADVANCE/MS system can analyze 50–150 host cell proteomic samples (digests of 1D gel slices, 2D gel spots, or MDLC fractions) in 24 h (10–30 min per analysis) at attomole to picomole sensitivities. The capLC-ADVANCE/MS system can also be used for validation of biomarker candidates, which requires comparative quantitation (ICAT, iTRAQ, SILAC) of proteomic samples (5–15 min per analysis) in host cells and physiological


fluids. Once protein biomarkers have been validated (very few have been to date), the capLC-ADVANCE/MS system can be used for very high-throughput (2–5 min per analysis) quantitation of biomarker signature peptides in a variety of diagnostic and therapeutic applications.

P33-T
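As a rough check on the throughput figures quoted in the capLC-ADVANCE/MS abstract above, the per-analysis run times can be converted into analyses per 24 h. The sketch below assumes continuous operation with no allowance for sample loading, column re-equilibration, or downtime, none of which the abstract specifies. The function and variable names are illustrative only.

```python
MINUTES_PER_DAY = 24 * 60


def samples_per_day(minutes_per_analysis: float) -> int:
    """Whole analyses that fit into 24 h of uninterrupted operation (assumed)."""
    if minutes_per_analysis <= 0:
        raise ValueError("analysis time must be positive")
    return int(MINUTES_PER_DAY // minutes_per_analysis)


# Run-time ranges quoted in the abstract, as (fastest, slowest) minutes per analysis.
# The mode labels are shorthand introduced here, not the authors' terminology.
MODES = {
    "identification": (10, 30),
    "comparative quantitation": (5, 15),
    "signature-peptide quantitation": (2, 5),
}


def throughput_table(modes):
    """Map each mode to (lowest, highest) analyses per 24 h."""
    return {
        name: (samples_per_day(slowest), samples_per_day(fastest))
        for name, (fastest, slowest) in modes.items()
    }


if __name__ == "__main__":
    for name, (low, high) in throughput_table(MODES).items():
        print(f"{name}: {low}-{high} analyses per 24 h")
```

Under these assumptions, the 10–30 min identification runs correspond to 48–144 analyses per day, which is consistent with the rounded 50–150 samples in 24 h stated in the abstract.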

Streamlining Plant Sample Preparation: The Use of High-Throughput Robotics to Process Echinacea Samples for Biomarker Profiling by MALDI-TOF Mass Spectrometry S. A. Schwartz1, I. Issac2, L. A. Greene1, J. R. Guthrie1, D. Gray1; 1Midwest Research Institute, Kansas City, MO, United States, 2Genomic Solutions, Ann Arbor, MI, United States. Several species in the genus Echinacea are beneficial herbs popularly used for many ailments. The most popular Echinacea species for cultivation, wild collection, and the development of herbal products include E. purpurea (L.) Moench, E. pallida (Nutt.) Nutt., and E. angustifolia (DC). Product adulteration is a key concern for the natural products industry, where botanical misidentification and the potential for introduction of other botanical and nonbotanical contaminants exist throughout the formulation and production process. Therefore, rapid and cost-effective methods that can be used to monitor these materials for complex product purity and consistency are of benefit to consumers and producers. The objective of this continuing research was to develop automated, high-throughput methods to process samples and differentiate Echinacea species by their MALDI-TOF mass profiles. Based on analysis of pure Echinacea samples, off-the-shelf products containing Echinacea could then be evaluated in a streamlined process. Leaf/flower pieces, seeds, and roots from E. purpurea and E. angustifolia; seeds and roots from E. pallida; and off-the-shelf Echinacea supplements were extracted in Tris buffer using bead-beating technology. Samples were then transferred, diluted, and deposited on a MALDI target plate for MS analysis using customized methods on a ProPrep liquid-handling system (Genomic Solutions). MS analysis of these samples highlighted key MS signal patterns from both small molecules and proteins that characterized the individual Echinacea samples analyzed. Corresponding analysis of dietary supplements was used to monitor Echinacea sample composition, including species and plant material used. These results highlight the potential for streamlined, automated approaches for agricultural species differentiation and botanical product evaluation.

P34-S

The Effect of Brief Exercise on Soluble Adhesion Molecules and Soluble Selectins in Early and Late Pubertal Males C. D. Schwindt; UCI, Orange, CA, United States. It now appears that many of the health effects of exercise are influenced by the balance of stress mediators and


growth factors. Related to these agents are the soluble adhesion molecules (sAM: ICAM and VCAM) and soluble selectins (sS: E-, L-, and P-selectins), which have been linked to such illnesses as asthma and cardiovascular disease. We hypothesized that brief exercise would alter circulating levels of sAM and sS, and that the response would be modified by the child’s pubertal status. Thirty healthy males (14 early pubertal, EP; 16 late pubertal, LP) performed ten 2-min bouts of exercise on a cycle ergometer. Blood was sampled at pre-exercise (PE) and end-exercise (EE). Levels of sAM and sS were analyzed using commercially available ELISAs. Mean PE levels were significantly different between groups for ICAM (p