
LC-IMS-MS Feature Finder: Detecting Multidimensional Features in LC-IMS-TOF MS Data

Kevin L. Crowell, Anuj R. Shah, Gordon W. Slysz, Brian L. LaMarche, Da Meng, Erin S. Baker, Matthew E. Monroe, Vlad A. Petyuk, John D. Sandoval, Gordon A. Anderson, and Richard D. Smith

Pacific Northwest National Laboratory, Richland, WA

• Configurable settings file used as an input allows any parameter of the algorithm to be altered
• Outputs results in multiple forms that are compatible with the downstream AMT tag pipeline
  – SQLite (new-generation pipeline)
  – Tab-delimited text (previous-generation pipeline)
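The two output forms named above can be illustrated with a minimal sketch. The table name, column names, and feature values are hypothetical; the poster does not give the actual schema or file layout used by the AMT tag pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical feature records; the field names are illustrative only.
features = [
    {"mass": 1000.512, "elution_time": 0.42, "drift_time_ms": 26.3,
     "charge": 2, "conformation_score": 0.83},
    {"mass": 1523.774, "elution_time": 0.57, "drift_time_ms": 27.1,
     "charge": 3, "conformation_score": 0.91},
]
columns = ["mass", "elution_time", "drift_time_ms", "charge", "conformation_score"]

# SQLite output (in-memory here; a file path would be used in practice).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE features (mass REAL, elution_time REAL, "
    "drift_time_ms REAL, charge INTEGER, conformation_score REAL)"
)
conn.executemany(
    "INSERT INTO features VALUES "
    "(:mass, :elution_time, :drift_time_ms, :charge, :conformation_score)",
    features,
)
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]

# Tab-delimited text output, written to a string buffer for illustration.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=columns, delimiter="\t", lineterminator="\n")
writer.writeheader()
writer.writerows(features)
tsv_text = buffer.getvalue()
```

Writing both forms from the same records keeps the two pipeline generations consistent with each other.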

Algorithm development

• Output of software will be used for peptide identifications or AMT tag database creation

• Data smoothing is implemented to handle features with low signal-to-noise ratios and low-abundance features

Introduction

• Report multiple conformations or co-eluting peptides as separate features
  – The algorithm does not distinguish between multiple conformations and co-eluting peptides

• Addition of ion mobility to the existing data analysis pipeline presents a number of challenges
• An extended pipeline is required to accurately identify peptides in LC-IMS-TOF MS data
• The extended pipeline should be able to detect co-eluting peptides and multiple conformations by using the added ion mobility dimension

LC-IMS TOF Instrument

LC-IMS-MS Features

Align and Peak Match

• Detected conformations should resemble a Gaussian distribution
  – Because the raw data contain few data points, the points of a detected conformation are interpolated to build the most accurate profile

LC-IMS-TOF MS platform [3]

• High-pressure converging hourglass ion funnel focuses and traps ions prior to ion injection
• 1-meter IMS drift cell
• Orthogonal Agilent TOF MS provides high mass measurement accuracy after IMS separation

[Figure: analysis workflow. Raw Data (UIMF) → Deisotope (DeconTools) → MS Features (Isos) → LC-IMS-MS Feature Finder → Peptide Identifications]

• Data acquired through Multiplexed Ion Mobility Time-of-Flight Mass Spectrometry [4]

For more information, see: http://omics.pnnl.gov/

[Figure: Detected peaks from smoothed data. Legend: unsmoothed data, detected peak 1, detected peak 2.]

Reproducibility

Report Drift Time

Calculate Conformation Score

Detect Peaks

Conclusions

• Multiple conformations and co-eluting peptides are often observed in the IMS dimension

[Figure: Actual peak data compared to theoretical peak. Legend: actual peak, theoretical peak. Conformation score = 0.8311. x-axis: drift time (ms).]

Results

[Figure: Drift time profiles of the same feature across technical replicates. Legend: datasets DS 1 to DS 10. x-axis: drift time (ms); y-axis: intensity.]

• Raw data is first smoothed using a Gaussian kernel smoother
• Peaks are detected from the smoothed data using a simple 3-point peak-picking algorithm
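The two steps above can be sketched as follows. This is an illustration, not the tool's C# implementation. The bandwidth, the synthetic drift-time profile, and the function names are assumptions. The smoother is a standard Nadaraya-Watson Gaussian kernel smoother, and the 3-point rule marks a point as a peak when it is strictly higher than both neighbours.

```python
import math

def gaussian_kernel_smooth(xs, ys, bandwidth):
    """Replace each y with a Gaussian-weighted mean of all ys (Nadaraya-Watson)."""
    smoothed = []
    for x0 in xs:
        weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
        total = sum(weights)
        smoothed.append(sum(w * y for w, y in zip(weights, ys)) / total)
    return smoothed

def three_point_peaks(ys):
    """Return indices whose value is strictly greater than both neighbours."""
    return [i for i in range(1, len(ys) - 1) if ys[i - 1] < ys[i] > ys[i + 1]]

# Synthetic profile (assumed values): two conformations plus alternating noise.
drift_ms = [25.3 + 0.1 * i for i in range(30)]
raw = [
    math.exp(-0.5 * ((t - 26.0) / 0.2) ** 2)
    + 0.6 * math.exp(-0.5 * ((t - 27.3) / 0.2) ** 2)
    + (0.02 if i % 2 else -0.02)
    for i, t in enumerate(drift_ms)
]

smooth = gaussian_kernel_smooth(drift_ms, raw, bandwidth=0.15)
peak_times = [round(drift_ms[i], 1) for i in three_point_peaks(smooth)]
```

Picking peaks on the raw points would report many spurious maxima from the noise. Smoothing first leaves only the two underlying conformations.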


• Software is integrated into the existing AMT tag pipeline
  – Run time is less than 10 min, which allows the software to keep up with the high-throughput instrumentation
  – Robustness of the software allows users to accommodate future IMS-TOF instrument updates


• Scoring function of a single conformation can be used by downstream analysis tools as a confidence measure


• Theoretical peak is given a standard deviation equal to the expected resolution of the IMS-TOF instrument
• Conformation score:

$$\text{Conformation score} = 1 - \frac{\sum_{i=1}^{n} \left| \text{actual}_i - \text{theoretical}_i \right|}{n}$$
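A minimal sketch of the score, read here as one minus the mean absolute difference between the actual and theoretical profiles over n interpolated points. Several details are assumptions not stated on the poster: linear interpolation, scaling the actual profile to a maximum of 1, centering the theoretical Gaussian on the apex of the actual profile, and the example `sigma` value.

```python
import math

def interpolate(xs, ys, n):
    """Linearly interpolate (xs, ys) onto n evenly spaced points spanning xs."""
    step = (xs[-1] - xs[0]) / (n - 1)
    grid, values, j = [], [], 0
    for k in range(n):
        x = min(xs[0] + k * step, xs[-1])
        # Advance to the raw-data segment that contains x.
        while j < len(xs) - 2 and xs[j + 1] < x:
            j += 1
        frac = (x - xs[j]) / (xs[j + 1] - xs[j])
        grid.append(x)
        values.append(ys[j] + frac * (ys[j + 1] - ys[j]))
    return grid, values

def conformation_score(xs, ys, sigma, n=1000):
    """Score = 1 - mean(|actual_i - theoretical_i|) over n interpolated points."""
    grid, actual = interpolate(xs, ys, n)
    apex_index = max(range(n), key=actual.__getitem__)
    top = actual[apex_index]
    actual = [a / top for a in actual]  # standardize intensity to a maximum of 1
    apex = grid[apex_index]
    theoretical = [math.exp(-0.5 * ((x - apex) / sigma) ** 2) for x in grid]
    return 1.0 - sum(abs(a - t) for a, t in zip(actual, theoretical)) / n

# Assumed example: sigma = 0.3 ms stands in for the expected instrument resolution.
xs = [25.3 + 0.1 * i for i in range(21)]
clean = [math.exp(-0.5 * ((x - 26.3) / 0.3) ** 2) for x in xs]
wide = [math.exp(-0.5 * ((x - 26.3) / 0.6) ** 2) for x in xs]
score_clean = conformation_score(xs, clean, sigma=0.3)
score_wide = conformation_score(xs, wide, sigma=0.3)
```

A profile that matches the theoretical width scores close to 1. A profile twice as wide scores noticeably lower, which is the behaviour a downstream confidence measure needs.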

• Ion mobility drift time profile is seen as repeatable across multiple analyses of the same sample type

• IMS profile is reproducible across technical replicates
• Multiple conformations are also reproducible

• Peptides can be identified by using mass, elution time, charge state, and drift time reported by software

In the conformation score, i indexes the n = 1000 interpolated points.

Distribution of multiple conformations

Conformation score

This work was funded by NIH NCRR ARRA Supplement Implementation of a Next Generation Proteomics Capability and U.S. Department of Energy Biological and Environmental Research (DOE/BER). Samples were analyzed using capabilities developed under the support of the NIH National Center for Research Resources (RR18522) and DOE/BER. Significant portions of the work were performed in the Environmental Molecular Sciences Laboratory, a DOE/BER national scientific user facility at Pacific Northwest National Laboratory (PNNL) in Richland, Washington. PNNL is operated for the DOE by Battelle under contract DE-AC05-76RLO-1830.


Acknowledgements

Drift time accuracy


Count

• Integrated into PNNL’s accurate mass and time (AMT) tag pipeline [1]

• Accepts DeconTools [2] output (Isos file) as input

Smooth Raw Data Points

Standardized intensity

• Software tool is developed in C# (.NET 4.0) and uses Microsoft .NET’s Parallel Extensions Library for task parallelism and multithreading

• Microsoft .NET’s Parallel Extensions Library
  – Data is partitioned by mass and charge at multiple points by the algorithm
  – Partitioned data are processed simultaneously across multiple processors to speed up runtime
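The partition-then-process pattern described above can be sketched as follows. Python threads stand in for the .NET Parallel Extensions here. The partitioning rule (group by charge, then split wherever sorted masses differ by more than `mass_gap`), the feature fields, and the `summarize` placeholder are assumptions for illustration. A CPU-bound Python workload would typically use processes rather than threads.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def partition_by_charge_and_mass(features, mass_gap):
    """Group by charge, then split each group where sorted masses jump by more than mass_gap."""
    by_charge = defaultdict(list)
    for feature in features:
        by_charge[feature["charge"]].append(feature)
    partitions = []
    for charge in sorted(by_charge):
        group = sorted(by_charge[charge], key=lambda f: f["mass"])
        current = [group[0]]
        for feature in group[1:]:
            if feature["mass"] - current[-1]["mass"] > mass_gap:
                partitions.append(current)
                current = [feature]
            else:
                current.append(feature)
        partitions.append(current)
    return partitions

def summarize(partition):
    """Placeholder for per-partition work; returns (charge, mean mass, feature count)."""
    masses = [f["mass"] for f in partition]
    return (partition[0]["charge"], sum(masses) / len(masses), len(partition))

# Hypothetical MS features.
features = [
    {"mass": 1000.00, "charge": 2},
    {"mass": 1000.02, "charge": 2},
    {"mass": 1500.50, "charge": 2},
    {"mass": 1000.01, "charge": 3},
    {"mass": 1500.52, "charge": 2},
]

partitions = partition_by_charge_and_mass(features, mass_gap=0.05)
# Each partition is independent, so the partitions can be processed concurrently.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(summarize, partitions))
```

Because no partition shares features with another, the workers need no locking, and `pool.map` returns results in partition order.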

Pull Out Raw Data Points

# of occurrences

• Software tool for discovery and characterization of possible peptide signatures in LC-IMS-TOF MS
  – Characterize a feature by mass, elution time, drift time, and charge state
  – Detect multiple conformations of a peptide
  – Detect LC co-eluting peptides by leveraging data in the IMS dimension
  – Provide confidence scoring for detected conformations

Software development

Standardized intensity

Overview

% of features

Methods

Conformation Detection


References

[Figure: Distribution of multiple conformations. x-axis: charge state (1 to 4); y-axis: % of features.]

• Multiple conformations are seen most often in 3+ features
• Multiple conformations are rarely seen in 1+ features
• On average, about 10% of detected features contain multiple conformations

[Figure: Distribution of conformation scores. x-axis: conformation score (0.3 to 1).]

The conformation scores follow an approximately normal distribution that is skewed toward higher scores

[Figure: Histogram of drift time error. x-axis: drift time error (ms), from -0.3 to 0.3.]

• Drift time error is calculated by matching features across multiple datasets of technical replicates using MultiAlign
• Errors are only considered for features seen in all 10 datasets
• Since these features are highly reproducible, they should give the best possible error in drift time
• Drift time accuracy of these highly confident features is about 0.1 ms
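The error calculation can be sketched as follows, assuming the features have already been matched across datasets (the poster uses MultiAlign for that step, which is not reproduced here). Measuring each error as the deviation from the feature's mean drift time across datasets is an assumption; the poster does not state the reference value. The input layout and names are hypothetical.

```python
from statistics import mean

def drift_time_errors(matched, n_datasets):
    """Collect drift time deviations for features observed in every dataset.

    matched maps a feature id to {dataset id: drift time in ms}.
    """
    errors = []
    for drift_by_dataset in matched.values():
        if len(drift_by_dataset) != n_datasets:
            continue  # keep only features seen in all datasets
        center = mean(drift_by_dataset.values())
        errors.extend(d - center for d in drift_by_dataset.values())
    return errors

# Hypothetical matches across 3 datasets (the poster uses 10).
matched = {
    "feature_A": {1: 26.0, 2: 26.1, 3: 25.9},
    "feature_B": {1: 27.0, 2: 27.2},  # missing from dataset 3, so excluded
}
errors = drift_time_errors(matched, n_datasets=3)
```

Restricting the calculation to features present in every replicate keeps sporadic or mismatched features out of the error estimate.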

1. Kiebel, et al. PRISM: A data management system for high-throughput proteomics. Proteomics 6:1783-1790 (2006).
2. N Jaitly, et al. Decon2LS: An Open-source Software Package for Automated Processing and Visualization of High Resolution Mass Spectrometry Data. BMC Bioinformatics 10:87 (2009).
3. K Tang, et al. High-Sensitivity Ion Mobility Spectrometry/Mass Spectrometry Using Electrodynamic Ion Funnel Interfaces. Anal. Chem. 77:3330-3339 (2005).
4. ME Belov, et al. Dynamically Multiplexed Ion Mobility Time-of-Flight Mass Spectrometry. Anal. Chem. 80:5873-5883 (2008).

CONTACT: Kevin Crowell
Biological Sciences Division, K8-98
Pacific Northwest National Laboratory
P.O. Box 999, Richland, WA 99352
E-mail: [email protected]