A critical assessment of the performance criteria ... - Wiley Online Library

Drug Testing and Analysis

Research article Received: 19 August 2015

Revised: 3 December 2015

Accepted: 7 April 2016

Published online in Wiley Online Library

(www.drugtestinganalysis.com) DOI 10.1002/dta.2021

A critical assessment of the performance criteria in confirmatory analysis for veterinary drug residue analysis using mass spectrometric detection in selected reaction monitoring mode

Bjorn J.A. Berendsen,a* Thijs Meijer,a Robin Wegh,a Hans G.J. Mol,a Wesley G. Smyth,b S. Armstrong Hewitt,b Leen van Ginkel,a and Michel W.F. Nielen,a,c

Besides the identification point system to assure adequate set-up of instrumentation, European Commission Decision 2002/657/EC includes performance criteria regarding relative ion abundances in mass spectrometry and chromatographic retention time. In confirmatory analysis, the relative abundance of two product ions acquired in selected reaction monitoring (SRM) mode, the ion ratio, should be within certain ranges for confirmation of the identity of a substance. The acceptable tolerance of the ion ratio varies with the relative abundance of the two product ions and, for retention time, CD 2002/657/EC allows a tolerance of 5%. Because of rapid technical advances in analytical instruments and new approaches applied in the field of contaminant testing in food products (multi-compound and multi-class methods), a critical assessment of these criteria is justified. In this study a large number of representative, though challenging, sample extracts were prepared, including muscle, urine, milk and liver, spiked with 100 registered and banned veterinary drugs at levels ranging from 0.5 to 100 μg/kg. These extracts were analysed in SRM mode using different chromatographic conditions and mass spectrometers from different vendors. In the initial study, robust data were collected using four different instrumental set-ups. Based on a unique and highly relevant data set, consisting of over 39 000 data points, the ion ratio and retention time criteria for applicability in confirmatory analysis were assessed. The outcomes were verified based on a collaborative trial including laboratories from all over the world.

It was concluded that the ion ratio deviation is not related to the value of the ion ratio, but rather to the intensity of the lowest product ion. Therefore a fixed ion ratio deviation tolerance of 50% (relative) is proposed, which is also applicable for compounds present at sub-ppb levels or having poor ionisation efficiency. Furthermore, it was observed that retention time shifts, when using gradient elution, as is common practice nowadays, are mainly observed for early eluting compounds. Therefore a maximum retention time deviation of 0.2 min (absolute) is proposed. These findings should serve as input for discussions on the revision of currently applied criteria and the establishment of a new, globally accepted, criterion document for confirmatory analysis. Copyright © 2016 John Wiley & Sons, Ltd.

Additional supporting information may be found in the online version of this article at the publisher’s web site.

Keywords: performance criteria; mass spectrometry; liquid chromatography; ion ratio; confirmatory analysis

Introduction

Drug Test. Analysis 2016, 8, 477–490

Collaborators: Terry Dutko (USDA, Food Safety and Inspection Service, Midwestern Laboratory, St. Louis, MO, USA), Nathalie Gillard (CER Groupe, Marloie, Belgium), Nadezhda Stoilova (CLVCE, Sofia, Bulgaria), Andréa Melo Garcia de Oliveira (LRM, LANAGRO-MG, Pedro Leopoldo, MG, Brazil), Steven Lehotay (USDA, Agricultural Research Service, Eastern Regional Research Center, Wyndmoor, PA, USA), Nélio Fleury Filho (LANAGRO-GO, Goiânia, GO, Brazil) and Perry Martos (University of Guelph, Laboratory Services Division, Agriculture and Food Laboratory, Chemistry Method Development, Guelph, ON, Canada). Travel by the Lead Author to present this manuscript at SASKVAL III was funded by the OECD. a RIKILT, Wageningen UR, Akkermaalsbos 2, 6708WB, P.O. Box 230, 6700AE Wageningen, the Netherlands b Agri-Food and Biosciences Institute, Stoney Road, Stormont, Belfast BT4 3SD. Stormont, Northern Ireland, UK c Laboratory of Organic Chemistry, Wageningen University, Dreijenplein 8, 6703 HB Wageningen, the Netherlands


It is of utmost importance to ensure that decisions on the approval or rejection of food products are based on sound analytical science and evidence. The reliability of an analytical result obtained during residue testing depends on the type of methodology used and the performance criteria applied. These conditions come together in the statement that the method used was ‘fit for purpose and adequately validated’. For confirmatory methods, the phrase ‘fit for purpose’ includes that the reliability of the confirmation of the identity is beyond reasonable doubt. For this purpose European Commission Decision (CD) 2002/657/EC[1] introduced the concept of identification points. In residue analysis, mass spectrometry (MS), either in combination with liquid chromatography (LC) or gas chromatography (GC), was assigned as the main technique for confirmation of the identity of banned and regulated substances.[1] Besides statements about techniques and methods being fit for purpose and adequately validated, confirmatory analysis involves

* Correspondence to: Bjorn Berendsen, RIKILT Wageningen UR, Akkermaalsbos 2, 6708WB Wageningen, the Netherlands. E-mail: [email protected]


the comparison of several physico-chemical properties, such as the chromatographic retention time (RT) and (selected parts of) the mass spectrometric (MS) fragmentation pattern of the detected compound with a reference compound. Selectivity, which is defined as ‘the power of discrimination between the analyte and closely related substances…’,[1] is the main parameter of interest in the confirmation of the identity of a compound. Criteria for confirmatory methods in the analysis of food contaminants and residues,[1–3] sports doping,[4–8] and forensic sciences,[9] are laid down in several regulations. Recent overviews of the application of identification criteria in these fields can be found elsewhere.[10,11] Recently the Codex Committee on Residues of Veterinary Drugs in Food (CCRVDF) included performance criteria in annex C of CAC/GL 71-2009,[12] primarily based upon the EU guidelines.[1] CD 2002/657/EC[1] includes performance criteria regarding retention time and relative ion abundances (ion ratio) for confirmatory analysis. In confirmatory analysis, the relative abundance of two product ions, acquired in selected reaction monitoring (SRM) mode, the ion ratio, should be within certain ranges for confirmation of the identity of a substance. 
Even though these criteria for veterinary drug residue analysis[1] were established based on experts’ judgement rather than on evidence from real data, they were found to be applicable in confirmatory analysis: recent systematic studies provided scientific evidence that these guidelines indeed prevent the occurrence of false positive results,[13] even though some exceptions have been reported.[14] Since the publication of CD 2002/657/EC,[1] now more than a decade ago, technology available to the residue testing laboratory has advanced significantly with the development of fast scanning and sensitive MS instruments that can be combined with orthogonal separation techniques like ultra-high performance liquid chromatography (UHPLC).[15] These technological innovations have triggered the development of new analytical multi-residue and even multi-class screening and confirmatory methods that include simplified sample preparation procedures.[16] Lehotay et al.[11] emphasize that human knowledge, intelligence, and common sense should be a basis for confirmation of the identity rather than fixed criteria. These factors can add to selectivity, but in our experience, fixed guidelines, based on established robust data, are mandatory in case of a lawsuit. In this study, we aimed to assess existing criteria in the light of currently applied methodologies and to develop new evidence-based criteria applicable to modern and emerging analytical methods applied in the field of veterinary drug residue testing. A similar assessment was carried out previously in the field of pesticide residue testing[10] on the basis of which guidelines for confirmatory analysis in that field were revised.[4] Datasets were constructed from the analysis of in-house prepared homogeneous materials using relevant and state-of-the-art (front-end) analytical instruments, combining chromatographic separation and mass spectrometric detection techniques.
These datasets provided the basis for the proposed new/amended criteria. Then the amended criteria were validated by a collaborative study employing in-house prepared homogeneous unknown test materials in collaboration with residue testing laboratories from all over the world, to ensure validity of the proposed criteria for confirmatory analysis.

Experimental

Reagents

HPLC grade methanol (MeOH), acetonitrile (ACN), 2-propanol, acetone and ethyl acetate were obtained from Biosolve (Valkenswaard, the Netherlands). Ammonium formate, sodium acetate, acetic acid, formic acid, sodium chloride, 25 % ammonia, sodium hydroxide and β-glucuronidase/arylsulfatase from Helix pomatia were obtained from Merck (Whitehouse Station, NJ, USA). Milli-Q water was prepared using a Milli-Q system at a resistivity of at least 18.2 MΩ·cm (Millipore, Billerica, MA, USA). McIlvaine-EDTA buffer was prepared by dissolving 74.4 g disodium ethylenediaminetetraacetic acid (EDTA, VWR International, Darmstadt, Germany) in 500 mL 0.1 M citric acid (VWR) and 280 mL 0.2 M phosphate buffer (VWR). The pH was adjusted to 4.0 by adding 0.1 M citric acid and the total volume was made up to 2 L. All reference standards were >95 % purity or of the highest quality available and were obtained from Sigma Aldrich, Toronto Research Chemicals (Toronto, ON, Canada), Steraloids (Newport, RI, USA), or RIKILT Wageningen UR (Wageningen, the Netherlands).

Initial study

Sample extracts

For the initial study, five types of sample extracts were prepared. Porcine muscle and bovine urine extracts were prepared both by solvent extraction only and following an additional solid-phase extraction (SPE) clean-up procedure. Milk extracts were prepared by SPE clean-up after protein removal. Each type of extract was prepared in 6-fold using materials originating from different animals, which were pooled afterwards. The final extracts were split into three separate aliquots and the model compounds were then added to obtain extracts containing the equivalent of 0.5, 2.5, and 5.0 μg/kg per sample for the banned compounds and 10, 50, and 100 μg/kg for the registered compounds. Additionally, standards in solvent were prepared at the same levels. The muscle extracts cleaned by solvent extraction only are referred to as matrix A, the muscle extracts after SPE as B, urine using solvent extraction as C, urine after SPE as D, milk extracts as E and the standards in solvent as F.

Muscle and urine, solvent extraction (A and C)

An aliquot of porcine muscle (2 g) or bovine urine (3 mL) was transferred into a polypropylene (PP) centrifuge tube. EDTA solution (0.5 mL, 0.2 M) and 5 mL 2-propanol were added. After shaking, 0.5 g NaCl was added to support phase separation and the tube was shaken vigorously before placing into a rotary tumbler for 15 min. After centrifugation (3500 g, 10 min) the upper layer (2-propanol) was transferred into a clean PP tube. The solvent was evaporated (40 °C, N2) and the residue was reconstituted: urine samples in 500 μL water and muscle extracts in 50 μL MeOH, the latter followed by dilution with 450 μL water to obtain clear extracts. The final extracts contained 4 g/mL muscle or 6 mL/mL urine matrix.

Muscle and milk, SPE (B and E)

An aliquot of porcine muscle (2 g) or bovine milk (2 mL) was transferred into a PP centrifuge tube. Acetonitrile (6 mL) and 4 mL water were added. After shaking by hand the sample was placed into a rotary tumbler for 30 min. After centrifugation (3500 g, 10 min) 3 mL of the extract was diluted with 57 mL water facilitating SPE cleanup. A Phenomenex (Torrance, CA, USA) Strata-X RP 200 mg/6 mL SPE cartridge was conditioned by applying 3 mL of MeOH followed by 3 mL of water. The complete diluted extract was applied onto the SPE cartridge which was subsequently washed with 3 mL water. After drying the cartridge under vacuum for 5 min, the elution was carried out with 3 mL MeOH/ACN (1:1 v/v) for muscle and 3 mL MeOH for milk. The solvent was evaporated (40 °C, N2) and the residue was reconstituted in 500 μL water. The final extracts contained 1.2 g/mL muscle or 1.2 mL/mL milk matrix.
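The matrix equivalents quoted for the final extracts can be checked from the sample amount, the fraction of the extract carried forward, and the reconstitution volume. The sketch below is only an illustration: the function name is hypothetical, and it assumes that the sample itself adds no volume to the extraction solvent and that no analyte or matrix is lost during clean-up.

```python
def matrix_equivalent(sample_amount, extract_volume_ml, aliquot_ml, final_volume_ml):
    """Amount of sample (g or mL) represented by 1 mL of final extract.

    Assumes (hypothetically) no losses and no volume contribution
    from the sample itself.
    """
    fraction_taken = aliquot_ml / extract_volume_ml
    return sample_amount * fraction_taken / final_volume_ml


# Muscle, SPE (B): 2 g extracted with 6 mL ACN + 4 mL water (10 mL total),
# 3 mL carried through SPE, residue reconstituted in 0.5 mL water.
muscle_spe = matrix_equivalent(2.0, 10.0, 3.0, 0.5)

# Muscle, solvent extraction (A): the whole extract is carried forward
# and reconstituted in 0.5 mL (50 uL MeOH + 450 uL water).
muscle_solvent = matrix_equivalent(2.0, 1.0, 1.0, 0.5)

print(round(muscle_spe, 2), round(muscle_solvent, 2))
```

Under these assumptions the sketch gives 1.2 g/mL for the SPE-cleaned muscle extract and 4 g/mL for the solvent-extracted muscle extract, in line with the values stated in the text.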

Urine, SPE (D)

An aliquot of bovine urine (3 mL) was transferred into a PP centrifuge tube and 2 mL sodium acetate buffer (0.25 M, pH = 4.8) was added. After shaking, the pH was adjusted to pH = 4.8 by adding droplets of 1 M acetic acid. β-glucuronidase/arylsulfatase (25 μL) was added to the extract which was then incubated for 1.5 h at 50 °C. A Waters (Manchester, UK) Oasis MCX 60 mg/3 mL SPE cartridge was conditioned by applying 3 mL of MeOH followed by 3 mL of sodium acetate buffer (0.25 M, pH = 4.8). The complete extract was applied onto the SPE cartridge which was subsequently washed with 1 mL acetic acid (1 M) and 3 mL sodium acetate buffer (0.25 M, pH = 4.8) / acetone (85:15 v/v). After drying the cartridge under vacuum for 5 min, the elution was carried out with 3 mL ammonia (3%, v/v) in ethyl acetate. The solvent was evaporated (40 °C, N2) and the residue was reconstituted in 500 μL water. The final extracts contained 6 mL/mL urine matrix.

Sample spiking

One hundred veterinary drugs (Supporting Information, S1) were added to all sample extracts. Four mixed solutions of veterinary drugs were prepared by combining stock solutions (either consisting of water or MeOH) of the individual compounds. Of these mixed solutions a small volume was added to the matrix extracts resulting in a final percentage of organic modifier in the aqueous extracts of 6, 10, and 14% for the low, medium, and high concentration levels respectively.

Instrumental analysis

All samples were analyzed using different HPLC and MS instrumentation (Table 1) and different conditions, including solvents and gradients. The systems, coded I–IV, include UHPLC using LC columns packed with fully porous materials (II and IV) or core-shell particles (I), and a micro-fluidics UPLC set-up (III). Samples were analyzed in the following order: all samples spiked at the lowest level (extracts A, B, C, D, E, and F, respectively), then all samples spiked at the intermediate level, and finally all samples spiked at the high level, in the same order. Using this approach, analyses of samples of the same matrix are spread throughout a large timeframe, resulting in data that are more representative of routine analysis.

System I

The LC system consisted of a Waters (Milford, MA, USA) Acquity iclass UPLC using a core-shell particle Phenomenex Kinetex C18, 2.1 × 100 mm, 1.7 μm column, placed in a thermostated column oven set at 30 °C. The gradient (solvent A, 2 mM ammonium formate and 0.016% formic acid in water; solvent B, 2 mM ammonium formate and 0.016% formic acid in methanol): 0–0.5 min, 1% B, 0.5–5.0 min, linear increase to 100% B, with a final hold of 1 min, operating at a flow rate of 0.3 mL/min. The injection volume was 5 μL. Detection was carried out in SRM mode using an AB Sciex QTrap 6500 MS in positive electrospray ionization mode using the ion transitions as presented in the Supporting Information, S1.

System II

The LC system consisted of a Waters (Milford, MA, USA) Acquity iclass UPLC using a Waters Acquity UPLC BEH, 2.1 × 100 mm, 1.7 μm, column, placed in a thermostated column oven set at 40 °C. The gradient (solvent A, 0.1% acetic acid in water; solvent B, 0.1% acetic acid in methanol): 0–1.0 min, 0% B, 1.0–2.5 min, linear increase to 45% B, 2.5–8.5 min, linear increase to 100% with a final hold of 1.8 min, operating at a flow rate of 0.4 mL/min. The injection volume was 5 μL. Detection was carried out in SRM mode using a Waters Xevo TQS MS in positive electrospray ionization mode using the ion transitions as presented in the Supporting Information, S1.

System III

The LC system consisted of a Waters (Milford, MA, USA) Acquity iclass UPLC using a Waters iKey BEH C18, 150 μm × 50 mm, 1.7 μm. The gradient (solvent A, 0.1% formic acid in water; solvent B, 0.1% formic acid in acetonitrile): 0–2.0 min, 1% B, 1.0–8.0 min, linear increase to 99% B, with a final hold of 2 min, operating at a flow rate of 2 μL/min. The injection volume was 2 μL. Detection was carried out in SRM mode using a Waters Xevo TQS in positive electrospray ionization mode using the ion transitions as presented in the Supporting Information, S1.
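The gradient programs described for the LC systems can be written as a table of (time, %B) breakpoints with linear interpolation in between. The following is a minimal sketch using the System I gradient as stated in the text; the function and variable names are hypothetical, and the re-equilibration step is not modelled because it is not described.

```python
def percent_b(t, program):
    """Return %B at time t (min) by linear interpolation between breakpoints.

    `program` is a list of (time_min, percent_b) tuples in ascending time.
    Outside the programmed window the first or last value is returned.
    """
    if t <= program[0][0]:
        return program[0][1]
    for (t0, b0), (t1, b1) in zip(program, program[1:]):
        if t0 <= t <= t1:
            return b0 + (b1 - b0) * (t - t0) / (t1 - t0)
    return program[-1][1]


# System I: 0-0.5 min at 1% B, 0.5-5.0 min linear increase to 100% B,
# then a final hold of 1 min.
SYSTEM_I = [(0.0, 1.0), (0.5, 1.0), (5.0, 100.0), (6.0, 100.0)]

for t in (0.25, 2.75, 5.5):
    print(t, percent_b(t, SYSTEM_I))
```

Halfway through the linear ramp (2.75 min) this gives 50.5% B, which illustrates how quickly the mobile phase composition changes for compounds eluting in the middle of the run.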

Table 1. Instruments used in the initial and collaborative study, including HPLC instrument, analytical column, and MS.

INITIAL STUDY

System code | LC column | LC | MS
I | Kinetex C18, 1.7 μm | Acquity | Qq-LIT, AB Sciex 6500
II | Acquity BEH, 1.7 μm | Acquity | QqQ, Waters Xevo TQS
III | iKey BEH C18, 1.7 μm | Acquity | QqQ, Waters Xevo TQS
IV | Agilent Eclipse plus 1.8 μm | Agilent Infinity 1290 | QqQ, Agilent 6490

COLLABORATIVE STUDY

Laboratory code | LC column | LC | MS
1 | Waters Acquity BEH / HSS T3 | Waters Acquity UPLC | Waters Xevo TQS
2 | Phenomenex Gemini C18 110 Å, 2.0 × 150 mm, 5 μm | Thermo Finnigan Surveyor | Thermo TSQ Quantum Discovery Max
3 | Kinetex C18, 4.6 × 50 mm, 2.6 μm | Waters, HT Alliance 2795 | Waters, Quattro Premier XE
4 | Poroshell 120 EC-C18, 3.0 × 50 mm, 2.7 μm | Agilent 1200 L | AB Sciex 5000
5 | UCT Selectra DA, 100 × 2.1 mm, 3 μm | Agilent 1100 | AB Sciex 6500
6 | HSS T3, 2.1 × 100 mm, 1.8 μm | Waters Acquity UPLC | Waters TQD
7 | Agilent Poroshell, 2.1 × 50 mm, 2.7 μm | Shimadzu 20 AD XR (UFLC) | AB Sciex 5500

System IV

The LC consisted of an Agilent (Santa Clara, CA, USA) Infinity 1290 system using an Agilent Eclipse plus C18, 2.1 × 100 mm, 1.8 μm, column, placed in a thermostated column oven set at 50 °C. Two HPLC separations were applied to cover the analysis of all compounds. The first gradient (solvent A, 0.5 mM ammonium acetate in water/methanol (95:5 v/v); solvent B, 0.5 mM ammonium acetate in water/methanol (1:9 v/v)) was: 0–0.75 min, 10% B, 0.75–2.6 min, linear increase to 40% B, 2.6–5.0 min, linear increase to 90% B with a final hold of 1.9 min, operating at a flow rate of 0.5 mL/min. The injection volume was 0.5 μL. The second gradient (solvent A, 0.1% formic acid in water/methanol (95:5 v/v); solvent B, 0.1% formic acid in water/methanol (1:9 v/v)) was: 0–0.75 min, 10% B, 0.75–2.6 min, linear increase to 40% B, 2.6–5.0 min, linear increase to 90% B with a final hold of 1.75 min, operating at a flow rate of 0.5 mL/min. The injection volume was 0.75 μL. Detection was carried out in SRM mode using an Agilent 6490 MS in positive electrospray ionization mode. Some ion transitions differed compared to the Supporting Information, S1. For the system having acetate in the mobile phase these were: oxyphenylbutazone 325.1 > 160.0 and 204.0, naproxen 231.0 > 170.2 and 185.0, diclofenac 296.0 > 213.9 and 249.2, bromoxynil 276.1 > 78.9 and 170.9, clenbuterol 277.2 > 202.8 and 168.1, and bithionol 353.1 > 160.9 and 193.8. For the system having formic acid in the mobile phase, these were: tylvalosin 1042 > 814.1 and 109.0, erythromycin 734.2 > 158.0 and 576.0, natamycin 666.3 > 503.2 and 485.2, spiramycin 422.2 > 101.1, and tilmicosin 435.0 > 397.0 and 98.8.

Data analysis

The data obtained using the different systems were evaluated separately. After automatic integration, the integration process was checked manually for each compound in each sample extract. Then a threshold for peak area was established to remove noise integration. This threshold was optimized for each instrument individually. Next, for each compound, in each sample extract, the ion ratio, the RT and, in case an isotopically labelled internal standard was available (63% of the compounds), the relative retention time (RRT) was calculated. This resulted in a maximum of 12 420 data points per UHPLC-MS system (115 compounds x 6 matrices x 6 batches x 3 concentration levels). For each compound, the average of all ion ratios within a single type of extract was used as the reference ion ratio. Based on this, the ion ratio deviation (expressed as a relative value) for all individual samples of the same extract type were calculated. The same was done for RT, but these deviations were expressed both as absolute and relative values. Subsequently, the deviation of the ion ratio was studied by plotting these against the ion ratio itself and the peak area of the least abundant product ion (also called the qualifier). Then, it was determined what percentage of the results complied with an ion ratio criterion of 20, 25, 30, 40, and 50%. Furthermore, the absolute and relative deviations of the RT were plotted against the RT.
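The evaluation described above (a reference value taken as the mean within one extract type, relative deviations from that reference, and the percentage of results meeting a given tolerance) can be sketched as follows. This is an illustrative outline only: the function names and the ion ratio values are hypothetical and are not taken from the study's data set.

```python
from statistics import mean


def relative_deviations(values):
    """Relative deviation (%) of each value from the mean of the group."""
    reference = mean(values)
    return [100.0 * (v - reference) / reference for v in values]


def absolute_deviations(values):
    """Absolute deviation of each value from the mean of the group."""
    reference = mean(values)
    return [v - reference for v in values]


def compliance(deviations, tolerance):
    """Percentage of deviations whose magnitude is within the tolerance."""
    within = sum(1 for d in deviations if abs(d) <= tolerance)
    return 100.0 * within / len(deviations)


# Hypothetical ion ratios for one compound in one type of extract.
ion_ratios = [0.40, 0.44, 0.36, 0.58, 0.22]
ratio_dev = relative_deviations(ion_ratios)

# Ion ratio criteria evaluated in the text: 20, 25, 30, 40, and 50 % (relative).
for tol in (20, 25, 30, 40, 50):
    print(tol, compliance(ratio_dev, tol))

# Hypothetical retention times (min); deviations expressed in minutes.
retention_times = [2.50, 2.52, 2.48, 2.55, 2.45]
rt_dev = absolute_deviations(retention_times)
print(compliance(rt_dev, 0.1))
```

The same `compliance` helper serves both criteria because it only compares magnitudes with a tolerance; whether that tolerance is relative (ion ratio, or RT in %) or absolute (RT in minutes) is decided by which deviation list is passed in.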

Collaborative study

Sample extracts in collaborative study

For the collaborative study, three types of sample extracts were prepared: liver and muscle using a QuEChERS (Quick, Easy, Cheap, Effective, Rugged, and Safe) sample clean-up procedure and urine using an SPE clean-up. The liver extracts are further referred to as matrix X, the muscle extracts as matrix Y and the urine extracts as matrix Z. For each type of matrix, 12 mL of extract was prepared in quadruplicate using material originating from different calves/cows. Each of the four extracts of the three matrices was fortified to produce one reference extract, containing all banned compounds at the equivalent of 3 μg/kg in sample and registered compounds at 50 μg/kg, and three unknown samples containing a random selection of 18 up to 47 compounds (Table 2) at levels ranging from 2 to 200 μg/kg. Because steroidal compounds were found to be most challenging in the initial study, some additional steroidal compounds were added to the collaborative trial. Furthermore, additional compounds having low m/z values (e.g. nitroimidazoles) were included. Each of the four final extracts (reference and three unknown) was split into twenty 600 μL aliquots. Twenty sample sets were prepared, each containing 12 sample extracts (four of each matrix), a low standard in solvent at 0.03 and 0.5 μg/mL (banned and registered compounds, respectively), and a high standard in solvent at 0.3 and 5 μg/mL (banned and registered compounds, respectively) to be used for instrument optimization.

Liver and muscle, QuEChERS (X and Y) in collaborative study

Fifty-two aliquots of both porcine muscle and bovine liver (10 g) were transferred into polypropylene (PP) centrifuge tubes and extracted using 10 mL acetonitrile. MgSO4 (4 g) and NaCl (1 g) were added to induce phase separation. After centrifugation (3500 g, 10 min) the supernatants were decanted into 50 mL PP tubes containing 1.5 g MgSO4 and 0.5 g C18. After shaking, the extracts were again centrifuged (3500 g, 10 min). The cleaned supernatants were decanted into 14 mL PP tubes and evaporated (50 °C, N2). The residues were dissolved in 100 μL methanol and 800 μL water. Thirteen extracts of the same matrix were combined to produce four batches. Standard solutions were then added to these batches to obtain the spiked sample extract as presented in Table 2. The final extracts contained 10 g/mL muscle or liver matrix.

Urine, SPE (Z)

Fifty-two aliquots of bovine urine (10 mL) were transferred into 50 mL PP centrifuge tubes and 50 μL β-glucuronidase/arylsulfatase was added. The samples were then incubated overnight at 37 °C. McIlvaine buffer containing 0.1 M EDTA (10 mL) was added and after centrifugation the supernatant was transferred onto a conditioned Phenomenex Strata-X RP SPE cartridge (200 mg). After washing with 3 mL of water the cartridges were eluted with 5 mL methanol. The solvent was evaporated (50 °C, N2) and subsequently the residues were dissolved in 100 μL methanol and 800 μL water. Thirteen extracts were combined to produce four batches. Standard solutions were added to these batches to obtain the spiked sample extract as presented in Table 2. The final extracts contained 10 mL/mL urine matrix.

Instrumental analysis in collaborative study

Seven out of ten laboratories that volunteered to participate in the collaborative study to verify the ion ratio and RT criteria reported results. They all used their own instrumentation and settings, thus the study covered a broad range of (U)HPLC conditions, analytical columns, and MS-instrumentation (Table 1). For each type of extract it was directed that all sample extracts should be injected only once and the reference extracts, containing all compounds, twice: before and after the unknown sample extracts.

Table 2. Extract composition in the collaborative study samples.

Sample columns (μg/kg): Reference, Liver A, Liver B, Liver C, Muscle A, Muscle B, Muscle C, Urine A, Urine B, Urine C

Compound class | Compounds | Reference (μg/kg)
Anthelmintics | Levamisole, Thiabendazole, Hydroxythiabendazole, Morantel, Pyrantel | 50 each
Corticosteroids | Prednisone, Prednisolone, Isoflupredone, Cortisone, Flumethasone, Beclomethasone, Clobetasol | 3 each
Macrolides | Gamithromycin, Josamycin, Lincomycin, Pirlimycine, Tiamulin, Tulathromycin, Tylosine, Valnemulin | 50 each
Nitroimidazoles | Ronidazole, Dimetridazole, Metronidazole, Hydroxymetronidazole, Ipronidazole, Hydroxyipronidazole | 3 each
NSAIDs | Carprofen, Diclofenac, Fenbufen, Firocoxib, Flunixin, Indoprofen, Ketoprofen, Meclofenamic acid, Mefenamic acid, Meloxicam, Naproxen, Niflumic acid, Piroxicam, Propyphenazone, Tolfenamic acid, Tolmetin | 50 each
Quinolones | Ciprofloxacine, Difloxacine, Enrofloxacine, Flumequine, Marbofloxacin, Nalidixic acid, Norfloxacine, Oxolinic acid, Sarafloxacine | 50 each
β-agonists | Bromchlorbuterol, Broombuterol, Cimaterol, Clenbuterol | 3 each

100 50

30

50 100

75

50 50 2 5 3

3 2 30

3 2 10 3 2

5 150 100

40 90 40

100 70 40

2

4 5 3

5 7 3 50 40

40 80

3

50

50 50

3

100

40

50

40 50 50

100 3 7 3 5

6 10

80 60

40

50

75

50 100

75

40 40

90 50 40 80

3

30

80

50 50

60 50 50

50

40 40 50

40

100 40

75 50

0.4 50

40

50 50 40

50 50

80

40

40

40 40 40

60 4 9 10

60 50 40

50 40

40 70 50

5 3 9

8

50

40 75 80 80 50 50

40

2

2

90

3

2 6 6

120 50 5

50

50 50 2

3

150 40

40

100

40 50

10

3

5

80

50 100

2

3 6 5

5

3 3

5

40

5 6


Table 2. (Continued)

Sample columns (μg/kg): Reference, Liver A, Liver B, Liver C, Muscle A, Muscle B, Muscle C, Urine A, Urine B, Urine C

Compound class | Compounds | Reference (μg/kg)
β-agonists (continued) | Clencyclohexerol, Clenhexyl, Clenpenterol, Fenoterol, Mabuterol, Procaterol, Ractopamine, Ritodrine, Salbutamol | 3 each
Steroids | 17α-Methyltestosterone, 17β-Nortestosterone, 17α-Nortestosterone, 17β-Trenbolone, 17α-Trenbolone, Methylboldenone, 1,4-androstadiene-3,17-dione (ADD), Clostebol, Stanozolol, 16β-hydroxystanozolol, 4-chloro-4-androst-3,17-dione (CLAD) | 3 each
Sulfonamides | Dapsone, Sulfachloropyridazine, Sulfadiazine, Sulfadimethoxine, Sulfadimidine, Sulfadoxine, Sulfamerazine, Sulfamethizole, Sulfamethoxazole, Sulfapyridine, Sulfaquinoxaline, Sulfathiazole, Sulfisoxazole | 50 each
Tetracyclines | Chloortetracycline, Doxycycline, Oxytetracyclin, Tetracycline | 50 each
Tranquilizers | Xylazine, Carazolol, Azaperone, Azaperol, Haloperidol, Chlorpromazine, Acetopromazine, Propionylpropazine | 3 each

Data analysis in collaborative study


On the basis of the reference sample injections, the compounds showing inadequate performance (one or both ion transitions not sufficiently detected) were removed from the dataset. For each detected compound, the average ion ratio and RT obtained from the two injections of the reference sample extract was considered as the reference value. The ion ratio and RT of the compounds present in the unknown samples were compared with these reference values to obtain the ion ratio and RT deviation. Data from


8

3 12 6 2.5 5

10 3

4 8 5 3

2.5 3

4

3 6

3

6 5

10

10 5

3 10 5 5

5 6

4

3 4 5

10 3

5 10 3

90 50

150

40

120

3 6 8

8

5

120

70

50

50

10

50 100

100

100 100

60

60 120

150 40 100

140

160

60

50 80 70

120

40 100 40

8 9

50 50

50 120 100 150

130

140 90 4 3 5

40 10 5 9

80

120

100 100

60 100

40

80 5 10 9

3 8 8 4

3 4 11

90 60 80

4 9

50 100

4 3

9 8

laboratories that scored a high false negative rate (>30%) due to high ion ratio deviations were investigated in more detail. From the data, for each unknown sample, the false negative and false positive rates were calculated when setting different confirmatory criteria. The false negative rate was calculated by:

$$\mathrm{FN} = \frac{n_m}{n_p} \times 100\% \qquad (1)$$

in which FN is the false negative rate, $n_m$ the number of compounds missed (not detected or of which the identity was not confirmed), and $n_p$ the number of compounds truly present in the sample. Besides calculating the overall false negative rate, it was determined if a false negative finding was due to the compound not being detected, due to exceeding the ion ratio criterion or due to exceeding the RT tolerance. The false positive rate was calculated by:

$$\mathrm{FP} = \frac{n_d}{n - n_p} \times 100\% \qquad (2)$$

in which FP is the false positive rate, $n_d$ the number of compounds detected (and of which the identity was confirmed) but not present in the sample, $n$ the total number of compounds in the study, and $n_p$ the number of compounds truly present in the sample. Only if the peak area of the most abundant diagnostic ion of a falsely detected compound was above 1% of the signal of the corresponding compound in the reference extract was it considered to be a true false positive. Hereby, unjustified assignments of false positives due to carry-over effects were excluded. Such effects were especially prominent for quinolone antibiotics.
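Equations (1) and (2), together with the 1% carry-over threshold, translate directly into a few lines of code. The sketch below is an illustration under stated assumptions: the function names and the example counts and peak areas are hypothetical and do not come from the collaborative study.

```python
def false_negative_rate(n_missed, n_present):
    """Equation (1): FN = n_m / n_p * 100%."""
    return 100.0 * n_missed / n_present


def false_positive_rate(n_false_detected, n_total, n_present):
    """Equation (2): FP = n_d / (n - n_p) * 100%."""
    return 100.0 * n_false_detected / (n_total - n_present)


def is_true_false_positive(peak_area, reference_area, threshold=0.01):
    """A false detection counts only if the most abundant diagnostic ion
    exceeds 1% of the corresponding signal in the reference extract."""
    return peak_area > threshold * reference_area


# Hypothetical unknown sample: 120 compounds in the study, 20 truly present,
# 2 of them missed, and 1 absent compound confirmed by mistake.
fn = false_negative_rate(n_missed=2, n_present=20)
fp = false_positive_rate(n_false_detected=1, n_total=120, n_present=20)
print(fn, fp)

# Hypothetical carry-over signal: 500 counts against 100 000 in the reference.
print(is_true_false_positive(500, 100_000))
```

Note that the denominator of the false positive rate is the number of compounds that could have been falsely reported (those not present in the sample), not the total number of compounds in the study.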

Results and discussion

Representative chromatograms from the study, to illustrate the observations discussed further on in this section, are presented in Figure 1. Figure 1a shows the ion transitions of niflumic acid in a muscle extract cleaned with solvent extraction only (A). It is observed that the detector is saturated in System II, resulting in peak cut-off, and that a sub-optimal peak shape is obtained using micro-LC due to its relatively high injection volume in relation to the organic fraction in the injected extract (System III). Figure 1b shows the ion transitions for methylamino antipyrine (MAA) in a urine extract after SPE clean-up (D). Here it is demonstrated that the most polar compounds do not have sufficient retention in System III. This is likely caused by the use of a relatively large injection volume compared to the column dimensions of the iKey and the composition of the injected extract, which contains more organic solvent compared to the starting composition of the mobile phase. Figure 1c shows the ion transitions of mabuterol in a representative milk extract (E). Here it is noted that micro-LC results in the best signal to noise ratio and clean chromatograms.

Figure 1. Representative chromatograms of two ion transitions of (a) niflumic acid in muscle extract cleaned with solvent extraction only at medium concentration level, (b) methylamino antipyrine in urine extract cleaned with SPE at high concentration level, and (c) mabuterol in milk extract at low concentration level for Systems I–IV, with reference to Table 1.

Retention time

Most legislation on performance characteristics expresses the RT criterion as a relative value.[1,2,5–9] Some of these documents indicate both a relative and an absolute criterion, of which the largest is to be used in practice.[5,7,8] In these cases the absolute RT criterion ranges from 3 s[7] to 24 s.[5] In the latest revision of the pesticides guidelines, SANCO/12571/2013,[4] the RT criterion was revised from 2.5% relative to 0.2 min absolute based on a large dataset. Some publications are available showing the variation of the RT for veterinary drugs using LC analysis in practice. Dasenaki et al.[17] showed that, in a validation of veterinary drugs and pharmaceuticals in fish and milk using UHPLC with gradient elution, the RT deviation is below 0.2 min for all veterinary drugs included, of which the vast majority is well below 0.1 min. Wei et al.[18] presented a multi-class method showing RT deviations below 0.1 min using regular HPLC with gradient elution.

Initial study

The results of the absolute RT deviation of Systems I through IV are presented in Figure 2. For each system the number of data points included in the evaluation is shown, including all compounds in all matrices, all batches and at all three concentration levels showing sufficient signal intensity above the established threshold. This number is different for each instrument, because detectability (the combination of sensitivity, matrix interference and selectivity) differs due to the use of different instrumentation, LC conditions, injection volumes, etc.

Some deviating results were observed. For instance, using System I, erythromycin in one of the urine extracts (at all three levels) showed an RT approximately 0.2 min below the average, whereas another batch of urine resulted in an RT well below the average. As discussed, urine was considered the most challenging matrix, which is supported by these observations. Also for System II, erythromycin shows some deviating results; again these are related to the urine extracts. For System III the most polar compounds are not retained as a result of a relatively large injection volume compared to the column dimensions and the composition of the injected extract. These sub-optimal conditions most likely also caused the larger variation in the retention times.

In most legislation on performance criteria for confirmatory analysis, a relation is suggested between the RT deviation and the RT itself: the criterion for RT is expressed as a relative value and thus a larger absolute RT difference is allowed for late eluting compounds compared to early eluting ones.[1,2,5–9] On a theoretical basis, when isocratic elution is applied, this is indeed expected. However, nowadays, gradient elution is common practice and therefore this was applied in the current study. As can be observed from Figure 2, a relation as suggested by 2002/657/EC[1] is not apparent when applying gradient elution. On the contrary, high RT deviations are mainly observed for early eluting compounds, indicating that an absolute criterion for RT deviation is more realistic when using gradient elution. Because the use of gradient elution is common practice today, revision of the RT criterion is justified.

For each system it was calculated what percentage of data points complied with an RT tolerance of 5%, as is currently applied, and with an absolute criterion of 0.1 and 0.2 min. The results are presented in Table 3. No clear differences in RT deviations are observed between fully porous particles (Systems II and IV) and core shell particles (System I). Systems III and IV show some RT deviations above the currently established RT criterion of 5%; these were all related to early eluting compounds. If an absolute RT tolerance of 0.1 min is applied, all systems show performance well above 95% compliance, except for System III. Note that, in System III, some compounds did not show sufficient retention and eluted in or just after the void volume (Figure 1b), which can be attributed to the large injection volume relative to the column dimensions. If an RT tolerance of 0.2 min is used, almost all compounds show compliance, even in System III. Because the latter are results from sub-optimal HPLC analysis, an RT tolerance of 0.1 min was used as the target in the collaborative study.

Figure 2. Absolute retention time deviations for all compounds in all matrices in all batches and at three different concentration levels as observed for System I (n = 9695), System II (n = 8403), System III (n = 9280), and System IV (n = 11643), with reference to Table 2.

Table 3. Percentage of data points that comply with a certain retention time and relative retention time tolerance.

System   Retention time tolerance       Relative retention time tolerance
         5%      0.2 min   0.1 min      2.5%    2.0%    1.0%
I        99.9    100       99.9         100     100     99.9
II       100     100       99.9         99.9    99.9    99.8
III      98.6    99.3      94.8         93.4    92.3    87.1
IV       98.5    99.9      99.6         98.1    97.7    95.7

Collaborative study

Seven out of ten laboratories reported results. With the inclusion of data from seven different laboratories, located all over the world, there is a wide variation in the equipment and settings used in the collaborative study, which included various major instrument brands available on the market today (Table 1). The data are presented in Figure 3. The data show similar RT deviations throughout the whole RT range, and the RT deviations are somewhat larger compared to the initial study (Figure 2).

The evaluation in terms of false positive and false negative findings as a result of the RT criterion in the collaborative study, using different RT tolerances, is presented in Table 4. It is observed that the false positive rate only decreases a little when a more stringent RT tolerance is applied. Note that some participants used SRM time windows during acquisition and that, during the integration process, a time window (usually 0.5–1 min) is generally used for peak detection. Therefore, the results for applying 'no RT criterion' relate to an RT tolerance equal to the data acquisition window or the integration window width applied by the participants. This results in an underestimation of the true false positive rate for omitting an RT criterion.

The false negative rate clearly decreases with a less stringent RT deviation tolerance. When applying an RT tolerance of 5%, as is currently established in 2002/657/EC,[1] the false negative rate almost equals that obtained with an absolute criterion of 0.2 min. Using an RT tolerance of 0.1 min, the false negative rate is at least doubled. Even though the RT criterion only has a minor effect on the prevention of false positive results, it is suggested to apply an RT criterion in confirmatory analysis for which the overall false negative rate is < 1%. This yields an absolute RT tolerance of 0.2 min, which is somewhat larger than deduced from the initial study. This is explained by the larger variation of instruments and conditions in the collaborative study. Note that a criterion of ±0.2 min matches previously published results on RT reproducibility[17,18] and is in line with the revised pesticide guidelines.[4]

Figure 3. Absolute retention time deviations versus the retention time for all compounds in all matrices reported by all participants in the collaborative study (n = 1452).

Table 4. False positive and false negative findings due to RT deviations in the collaborative study using different RT deviation criteria.

RT deviation   False positives (%)                 False negatives (%)
criterion      Liver   Muscle  Urine   Overall     Liver   Muscle  Urine   Overall
5%             7.6     6.6     5.9     6.6         0.5     0.8     3.2     1.3
0.2 min        7.1     6.2     7.3     6.9         0.3     0.5     2.1     0.9
0.1 min        6.0     5.3     6.0     5.8         0.8     1.3     5.7     2.3
none(a)        8.2     7.6     6.3     7.3         0       0       0       0

(a) The retention time criterion is equal to the applied data acquisition and integration window, which is different for each of the participating laboratories.
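The difference between a relative and an absolute RT criterion can be made concrete with a minimal Python sketch. The function name and the retention times below are hypothetical and chosen only to illustrate the effect described in this section: a relative tolerance is strict for early eluting compounds and lenient for late eluting ones.

```python
def rt_complies(rt_sample, rt_reference, tolerance, relative=False):
    """Return True if the RT deviation is within the tolerance.

    tolerance is given in minutes when relative is False, or as a
    fraction of the reference RT (0.05 for 5%) when relative is True.
    """
    deviation = abs(rt_sample - rt_reference)
    limit = tolerance * rt_reference if relative else tolerance
    return deviation <= limit


# Hypothetical early and late eluting compounds (minutes).
early = (1.08, 1.00)    # deviation of 0.08 min at RT 1 min
late = (10.40, 10.00)   # deviation of 0.40 min at RT 10 min

early_relative = rt_complies(*early, tolerance=0.05, relative=True)
early_absolute = rt_complies(*early, tolerance=0.2)
late_relative = rt_complies(*late, tolerance=0.05, relative=True)
late_absolute = rt_complies(*late, tolerance=0.2)
```

In this example the early eluting compound fails the 5% relative criterion but passes the 0.2 min absolute criterion, whereas the late eluting compound passes the relative criterion with a deviation twice as large as the absolute criterion would allow.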

Relative retention time versus an internal standard

Initial study

The results of the RRT deviation of all systems included are presented in Figure 4. Note that when isotopically labelled internal standards were used, as was the case for 70% of the compounds, the calculated RRT should always be around 1.0, and thus the relative and absolute RRT deviation are approximately equal. Only when non-isotopically labelled internal standards are used will an RRT well below or above 1.0 be observed. From Figure 4 it is observed that, as for the RT, deviations in RRT are especially observed for early eluting compounds.
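A short sketch, assuming the usual definition of the RRT as the analyte RT divided by the internal standard RT, shows why the relative and absolute RRT deviations nearly coincide for co-eluting isotopically labelled internal standards. The function names and retention times are hypothetical.

```python
def relative_retention_time(rt_analyte, rt_internal_standard):
    """RRT, assumed here to be the analyte RT divided by the internal standard RT."""
    return rt_analyte / rt_internal_standard


def rrt_deviations(rrt_sample, rrt_reference):
    """Return (absolute deviation, relative deviation in percent)."""
    absolute = abs(rrt_sample - rrt_reference)
    relative = 100.0 * absolute / rrt_reference
    return absolute, relative


# Hypothetical analyte with a co-eluting labelled internal standard.
rrt_ref = relative_retention_time(5.00, 5.00)     # reference extract
rrt_smp = relative_retention_time(5.02, 5.00)     # sample extract
abs_dev, rel_dev = rrt_deviations(rrt_smp, rrt_ref)
```

Because the reference RRT is 1.0 here, the absolute deviation (about 0.004) and the relative deviation (about 0.4%) differ only by the factor of 100 used to express a percentage.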

For each system it was calculated what percentage of data points complied with a relative RRT tolerance of 2.5, 2 and 1%. The results are presented in Table 3. System IV and especially System III show some RRT deviations above the currently established RRT criterion of ±2.5%. If an RRT deviation tolerance of 1% is applied, all systems show performance above 95% compliance, except for System III. Note that, in System III, some compounds did not show sufficient retention and eluted at or just after the void volume. If an RRT tolerance of 2% is used, compliance increases markedly. Only a minor further improvement is observed if a tolerance of 2.5% is applied. Because the results of System III are considered to be obtained from sub-optimal HPLC analysis, an RRT tolerance of 2% could be proposed as a suitable criterion. However, for simplification of the regulations, it is advised to discard the criterion for relative retention time, because chromatographic separations are nowadays highly reproducible and robust. It is suggested that a single criterion for RT suffices.

No internal standards were included in the collaborative study and therefore no data on the RRT are available from it.

Figure 4. Relative deviation of the relative retention time versus the relative retention time of System I (n = 9695), System II (n = 8403), System III (n = 9280), and System IV (n = 11643), with reference to Table 2. Note the different scale for System III.

Ion ratio

Some legislation on performance characteristics indicates a fixed criterion for the ion ratio deviation,[2,6–9] of which some documents include both an absolute and a relative criterion, applying whichever is largest.[6,8] In others, among which is CD 2002/657/EC,[1] the tolerance of the ion ratio depends on the value of the reference ion ratio.[1,4,5] These criteria range from 10%[2] relative up to 50% relative.[1,2]

Few studies have been published on the variation in the ion ratio observed when using multi-class or multi-compound residue methods. Schneider et al.[19] developed a multi-residue method and showed that it is difficult for all compounds to comply with the current qualitative performance criteria, especially at the lower concentration range. This is in agreement with Wei et al., who reported variations in ion ratios up to 26% (relative standard deviation for a compound yielding an ion ratio of 65%) in a multi-class residue method.[18] This is an absolute ion ratio deviation of approximately 50% (95% confidence interval). Next, Mol et al.[10] demonstrated that a higher ion ratio tolerance compared to legislation reduced the false negative rate without an increase in false positive rate and, furthermore, Kaufmann et al.[20] demonstrated that product ions should originate from the same precursor ion to avoid asymmetrical signal suppression and, with this, deviating ion ratios.
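The two styles of criterion described above, a tolerance that depends on the reference ion ratio and a single fixed tolerance, can be sketched as follows. The tiers in `tiered_tolerance` are the LC-MS/MS values of CD 2002/657/EC as commonly cited and should be checked against the official text; the function names and example values are hypothetical.

```python
def ion_ratio(area_a, area_b):
    """Ion ratio in percent: less abundant product ion over most abundant one."""
    low, high = sorted((area_a, area_b))
    if high <= 0:
        raise ValueError("peak areas must be positive")
    return 100.0 * low / high


def relative_deviation(sample_ratio, reference_ratio):
    """Relative ion ratio deviation in percent of the reference ratio."""
    return 100.0 * abs(sample_ratio - reference_ratio) / reference_ratio


def tiered_tolerance(reference_ratio):
    """Assumed tiers (percent, relative) depending on the reference ion ratio."""
    if reference_ratio > 50:
        return 20.0
    if reference_ratio > 20:
        return 25.0
    if reference_ratio > 10:
        return 30.0
    return 50.0


def ion_ratio_complies(sample_ratio, reference_ratio, fixed_tolerance=None):
    """Use the fixed tolerance if given, otherwise the ratio-dependent tiers."""
    if fixed_tolerance is None:
        tolerance = tiered_tolerance(reference_ratio)
    else:
        tolerance = fixed_tolerance
    return relative_deviation(sample_ratio, reference_ratio) <= tolerance


# Hypothetical case: reference ion ratio 65%, sample ion ratio 45%.
reference = ion_ratio(650.0, 1000.0)
tiered_result = ion_ratio_complies(45.0, reference)
fixed_result = ion_ratio_complies(45.0, reference, fixed_tolerance=50.0)
```

With a relative deviation of about 31%, this hypothetical sample fails the ratio-dependent tolerance of 20% but would pass a fixed tolerance of 50%.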

Initial study

The results of the ion ratio deviation are presented in Figure 5. For System II, a high number of ion ratio deviations were observed. This was caused by saturation of the mass spectrometric detector (as a result of the analogue-to-digital convertor being at full scale), mainly for the signal of the highest product ion at the medium and high concentration level (Figure 1a). This observation only applied to System II, which is explained by the relatively limited linear dynamic range of the ion detector. As a result, the resulting chromatographic peaks are cut off and ion ratios are different for the three concentration levels, leading to high ion ratio deviations for both the lowest and highest concentration level. A correction for this effect was carried out by using the average ion ratio at each concentration level and so calculating the ion ratio deviation within a single concentration level. The corresponding results are presented.

In CD 2002/657/EC[1] a relation is suggested between the ion ratio deviation and the ion ratio itself. In that document, for ion ratios above 50% a 20% tolerance is established, whereas for lower ion ratios the tolerance is increased. From Figure 5a it is observed that such a relation is not apparent: ion ratio deviations are independent of the ion ratio itself. However, looking at Figure 5b, a clear relation is observed between the deviation of the ion ratio and the peak area of the qualifier, which is in accordance with expectations. High ion ratio deviations are mainly observed in cases where the second product ion's peak area is small. This is a direct result of the increased uncertainty in ion numbers close to the noise.

For each system it was calculated what percentage of all ion ratios measured (including all detected compounds in all matrices and at all concentration levels) complied with ion ratio criteria ranging from 20 to 50% (Table 5). Note that for System II the data corrected for detector saturation were used. With the exception of System III, the systems show similar results. System III shows higher ion ratio deviations because the analytical separation on the iKey was sub-optimal and resulted in significantly lower signal-to-noise ratios for a large number of compounds compared to the other systems. As the ion ratio deviation is correlated to the signal of the lowest product ion, this results in higher ion ratio deviations.

If the same evaluation is carried out for the banned compounds only (present at lower concentration levels), the percentage of compliance decreases. For example, for System I, the percentage of registered compounds detected that comply with an ion ratio criterion of 50% is 98.8% and for an ion ratio tolerance of 30% this is 96.3%. If only banned compounds are considered, the compliance with an ion ratio tolerance of 50% is 97.3% and for an ion ratio tolerance of 30% this is 92.6%. This clearly demonstrates that, to prevent false negative results for banned substances, usually present at sub-ppb levels, a less stringent ion ratio tolerance would be desirable. Of course this impacts the false positive rate, which is studied more closely in the collaborative study described below.

If the obtained peak area for both product ions is sufficiently high, almost 95% of the compounds show compliance when an ion ratio tolerance of 30% (independent of the ion ratio itself) is applied. This is in agreement with results reported in an earlier study on pesticides in fruit and vegetables.[10] However, to prevent false negatives for compounds having a poor detectability, a less stringent criterion seems to be mandatory. Note that the study on pesticides included compounds at a concentration level of 10–200 μg/kg, whereas banned substances are present at a lower level (0.5–5.0 μg/kg) in the current study.

Figure 5. Ion ratio deviations for all compounds in all matrices, in all batches and at three concentration levels versus (a) the ion ratio and (b) the qualifier's response for System I (n = 9695), System II (n = 8403), System III (n = 9280), and System IV (n = 11643), with reference to Table 1.

Table 5. Percentage of all ion ratios measured that comply with a certain ion ratio criterion.

System   Ion ratio criterion (%)
         20      25      30      40      50
I        91.5    94.4    96.3    98.0    98.8
II       90.6    93.4    95.4    97.8    98.9
III      77.6    82.6    86.0    90.2    92.8
IV       92.8    94.7    95.7    97.1    97.8

Collaborative study

Two of the seven laboratories that reported results reported a high false negative rate due to high deviations of the ion ratio. The first laboratory reported 48% false negatives based on an allowed maximum ion ratio deviation of 30%. This is explained by the fact that the reference samples and unknown samples were not analysed within a single run. A second laboratory scored 52% false negatives due to ion ratio deviations, but no reason for the high variation in ion ratios was found. Apparently, the system or settings used did not result in trustworthy and reproducible results for the ion ratio of compounds detected in the sample. Due to these unreliable results for the false negative rate, the results for false positives were also considered unreliable and thus the complete data sets of both laboratories were omitted from the evaluation of the ion ratio criteria.

The remaining data for the ion ratio evaluation are graphically presented in Figure 6. With the inclusion of data from five different laboratories located all over the world, the variation in equipment and settings (including instrumentation from four different vendors) is considered representative for the field. The total number of compounds detected in the extracts was 382 for the liver extracts (72% of the spiked compounds), 378 for the muscle extracts (71% of the spiked compounds) and 278 for the urine extracts (83% of the spiked compounds), making a total of 1038 compounds (75% of the total). Note that these numbers are highly biased by a single laboratory that included only 25% of the compounds in its analysis, meaning that most laboratories detected over 95% of the spiked compounds correctly. From the data, similar trends as in the initial study were observed: the ion ratio deviation depends on the peak area of the qualifier ion rather than on the ion ratio itself.

The overall false positive and false negative rate was calculated for each of the individual matrices (X–Z) and, subsequently, the false negative rate as a result of a deviating ion ratio was calculated (Table 6). It is apparent from the data that the urine extracts were the most challenging: these extracts show the highest false positive and false negative rate. This is most likely caused by severe matrix effects in these sample extracts as a result of the hydrolysis procedure. This clearly demonstrates the necessity of a fit-for-purpose sample clean-up procedure, which of course depends on the target matrix, compounds and concentration.

First, it is observed that the false positive rate increases (approximately doubles) if the ion ratio is not included as a criterion in confirmatory analysis. This demonstrates that the ion ratio is an orthogonal parameter in confirmatory analysis that enhances the power of the confirmation of the identity of a compound. Therefore, this criterion should not be omitted, so as to prevent false positive results. Second, the false negative rate clearly decreases with a higher ion ratio deviation tolerance, as expected. Surprisingly, the highest false negative rate is observed for the muscle extracts. This is mainly caused by the false detection of doxycycline, tetracycline, chlortetracycline and meloxicam, which all involve the monitoring of non-selective ion transitions.

From the initial study an ion ratio tolerance of 30% seemed adequate, noting that a wider tolerance could be necessary for adequate confirmation of veterinary drugs present at sub-ppb levels, especially the compounds with low detectability (low ionization efficiency, prone to matrix influences, etc.). In contrast to the initial study, from the collaborative study it is observed that, when using a 30% ion ratio tolerance, the false negative rate as a result of this criterion exceeds 5%. This is likely caused by the higher variation in instruments and settings in the collaborative study. When applying an ion ratio criterion of 40%, the overall false negative rate drops below 5% (to 3.4%), and when applying a criterion of 50% it is 2.9%. The false positive rate, especially for the cleaner extracts, remains equal or only slightly increases when the ion ratio criterion is increased from 30 to 50%. As discussed above, especially for the lower concentration levels, an ion ratio tolerance of 50% is needed to prevent false negative results. A uniform criterion, unrelated to the detectability of the compound, is practical in routine applications and, therefore, a fixed ion ratio criterion of 50% is proposed. This is higher than the criterion suggested by Mol et al.,[10] which might be explained by:

1 The fact that many pesticides show more selective ion transitions compared to certain veterinary drugs, which in many cases only contain carbon, hydrogen, oxygen and nitrogen atoms.
2 The fact that some veterinary drugs, for example steroids, are structurally similar to naturally occurring compounds, which may cause an increase in background noise and in the probability of an interfering signal.
3 The fact that banned substances at very low concentration levels in samples were included in the current study.

Figure 6. Overview of ion ratio deviation versus (a) the ion ratio and (b) the peak area of the second product ion for all compounds in all matrices reported by the participants in the collaborative study (n = 1038).

Table 6. Overall false positive rate and false negative rate as a result of ion ratio deviations in the collaborative study.

Ion ratio      Overall false positive rate (%)     False negative rate (%)
deviation      Liver   Muscle  Urine   Overall     Liver       Muscle      Urine       Overall
criterion                                          (n = 382)   (n = 378)   (n = 278)   (n = 1047)
≤30%           1.5     1.7     1.8     1.9         3.7         8.0         5.0         5.6
≤40%           1.7     1.7     2.3     1.9         1.3         5.9         2.9         3.4
≤50%           1.7     1.7     2.7     2.1         1.0         5.2         2.2         2.9
none           3.1     2.9     5.5     3.9         0           0           0           0

Non-selective ion transitions

Most false positives (41%) are observed for doxycycline, niflumic acid, chlortetracycline, methylboldenone, and prednisolone. Considering the ion transitions, 70% of the false positives observed (originating from 15 different compounds) are related to monitoring product ions of m/z 121 or neutral losses of -17 and -18 Da. These product ions and neutral losses were also indicated as being non-selective.[13] The product ion m/z 91 was also previously indicated as non-selective.[13] Note that, in this study, ion transitions resulting in a product ion of m/z 91 were avoided for that reason. For 15 other compounds that included these non-selective transitions, no false positives were observed. If the compounds having non-selective transitions are deleted from the data set, the false positive rate (applying a 50% ion ratio and 0.2 min RT tolerance) drops from 2.1% (Table 6) to 0.8%. This clearly indicates that ion transition selection is a critical step and that non-selective ion transitions should be avoided. The false negative rate as a result of ion ratio deviations is affected to a lesser extent (dropping from 2.9% to 2.0%). It is concluded that this empirical study supports the data obtained in the theoretical study[13] and therefore, to avoid false positive findings, it is proposed that product ions at m/z 91, 105 and 121 and neutral losses of -17 and -18 Da should not be used.
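As a sketch of how the recommendation on non-selective transitions could be screened for during method set-up, the following Python function flags an SRM transition that uses one of the listed product ions or neutral losses. The function name, the use of nominal (integer-rounded) masses and the example transitions are assumptions made for illustration only.

```python
NON_SELECTIVE_PRODUCT_IONS = {91, 105, 121}   # nominal m/z values listed above
NON_SELECTIVE_NEUTRAL_LOSSES = {17, 18}       # nominal losses in Da


def is_non_selective(precursor_mz, product_mz):
    """Flag a transition with a listed product ion or neutral loss.

    Masses are compared at nominal (integer-rounded) resolution, which is
    an assumption of this sketch rather than part of the study.
    """
    product_nominal = round(product_mz)
    neutral_loss = round(precursor_mz) - product_nominal
    return (product_nominal in NON_SELECTIVE_PRODUCT_IONS
            or neutral_loss in NON_SELECTIVE_NEUTRAL_LOSSES)


# Hypothetical transitions (precursor m/z, product m/z).
transitions = [(445.2, 428.2), (300.1, 121.0), (300.1, 200.1)]
flagged = [t for t in transitions if is_non_selective(*t)]
```

Here the first transition is flagged for a neutral loss of 17 Da and the second for the product ion at m/z 121, while the third passes.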

Amendment of guidelines

This is a unique study focusing on the confirmatory analysis of a broad range of veterinary drugs in realistic, yet challenging matrices. Furthermore, it includes instrumentation from different vendors, and the outcomes were assessed by a collaborative study. Revision of 2002/657/EC[1] is currently under discussion and we recommend that the outcomes of this study are used as valuable input. The outcomes should also be a basis for revision of Annex C of guideline CAC/GL 71-2009.[12]

Conclusions

Based on an initial study and a collaborative study, which included laboratories from three continents, challenging sample extracts with limited sample purification, and a large variation in compounds and concentration levels, the current criteria established in 2002/657/EC[1] were critically assessed. It was concluded that this empirical study supports the data obtained in the theoretical study[13] and therefore, to avoid false positive findings, it is proposed that product ions at m/z 91, 105, and 121 and neutral losses of -17 and -18 Da should not be used for confirmatory analysis.

New confirmatory criteria are proposed. It was observed that RT shifts, when using gradient elution, as is common practice nowadays, mainly occur for early eluting compounds. Therefore an absolute RT criterion of ±0.2 min is better suited than a relative one. Furthermore, it was concluded that the ion ratio deviation is not related to the ion ratio itself, but rather to the intensity of the lowest product ion. Therefore a fixed ion ratio deviation tolerance better reflects the real-life situation. A new criterion for ion ratio deviation of 50% is proposed.

Acknowledgements

This research was financed by the UK ministerial Department for Environment, Food and Rural Affairs (Defra) within the framework of project VMD 1201 R3, Performance criteria for multi-residue analytical methods, and partly sponsored by Thermo Scientific and Waters Corporation. We thank Jack Kay, Sam Fletcher, and Emma Thompson (Veterinary Medicines Directorate, London, UK); Sara Stead (Waters Corporation, Manchester, UK); and Markus Kellmann (Thermo Scientific, Bremen, DE) for their advice. Finally, we thank all collaborators for voluntarily contributing to this study and helping to construct a unique and highly relevant dataset.

References

[1] European Commission. Commission Decision 2002/657/EC implementing Council Directive 96/23/EC concerning the performance of analytical methods and the interpretation of results. Off. J. Eur. Communities. 2002, L221, 8–36.
[2] FDA (Food and Drug Administration). Final Guidance for Industry: Mass Spectrometry for Confirmation of the Identity of Animal Drug Residues. Division of Residue Chemistry, Office of Research, Center for Veterinary Medicine, Food Drug Administration: Rockville, MD, 2003.
[3] European Commission. Commission Regulation EU/589/2014 laying down methods of sampling and analysis for the control of levels of dioxins, dioxin-like PCBs and non-dioxin-like PCBs in certain foodstuffs. Off. J. Eur. Communities L, 2014, 164, 18–40.
[4] Sanco. SANCO/12571/2013, 19 November 2013, rev. 0, Guidance document on analytical quality control and validation procedures for pesticide residues analysis in food and feed. European Commission: Brussels, 2013.
[5] WADA (The World Anti-Doping Agency). TD 2010 IDCR, Identification criteria for qualitative assays incorporating column chromatography and mass spectrometry. WADA: Montreal, Canada, 2010.
[6] IOC (International Olympic Committee). Analytical criteria for reporting low concentrations of anabolic steroids. Internal communication to IOC accredited laboratories, Lausanne, Switzerland. IOC, Geneva, 1998.
[7] United Kingdom Laboratory Guidelines for Legally Defensible Workplace Drug Testing: Urine Drug Testing. Version 1.0. European Workplace Drug Testing Society: London, 2002.
[8] AOAC (Association of Official Racing Chemists). AOAC Guidelines for the Minimum Criteria for Identification by Chromatography and Mass Spectrometry: MS Criteria Working Group - May 2011. AOAC: Rockville, MD, 2011.
[9] Society of Forensic Toxicology/American Academy of Forensic Sciences. Forensic Toxicology Laboratory Guidelines. Seattle, 2006.
[10] H. G. J. Mol, P. Zomer, M. García López, R. J. Fussell, J. Scholten, A. de Kok, A. Wolheim, M. Anastassiades, A. Lozano, A. Fernandez Alba. Identification in residue analysis based on liquid chromatography with tandem mass spectrometry: Experimental evidence to update performance criteria. Anal. Chim. Acta. 2015, 873, 1.
[11] S. J. Lehotay, Y. Sapozhnikova, H. G. J. Mol. Current issues involving screening and identification of chemical contaminants in foods by mass spectrometry. TrAC Trends Anal. Chem. 2015, 69, 62.
[12] CAC/GL 71-2009. Guidelines for the design and implementation of national regulatory food safety assurance programme associated with the use of veterinary drugs in food producing animals. Codex Alimentarius Commission: Rome, 2009.
[13] B. J. A. Berendsen, A. A. M. Stolker, M. W. F. Nielen. The (un)Certainty of selectivity in liquid chromatography tandem mass spectrometry. J. Am. Soc. Mass Spectrom. 2013, 24, 154.
[14] A. Schürmann, V. Dvorak, C. Crüzer, P. Butcher, A. Kaufmann. False-positive liquid chromatography/tandem mass spectrometric confirmation of sebuthylazine residues using the identification points system according to EU directive 2002/657/EC due to a biogenic insecticide in tarragon. Rapid Commun. Mass Spectrom. 2009, 23, 1196.
[15] F. André, K. K. G. De Wasch, H. F. De Brabander, S. R. Impens, L. A. M. Stolker, L. van Ginkel, R. W. Stephany, R. Schilt, D. Courtheyn, Y. Bonnaire, P. Fürst, P. Gowik, G. Kennedy, T. Kuhn, J.-P. Moretain, M. Sauer. Trends in the identification of organic residues and contaminants: EC regulations under revision. TrAC Trends Anal. Chem. 2001, 20, 435.
[16] B. J. A. Berendsen, L. A. M. Stolker, M. W. F. Nielen. Selectivity in the sample preparation for the analysis of drug residues in products of animal origin using LC-MS. TrAC Trends Anal. Chem. 2013, 43, 229.
[17] M. E. Dasenaki, A. A. Bletsou, G. A. Koulis, N. S. Thomaidis. Qualitative multiresidue screening method for 143 veterinary drugs and pharmaceuticals in milk and fish tissue using liquid chromatography quadrupole-time-of-flight mass spectrometry. J. Agric. Food Chem. 2015, 63, 4493.
[18] H. Wei, Y. Tao, D. Chen, S. Xie, Y. Pan, Z. Liu, L. Huang, Z. Yuan. Development and validation of a multi-residue screening method for veterinary drugs, their metabolites and pesticides in meat using liquid chromatography-tandem mass spectrometry. Food Addit. Contam. 2015, 32, 686.
[19] M. J. Schneider, S. J. Lehotay, A. R. Lightfield. Evaluation of a multi-class, multi-residue liquid chromatography – tandem mass spectrometry method for analysis of 120 veterinary drugs in bovine kidney. Drug Test. Anal. 2012, 4, 91.
[20] A. Kaufmann, M. Widmer, K. Maden. Signal suppression can bias selected reaction monitoring ratios. Implications for the confirmation of positive findings in residue testing. Rapid Commun. Mass Spectrom. 2014, 28, 899.

Supporting information

Additional supporting information may be found in the online version of this article at the publisher’s web site.

Copyright © 2016 John Wiley & Sons, Ltd.

Drug Test. Analysis 2016, 8, 477–490