Mining mouse behavior for patterns predicting psychiatric drug classification

Neri Kafkafi, Cheryl L. Mayo & Greg I. Elmer

Psychopharmacology, ISSN 0033-3158, DOI 10.1007/s00213-013-3230-6

Your article is protected by copyright and all rights are held exclusively by Springer-Verlag Berlin Heidelberg. This e-offprint is for personal use only and shall not be self-archived in electronic repositories. If you wish to self-archive your article, please use the accepted manuscript version for posting on your own website. You may further deposit the accepted manuscript version in any repository, provided it is only made publicly available 12 months after official publication or later and provided acknowledgement is given to the original source of publication and a link is inserted to the published article on Springer's website. The link must be accompanied by the following text: “The final publication is available at link.springer.com”.

ORIGINAL INVESTIGATION

Received: 11 April 2013 / Accepted: 25 July 2013
© Springer-Verlag Berlin Heidelberg 2013

Abstract

Rationale: In psychiatric drug discovery, a critical step is predicting the psychopharmacological effect and therapeutic potential of novel (or repurposed) compounds early in the development process. This process is hampered by the need to utilize multiple disorder-specific and labor-intensive behavioral assays.

Objectives: This study aims to investigate the feasibility of a single high-throughput behavioral assay to classify psychiatric drugs into multiple psychopharmacological classes.

Methods: Using Pattern Array, a procedure for data mining exploratory behavior in mice, we mined ~100,000 complex movement patterns for those that best predict psychopharmacological class and dose. The best patterns were integrated into a classification model that assigns psychopharmacological compounds to one of six clinically relevant classes: antipsychotic, antidepressant, opioid, psychotomimetic, psychomotor stimulant, and α-adrenergic.

Results: Surprisingly, only a small number of well-chosen behaviors were required for successful class prediction. One of them, a behavior termed “universal drug detector”, was dose-dependently decreased by drugs from all classes, thus providing a sensitive index of psychopharmacological activity. In independent validation in a blind fashion, simulating the process of in vivo pre-clinical drug screening, the classification model correctly classified nine out of 11 “unknown” compounds. Interestingly, even “misclassifications” match known alternate therapeutic indications, illustrating drug “repurposing” potential.

Conclusions: Unlike standard animal models, the discovered classification model can be systematically updated to improve its predictive power and add therapeutic classes and subclasses with each additional diversification of the database. Our study demonstrates the power of data mining approaches for behavior analysis, using multiple measures in parallel for drug screening and behavioral phenotyping.

Keywords: Animal model · Behavioral phenotyping · SEE · Open field · Spatial behavior

Electronic supplementary material: The online version of this article (doi:10.1007/s00213-013-3230-6) contains supplementary material, which is available to authorized users.

N. Kafkafi (✉)
Department of Zoology, Tel Aviv University, Ramat-Aviv, Tel-Aviv 69978, Israel
e-mail: [email protected]

C. L. Mayo · G. I. Elmer
Department of Psychiatry and the Maryland Psychiatric Research Center, University of Maryland School of Medicine, Baltimore, MD, USA

Introduction

There is a growing consensus that psychiatric central nervous system (CNS) drug discovery is largely failing to provide new medication alternatives (Conn and Roth 2008; Schoepp 2011). The lack of a clear neuropathology and the complexity of the emotional and cognitive disturbances make the use of behavioral animal models particularly challenging (Agid et al. 2007; Pangalos et al. 2007; Nestler and Hyman 2010). The in vivo capacity to screen novel chemical entities is further hampered by the need to utilize multiple, often labor-intensive, behavioral assays, each limited to identifying a narrow drug class or psychiatric disorder using a small set of endpoints. It seems reasonable to expect that some properties in the free movement of a mouse would indicate whether it was injected with, e.g., an antidepressant, an antipsychotic, or an opioid compound. Complex locomotor patterns have been suggested for classification of compounds in animals and humans (Geyer et al. 1986; Henry et al. 2010), but no standard behavioral assay has been shown to consistently identify and differentiate multiple psychopharmacological classes. Such an assay, especially if it is of high throughput and reliable, would be very useful as an in vivo screening procedure for identifying the therapeutic potential of novel compounds, as well as for discovering new uses for existing compounds (repurposing).

In recent years, data mining classification strategies have been successfully applied at the molecular and physiological level. For example, large gene-expression data sets are mined for profiles that are predictive of a toxicological or carcinogenic response (“class predictors”, e.g. Golub et al. 1999). This in vitro approach was used to discover gene-expression patterns that classify psychoactive drugs (Gunther et al. 2003). A similar approach for in vivo mining at the level of behavior was proposed in previous reviews (Brunner et al. 2002; Tecott and Nestler 2004). Rihel et al. (2010) recently applied this approach to construct a high-throughput behavioral assay in zebrafish, but their typical prediction success of a compound's therapeutic value was only slightly better than chance level.

The difficulty seems to be that the natural units of behavior are not as well defined and understood as those measured at the molecular and physiological levels, such as highly annotated genes in gene expression profiling. Therefore, the key to successful behavior mining is in the design of a useful “behavioral chip”, i.e., a proper categorization of the data into multiple types of behavior that can be mined. Such a categorization of “open field” behavior in mice and rats was proposed in the Pattern Array (PA) method (Kafkafi et al. 2009) by using different combinations of ten well-studied, ethologically relevant attributes (Drai and Golani 2001; Kafkafi et al. 2005; Kafkafi and Elmer 2005a, b; Benjamini et al. 2010) to define ~100,000 complex movement patterns.
PA patterns range from general, e.g., “moving near the wall of the arena”, to more specific definitions, e.g., “moving near the wall while braking hard and sharply turning away from the wall”. This last pattern was actually discovered by PA to diagnose SOD1 rats, an animal model of amyotrophic lateral sclerosis, at a much earlier age than any standard behavioral measure or even a human observer (Kafkafi et al. 2008). The large number and rich complexity of patterns used by PA increase the probability that some of them can serve as reliable predictors of any particular drug effect of interest. The most predictive and reliable behaviors can be mined in a behavior database of animals injected with drugs of known psychopharmacological class and mechanism of action, and integrated into a classification model.

As we demonstrate here, this classification model provides for the first time a single behavioral assay capable of predicting, with a high rate of success, the therapeutic application of a drug among multiple, therapeutically relevant psychopharmacological classes. This study also tests the feasibility of systematically updating and increasing the power of the classification model simply by adding compounds and classes to the PA database.
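The range from general to specific patterns described above can be pictured as a template over the ten attributes, in which some attributes are pinned to a particular bin and the rest are left free. The sketch below is only an illustration of that idea, not PA's actual implementation: the attribute positions, the bin indices, and the helper names `matches` and `specificity` are all assumptions made for the example.

```python
from typing import Optional, Sequence

# None marks an attribute that the pattern leaves free (a wildcard).
Pattern = Sequence[Optional[int]]

N_ATTRIBUTES = 10  # the text describes ten attributes per path sample


def matches(pattern: Pattern, sample_bins: Sequence[int]) -> bool:
    """Return True if the sample falls in every bin the pattern pins down."""
    if len(pattern) != N_ATTRIBUTES or len(sample_bins) != N_ATTRIBUTES:
        raise ValueError("pattern and sample must both cover all ten attributes")
    return all(p is None or p == b for p, b in zip(pattern, sample_bins))


def specificity(pattern: Pattern) -> int:
    """Number of attributes the pattern constrains (0 would match everything)."""
    return sum(p is not None for p in pattern)


# Hypothetical patterns: positions and bin indices are invented for illustration.
general = (None, None, None, None, None, None, 0, None, None, None)
specific = (None, None, 1, 2, None, 1, 0, None, None, None)

# One path sample, already reduced to a bin index per attribute (also invented).
sample = (3, 0, 1, 2, 4, 1, 0, 2, 0, 1)

print(matches(general, sample), specificity(general))
print(matches(specific, sample), specificity(specific))
```

Every sample that matches the four-attribute pattern also matches the one-attribute pattern, which is the sense in which adding attributes makes a pattern more specific.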

Materials and methods

Testing and algorithmic methods were described in detail in previous PA studies (Kafkafi et al. 2008; Kafkafi et al. 2009) and are summarized here in brief, except where differences and improvements were applied.

Animals, drugs, and experimental procedures

Drug-injected animals were all 60- to 80-day-old C57BL/6J male mice (Jackson Laboratories) acclimated to the animal facility for at least 7 days before testing. They were housed at five animals per cage in standard conditions of a 12:12 h light/dark cycle, 22 °C room temperature, and water and food ad libitum. A total of 41 drugs representing six drug classes (psychomotor stimulant, opioid, psychotomimetic, antidepressant, antipsychotic, and α-adrenergic) were investigated in this study (Table 1). Open field testing took place during the light phase of the cycle. Each animal was injected once and immediately introduced into a 2.50-m-diameter circular open-field arena where its location was video-tracked for 60 min using Noldus EthoVision®. The {time, X, Y} coordinates of the path were exported and analyzed using the standard procedure in Software for the Exploration of Exploration (SEE; see Drai and Golani 2001; Kafkafi et al. 2005). The experimental protocols followed the “Principles of Laboratory Animal Care” (NIH publication no. 86-23, 1996). The animals used in this study were maintained in facilities fully accredited by the American Association for the Accreditation of Laboratory Animal Care.

Datasets

Four datasets were used in the analysis, two of which were gathered especially for the present study, while the other two were gathered in previous studies (Table 1). Dataset I was recorded for both training and validation phases of the study of Kafkafi et al. (2009) and included animals injected with 13 drugs, each including several dose groups, belonging to the psychomotor stimulant, opioid, and psychotomimetic drug classes. Dataset II was recorded for the mining phase of the present study and included 17 additional drugs, each including several dose groups, belonging to the α-adrenergic, antidepressant, and antipsychotic drug classes. Dataset III was recorded for the validation of the classification model in the present study and included 11 additional drugs belonging to the antidepressant, antipsychotic, psychomotor stimulant, and opioid classes. Dataset IV included drug-naïve animals from ten inbred strains across three laboratories, recorded in the frame of a previous study (Kafkafi et al. 2005) and used here to estimate the heritability and replicability of identified behavioral patterns across laboratories.
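All four datasets start from the same raw material: the exported {time, X, Y} samples of each animal's path. As a rough illustration of how such samples can be turned into simple movement measures, the sketch below computes speed between consecutive samples and distance from the arena wall. It is not the SEE procedure, which is more involved and is not reproduced here. It assumes that coordinates are in centimeters with the origin at the arena center and that time is in seconds; the function names are hypothetical.

```python
import math

ARENA_RADIUS_CM = 125.0  # half of the 2.50 m arena diameter given in the text


def speeds(track):
    """Speed (cm/s) between consecutive (t, x, y) samples of one path."""
    result = []
    for (t0, x0, y0), (t1, x1, y1) in zip(track, track[1:]):
        dt = t1 - t0
        if dt <= 0:
            raise ValueError("time stamps must be strictly increasing")
        result.append(math.hypot(x1 - x0, y1 - y0) / dt)
    return result


def distance_from_wall(x, y):
    """Distance (cm) from a point to the arena wall, origin at the arena center."""
    return ARENA_RADIUS_CM - math.hypot(x, y)


# Three invented samples: the animal moves 5 cm in the first second, then stops.
track = [(0.0, 0.0, 0.0), (1.0, 3.0, 4.0), (2.0, 3.0, 4.0)]

print(speeds(track))
print(distance_from_wall(3.0, 4.0))
```

Measures of this kind (speed, proximity to the wall) correspond to the sort of attributes from which patterns such as “moving near the wall of the arena” are built.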

Table 1 List of drugs

Drug             | Doses (mg/kg)              | Dataset I | Dataset II | Dataset III
Psychomotor stimulant
Cocaine          | 3.0, 5.6, 10.0, 17.0, 30.0 | X         |            |
Methamphetamine  | 0.3, 1.0, 1.7, 3.0         | X         |            |
Methylphenidate  | 1.7, 5.6, 10.0, 17.0       | X         |            |
Mazindol         | 1.0, 3.0, 5.6, 10.0        | X         |            |
Modafinil        | 30, 56, 100, 170           | X         |            |
Apomorphine      | 0.17, 0.3, 0.56, 1.0, 3.0  |           |            | –
Nomifensine      | 3.0, 5.6, 10.0, 20.0       |           |            | +
Opioid
Morphine         | 1.0, 3.0, 5.6, 10.0        | X         |            |
Oxycodone        | 1.0, 3.0, 5.6              | X         |            |
Fentanyl         | 0.056, 0.1, 0.17, 0.3      | X         |            |
Codeine          | 5.6, 17.0, 30.0, 56.0      | X         |            |
BW373U86         | 20.0                       |           | X          |
Buprenorphine    | 0.17, 0.3, 0.56, 1.0, 3.0  |           |            | ±
Hydromorphone    | 0.3, 1.0, 3.0              |           |            | +
Psychotomimetic
PCP              | 3.0, 5.6                   | X         |            |
SDZ220851        | 1.7, 3.0, 5.6              | X         |            |
Ketamine         | 5.6, 10.0, 17.0            | X         |            |
Salvinorin A     | 1.0, 3.0, 10.0             | X         |            |
MK801            | 0.1, 0.3, 1.0              |           | X          |
Memantine        | 10.0, 20.0, 30.0           |           | X          |
α-Adrenergic
Yohimbine        | 0.3, 3.0, 5.6              |           | X          |
Idazoxan         | 3.0, 5.6, 10.0             |           | X          |
Atipamezole      | 10.0, 17.0, 30.0           |           | X          |
Antidepressant
Fluoxetine       | 10.0, 20.0, 30.0           |           | X          |
Paroxetine       | 10.0, 20.0, 30.0           |           | X          |
Imipramine       | 10.0, 20.0, 30.0           |           | X          |
Desipramine      | 5.0, 10.0, 20.0, 30.0      |           | X          |
Nortriptyline    | 3.0, 5.6, 10.0             |           | X          |
Bupropion        | 10.0, 17.0, 30.0           |           | X          |
Venlafaxine      | 30.0, 40.0                 |           |            | –
Maprotiline      | 10.0, 20.0, 30.0           |           |            | +
Amitriptyline    | 10.0, 20.0                 |           |            | +
Antipsychotic
Haloperidol      | 0.1, 0.3                   |           | X          |
Chlorpromazine   | 0.1, 0.17, 0.3             |           | X          |
Amoxapine        | 3.0, 5.6                   |           | X          |
Clozapine        | 1.0, 1.7, 2.3, 3.0         |           | X          |
Olanzapine       | 0.3, 1.0                   |           | X          |
Risperidone      | 0.25, 0.56, 1.00           |           |            | +
Droperidol       | 0.1, 0.3, 1.0              |           |            | +
Loxapine         | 1.7, 3.0, 5.6              |           |            | +
Eticlopride      | 0.03, 0.17, 0.3            |           |            | +

The 41 drugs in this study were divided into six classes (psychomotor stimulant, opioid, psychotomimetic, antidepressant, antipsychotic, α-adrenergic) and three datasets. X: included in this dataset; +: correctly classified by PA; –: incorrectly classified by PA; ±: correctly limited by PA into two classes.

Drugs

In total, 41 drugs representing six drug classes (psychomotor stimulant, opioid, psychotomimetic, antidepressant, antipsychotic, and α-adrenergic) were investigated in this study (Table 1). Nomifensine maleate salt, apomorphine, morphine, hydromorphone hydrochloride, MK801, memantine hydrochloride, nortriptyline hydrochloride, venlafaxine hydrochloride, maprotiline hydrochloride, amitriptyline hydrochloride, risperidone, droperidol, loxapine succinate salt, eticlopride, and atipamezole were all purchased from Sigma-Aldrich (St. Louis, MO, USA).

Nomifensine, apomorphine, hydromorphone, maprotiline, amitriptyline, loxapine, eticlopride, and atipamezole were dissolved in deionized water vehicle. Morphine, MK801, memantine, nortriptyline, buprenorphine, and venlafaxine were all dissolved in 0.9 % saline. Risperidone was dissolved in acetic acid and then brought to appropriate concentration with saline (acetic acid, 0.2 %). Droperidol was dissolved in Tween and ethanol and then brought to appropriate concentration with saline (Tween, 20 %/ethanol, 2 %). Vehicle solutions were used as control. Apomorphine, morphine, oxycodone, fentanyl, codeine, salvinorin A, and hydromorphone were given subcutaneously (s.c.); all other drugs were given intraperitoneally (i.p.). All drugs were given at an injection volume of 10.0 ml/kg.

The classification of the drugs (Table 1) could be based on a pharmacological structure–activity basis, therapeutic application, or both. Since our goal was to predict a drug's clinical relevance, we have classified drugs based largely upon their clinical application in the case of known therapeutic drugs or their affective psychopharmacological effects in the case of experimental or abused drugs.
Notable examples of a discrepancy between a drug's known pharmacology and the class assigned here, which reflects its therapeutic use or commonly characterized affective psychopharmacological effect, were bupropion (a DA uptake inhibitor placed in the antidepressant class) and salvinorin A (a kappa opioid agonist placed in the psychotomimetic class). We left psychomotor stimulants and α-adrenergics as pharmacological classes: for the psychomotor stimulants, therapeutic use did not cover most of the tested drugs, and for the α-adrenergics, utility in psychiatry is limited, although it is a factor in some drug profiles. In addition to the potential classification conflicts cited above, it is recognized that many of the drugs are used for more than one indication. In all cases, we have utilized the primary indication for classification.

PA analysis

The PA algorithm was described in detail in Kafkafi et al. (2008) and in Kafkafi et al. (2009). Briefly, each path coordinate within the progression of the animal was represented by ten dynamical attributes, such as speed, acceleration, direction of movement, and direction change. The range of each attribute is partitioned into several indexed bins, and patterns of movement are then defined as combinations of bins from one or more attributes. We code these patterns using the same order as in the list of attributes and using asterisks (standing for “wildcards”) to denote attributes that can accept any value and are therefore irrelevant to the specific pattern definition. As more attributes are added to the definition of a pattern, it becomes more and more specific, e.g., the four-attribute pattern P{*,*,1,2,*,1,5,*,*,*} means “moving very slowly while slightly decelerating in the direction of the arena wall but turning sharply away from it”. As in a previous work, we do not consider patterns of more than four attributes because this would amount to an astronomical number of combinations, most of them so over-specified that they rarely occur in normal behavior. All possible bin combinations of up to four attributes amount to a total of 73,042 different behavior patterns. An animal's use of each pattern is computed as the total time it spent in this pattern, expressed as a percentage of its total progression time during the session and then logit-transformed, as is common for ratios. Mining of patterns is performed by testing the difference between experimental groups in each pattern, employing the animal's use as the dependent variable.

Mining and validation strategy

In this study, we use data mining to establish a classification model (for a comprehensive introduction, see Tan et al. 2006), separating six psychopharmacological classes of clinical importance: antidepressants, antipsychotics, α-adrenergics, psychomotor stimulants, opioids, and psychotomimetics. We used data from a previous study (Table 1, dataset I; Kafkafi et al. 2009) as well as new data (Table 1, dataset II). We also utilized another previously collected dataset of naïve mice of ten inbred strains across three laboratories (Kafkafi et al.
2005) to further screen discovered predictor behaviors for high heritability and replicability. The data, consisting of path coordinates of the mice in the arena, are first quantified into a large number of behavior patterns, and the frequency of using each pattern by each animal is measured. The classification model is then “trained” by mining for patterns that best discriminate these drug classes. Finally, we validate the model by its ability to correctly classify additional drugs that it did not encounter during the training process (Table 1, dataset III), a simulation of novel compound classification in drug discovery.

A hierarchical strategy is adopted in this study in order to navigate the larger number of drug classes. In the first level, we mine for general predictors capable of overall dose–response detection and good separation of all classes in the database. As seen in Fig. 1, just two predictor patterns were sufficient to limit a compound into one, two, or (rarely) three likely classes. These remaining ambiguities are then resolved in the second level using “discriminator patterns”, each mined for the best discrimination of just two classes.

[Figure 1: two panels, “Mining” (left) and “Validation” (right); horizontal axis: Universal Drug Detector; vertical axis: General Class Identifier]

Fig. 1 Separation of classes in the plane of the universal drug detector (horizontal axis) vs the general class identifier (vertical axis) in mining datasets I and II (left) and in validation dataset III (right). Each arrow denotes a dose–response curve, going out from vehicle through several doses. Each dose group is represented by its median. Axis units are given in both (logit-scaled) percentage of pattern use out of total progression time of the animal in the session (primary axes) and in vehicle standard deviations (secondary axes). Doses are detailed in Table 1 using the same class color coding. The antidepressant, antipsychotic, and α-adrenergic drugs are further separated in Fig. 2

As in the previous study, patterns were mined mainly by their p-values in standard statistical tests, but the specific tests used in the present study were slightly different (see “Results”). As in previous studies, we addressed the statistical issue of multiple comparisons (e.g., Benjamini and Yekutieli 2005) by employing the most conservative multiple comparisons criterion, the Bonferroni criterion. This criterion is a corrected significance level of α/n, where n is the number of comparisons, thus ensuring that the probability of even a single “false positive” is less than α. The comparison of 73,042 different patterns using α = 0.05 yields a Bonferroni criterion of p