Identification of Autism Spectrum Disorder using Deep Learning and the ABIDE Dataset
Supplementary Material

Anibal Sólon Heinsfeld (a), Alexandre Rosa Franco (b,c,d), R. Cameron Craddock (f,g), Augusto Buchweitz (b,d,e), and Felipe Meneguzzi (a,b)

(a) PUCRS, School of Computer Science, Porto Alegre 90619, Rio Grande do Sul, Brazil
(b) PUCRS, Brain Institute of Rio Grande do Sul (BraIns), Porto Alegre 90619, Rio Grande do Sul, Brazil
(c) PUCRS, School of Engineering, Porto Alegre 90619, Rio Grande do Sul, Brazil
(d) PUCRS, School of Medicine, Porto Alegre 90619, Rio Grande do Sul, Brazil
(e) PUCRS, School of Humanities, Porto Alegre 90619, Rio Grande do Sul, Brazil
(f) Center for the Developing Brain, Child Mind Institute, New York, New York 10022, USA
(g) Nathan Kline Institute for Psychiatric Research, Orangeburg, New York 10962, USA

This manuscript was compiled on June 30, 2017

1. Methods

We carried out additional experiments using more traditional machine learning methods, namely Support Vector Machines (SVM) [1] and Random Forest classifiers [2]. We also employed other parcellation methods for comparison with the results from the CC200 scheme used in the paper. Dimensionality reduction techniques were also run to visualize the distribution of the classification. We describe these methods in this section and the results in Section 2.

2. Results

Additional analyses were carried out by training other classifiers using 5-fold cross-validation on the same dataset. Specifically, we re-trained a DNN using this data arrangement, with results shown in Table ??; we trained an SVM classifier, with results in Table 1; and we trained a random forest classifier, with results in Table 2.

Table 1. Leave-site-out 5-fold cross-validation results using SVM

Site-Out    Size   Accuracy   Sensitivity   Specificity
CALTECH       37       0.60          0.86          0.32
CMU           27       0.60          0.88          0.32
KKI           48       0.61          0.83          0.38
LEUVEN        63       0.60          0.86          0.33
MAX_MUN       52       0.60          0.83          0.37
NYU          175       0.64          0.74          0.54
OHSU          26       0.61          0.85          0.35
OLIN          34       0.59          0.89          0.28
PITT          56       0.59          0.89          0.27
SBL           30       0.60          0.85          0.33
SDSU          36       0.60          0.84          0.36
STANFORD      39       0.60          0.87          0.32
TRINITY       47       0.60          0.83          0.36
UCLA          98       0.56          0.96          0.14
UM           140       0.60          0.87          0.31
USM           71       0.56          0.95          0.13
YALE          56       0.60          0.88          0.30
Mean          60       0.60          0.86          0.32

Table 2. Leave-site-out 5-fold cross-validation results using Random Forest

Site-Out    Size   Accuracy   Sensitivity   Specificity
CALTECH       37       0.63          0.65          0.61
CMU           27       0.65          0.68          0.61
KKI           48       0.63          0.64          0.63
LEUVEN        63       0.65          0.68          0.62
MAX_MUN       52       0.65          0.67          0.63
NYU          175       0.65          0.61          0.68
OHSU          26       0.65          0.70          0.61
OLIN          34       0.65          0.69          0.62
PITT          56       0.65          0.68          0.60
SBL           30       0.65          0.66          0.63
SDSU          36       0.65          0.66          0.64
STANFORD      39       0.63          0.68          0.59
TRINITY       47       0.64          0.66          0.61
UCLA          98       0.66          0.70          0.62
UM           140       0.61          0.61          0.61
USM           71       0.62          0.68          0.55
YALE          56       0.62          0.64          0.60
Mean          60       0.64          0.66          0.61

To exclude the possibility that the parcellation scheme accounts for the differences in classification, we trained classifiers using features generated with other parcellations, namely the Automated Anatomical Labeling (AAL) atlas and the Dosenbach 160 atlas. The AAL atlas distributed with the AAL Toolbox [3] was fractionated to functional resolution (3×3×3 mm³) using nearest-neighbor interpolation. The Dosenbach 160 atlas distributed with DPARSF/DPABI includes 160 spheres of 4.5 mm radius placed at coordinates from Table S6 in Dosenbach et al. [4]. These regions were identified from meta-analyses of task-related fMRI studies. The results suggest that the CC200 parcellation may increase accuracy for our classifier, but the relative accuracies among the classifiers remained the same. The results are shown in Table 3.

Table 3. Leave-site-out 5-fold cross-validation using other parcellation masks.

Parcellation    Accuracy   Sensitivity   Specificity
CC200               0.70          0.74          0.63
AAL                 0.66          0.75          0.58
Dosenbach160        0.62          0.67          0.57
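
The tables above report accuracy, sensitivity, and specificity under leave-site-out evaluation. The following is a minimal sketch of how such splits and metrics can be computed, not the authors' code. It assumes that each site is held out once as the test set (how this combines with the 5 folds named in the captions is not stated here), that ASD is the positive class, and it uses made-up labels and predictions in place of a trained classifier. The function names `leave_site_out_splits` and `binary_metrics` are hypothetical.

```python
from collections import defaultdict


def leave_site_out_splits(sites):
    """Yield (held_out_site, train_indices, test_indices), one split per site.

    `sites` holds one site label per subject, in subject order.
    """
    indices_by_site = defaultdict(list)
    for index, site in enumerate(sites):
        indices_by_site[site].append(index)

    for held_out in sorted(indices_by_site):
        test_indices = indices_by_site[held_out]
        # Every subject from any other site goes into the training set.
        train_indices = [
            index for index, site in enumerate(sites) if site != held_out
        ]
        yield held_out, train_indices, test_indices


def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity and specificity for binary labels.

    `positive` marks the ASD class (an assumption of this sketch).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Sensitivity: fraction of positive (ASD) subjects correctly identified.
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        # Specificity: fraction of negative (TC) subjects correctly identified.
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


# Toy data: six hypothetical subjects from three sites (1 = ASD, 0 = TC).
sites = ["NYU", "NYU", "UM", "UM", "YALE", "YALE"]
labels = [1, 0, 1, 0, 1, 0]
predictions = [1, 1, 1, 0, 0, 0]  # made-up classifier output

for held_out, train_idx, test_idx in leave_site_out_splits(sites):
    fold_true = [labels[i] for i in test_idx]
    fold_pred = [predictions[i] for i in test_idx]
    print(held_out, len(train_idx), binary_metrics(fold_true, fold_pred))
```

In an actual experiment, a classifier would be fitted on the subjects in `train_idx` before predicting the held-out site. Read this way, the combination of high sensitivity and low specificity in Table 1 indicates that the SVM labels most subjects as ASD, whichever site is held out.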

Finally, to visualize the distribution of the deep learning classification, we employed the t-Distributed Stochastic Neighbor Embedding (t-SNE) [5] dimensionality reduction technique to project TC and ASD subject classifications onto 2D and 3D spaces. We generated such projections from both autoencoders used in our classifiers, as shown in Table 4. The visualization illustrates how narrowing the features into the 600 neurons at the end of the two autoencoders helped to discriminate between clusters of subjects.

Table 4. The t-SNE projection of the ABIDE dataset, generated with Google TensorBoard. t-SNE parameters: perplexity 9, learning rate 100, 268 iterations. Orange: TC; blue: ASD.

[Table 4 images not recoverable. Panels: 2D and 3D projections for each autoencoder (1st: 1000 features; 2nd: 600 features).]

1. Vapnik V (1998) The support vector method of function estimation in Nonlinear Modeling. (Springer), pp. 55–85.
2. Ho TK (1995) Random decision forests in Proceedings of the Third International Conference on Document Analysis and Recognition. (IEEE), Vol. 1, pp. 278–282.
3. Tzourio-Mazoyer N, et al. (2002) Automated anatomical labeling of activations in SPM using a macroscopic anatomical parcellation of the MNI MRI single-subject brain. NeuroImage 15(1):273–289.
4. Dosenbach NUF, et al. (2010) Prediction of individual brain maturity using fMRI. Science 329(5997):1358–1361.
5. Maaten Lvd, Hinton G (2008) Visualizing data using t-SNE. Journal of Machine Learning Research 9(Nov):2579–2605.

Heinsfeld et al.
