COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS

BRAIN-MACHINE INTERFACES USES AND DEVELOPMENTS

No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS

Additional books in this series can be found on Nova's website under the Series tab.

Additional e-books in this series can be found on Nova’s website under the eBooks tab.

COMPUTER SCIENCE, TECHNOLOGY AND APPLICATIONS

BRAIN-MACHINE INTERFACES USES AND DEVELOPMENTS

CARLA BRYAN AND IVAN RIOS

EDITORS

Copyright © 2018 by Nova Science Publishers, Inc. All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher. We have partnered with Copyright Clearance Center to make it easy for you to obtain permissions to reuse content from this publication. Simply navigate to this publication’s page on Nova’s website and locate the “Get Permission” button below the title description. This button is linked directly to the title’s permission page on copyright.com. Alternatively, you can visit copyright.com and search by title, ISBN, or ISSN. For further questions about using the service on copyright.com, please contact: Copyright Clearance Center Phone: +1-(978) 750-8400 Fax: +1-(978) 750-4470 E-mail: [email protected].

NOTICE TO THE READER The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the readers’ use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works. Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication. This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS. Additional color graphics may be available in the e-book version of this book.

Library of Congress Cataloging-in-Publication Data
ISBN: 978-1-53613-368-4

Published by Nova Science Publishers, Inc., New York

CONTENTS

Preface                                                                vii

Chapter 1   Advances in the Development of a Speech Prosthesis          1
            P. R. Kennedy, A. J. Cervantes, C. Gambrell and P. Ehirim

Chapter 2   EEG Pattern Differences in Motor Imagery Based Control
            Tasks Used for Brain-Computer Interfacing: From Training
            Sessions to Online Control                                 43
            Luz Maria Alonso-Valerdi and Francisco Sepulveda

Chapter 3   Direct and Indirect Benefits of Translingual
            Neurostimulation Technology for Neurorehabilitation of
            Chronic Stroke Symptoms                                    69
            Dafna Paltin, Yuri P. Danilov and Mitchell E. Tyler

Chapter 4   Essential Trick for Long-Term Recording from the
            Human Brain                                                85
            Phil Kennedy

Index                                                                  91

PREFACE

Brain-Machine Interfaces: Uses and Developments reports on advances in the development of a speech prosthetic, building on previous data as well as the results of detecting phonemes, words and phrases during overt and covert speech. The following study aims to quantify and qualify the electroencephalographic (EEG) patterns of commonly used control tasks in BCI systems under different task states. The analysed control tasks were: left hand motor imagery (MI), right hand MI, and a relaxed but focused mental state. The original feasibility study within this manuscript aimed to evaluate the scope of applications for a novel neurorehabilitation intervention; important observations from that initial study are examined, and possible applications of TLNS technology in the future are considered. The closing opinion piece seeks to outline why the development of an electrode that does not encourage growth into the electrode tip is ill-advised, with the core reasons being rejection and “less is more”.

Chapter 1 - Advances in the development of a speech prosthesis are reported in this chapter. Two patients have been implanted: a locked-in, mute and paralyzed male in 2004 (ER) and a speaking male in 2014 (PK). ER’s data have been reported in various papers [10-14], including functional neural data at year nine (in press, 16). Building on ER’s data, motor and sensory relationships are described for subject PK, as well as the results of detecting phonemes, words and phrases during overt and covert speech.

Different techniques are described including using single unit data and single unit bursts, both from averaged and individual trials. A neural net Fitting program from Matlab using bursting units more reliably detects words and phrases than traditional single unit rate analysis. Once the Fitting program is trained there is minimal delay in detection of words or phrases, which is required for near-conversational speech. A roadmap is described that outlines the steps needed to provide speech to mute individuals.

Chapter 2 - Brain-computer interfaces (BCIs) are promising systems that attempt to replace the function of the brain output pathways by using the brain signals to control a device of interest. Investigating the control tasks (specifically motor imagery (MI) tasks) used to operate a BCI system under different demanding conditions may explain the difficulty of employing this type of system outside the laboratory. Therefore, the present study set out with the aim of quantifying and qualifying the electroencephalographic (EEG) patterns of commonly used control tasks in BCI systems under different task states. The analysed control tasks were three: left hand MI, right hand MI, and a relaxed but focused mental state. The different task states referred to eight different scenarios, whereby a random sample of eleven participants were guided from modulating their brain signals using MI related control tasks to using those control tasks for selecting activities of daily living in simulated living situations. The EEG patterns were analysed in line with the EEG features that best differentiated among the three control tasks, and the electrophysiological origin (recording sites, frequency bands, and time windows) of those features. Taken together, the findings of this study highlight the impact of human brain processing on BCI system performance. It has been demonstrated that the EEG patterns of MI related control tasks are not only determined by MI activity per se, but are also defined by the processing of internal (e.g., navigation strategy and decision making) and external (e.g., sensory stimuli or number of tasks to be attended) events associated with the working environment. The investigation of the environmental effects on the user control tasks is very important in order to achieve the desirable overt and covert adaptation in BCI systems.

Chapter 3 - Objectives: The original feasibility study discussed within this manuscript aimed to evaluate the scope of applications for a novel neurorehabilitation intervention. This retrospective piece was composed several years after the completion of the study and focuses on two interesting subjects in particular whose opposing symptomatologies represent the vast spectrum of chronic symptoms after stroke. The present paper brings to light important observations from that initial study and considers possible applications of TLNS Technology for the future. Materials and Methods: Participants (N = 5) enrolled in a 13-month clinical intervention comprising three components: (1) translingual neurostimulation (TLNS) using the PoNSTM device, (2) targeted training, and (3) physical exercise. These components will be elaborated on later in the paper. For the first six months of the intervention the participants practiced all three components for one hour, twice daily. Following a thirty-day withdrawal period, participants resumed exercise, training, and device use for an additional six months. Measurements of observation included the Sensory Organization Test (SOT), Dynamic Gait Index (DGI), Timed Up and Go (TUG) test, Stroke Impact Scale (SIS), Dysarthria Impact Profile (DIP), and the Quick Inventory of Depressive Symptoms, Self-Report (QIDS-SR16). Results: Two specific cases were selected because they demonstrate that the scope of TLNS Technology, as a platform technology, is much wider than previously thought, thus opening more doors in the direction of integrative neuroscience and combination therapy. Conclusion: The authors hypothesize that the beneficial effects observed resulted from lasting and cumulative neuroplastic changes (functional, synaptic, and neuronal) in the brainstem and cerebellum at the cellular and neural network levels, elicited by a powerful flow of neural impulses (spikes) from the tongue. The original feasibility study importantly breaks ground for scientific inquiry regarding the limits of rehabilitation and improvement of symptoms for individuals in the chronic stage of stroke recovery. The observations reported here present an opportunity for applications of a new non-invasive brain stimulation technique in the applied and rehabilitative neurosciences.

Chapter 4 - Brain computer interfacing is becoming fashionable. All sorts of electrodes are being proposed for invasive recording that will connect the brain with a computer or other device to restore communication, movement or speech. Almost all invasive techniques involve piercing the cortex and recording with the hope that the signals will stay strong and functional for the lifetime of the subject. A hope is not a plan, however. Data clearly show that the only way to provide functional, life-long recording is by growing the neuropil into the electrode tip and tricking it to stay there. There is one exception to the bevy of electrodes now being researched. The exceptional electrode is the Neurotrophic Electrode, originally known as the Cone Electrode because the tip is a glass cone. This cone is 1 to 1.5 mm in length, 50 microns outer diameter at its tip and several hundred microns at the upper end where Teflon-insulated gold wires are inserted to record the neural firings. The key feature is the trophic factor(s) that entice the neuropil to enter and grow through the cone and join with the neuropil at the other end. Destruction of neuropil on insertion is also essential to encourage growth since destruction triggers release of endogenous growth factors. The result is that neuropil is trapped inside the cone and cannot escape because it has grown into the external neuropil at both ends of the cone! Nor can the neurobiological processes reject the cone! The biocompatible wires inside the biocompatible glass cone record the activity of the myelinated axons that form at about three weeks, provide stable recordings by three months, and continue for a decade. Recordings with the Neurotrophic Electrode are long lasting indeed. Until now, no electrode had recorded for a decade. The data on the Neurotrophic Electrode provide evidence not only that the same single units continued to be recorded, but also that the units remained functional and could be conditioned at year nine. (The locked-in subject was too ill to record beyond year 10 and recently died due to his deteriorating health.) Other evidence is provided in the list of references below. The point of this brief opinion piece is to explain why the general development of any electrode that does not encourage growth into the electrode tip is misguided. There are two main reasons.

In: Brain-Machine Interfaces
Editors: Carla Bryan and Ivan Rios
ISBN: 978-1-53613-368-4
© 2018 Nova Science Publishers, Inc.

Chapter 1

ADVANCES IN THE DEVELOPMENT OF A SPEECH PROSTHESIS

P. R. Kennedy 1,*, PhD, A. J. Cervantes 2, MD, C. Gambrell 3 and P. Ehirim 4, MD

1 Neural Signals Inc., Duluth, GA, US
2 Neurosurgical Clinic, Belize City, Belize
3 Neural Signals Inc., Duluth, GA, US
4 Neurosurgical Brain and Spine, Gwinnett Medical Center, Lawrenceville, GA, US

ABSTRACT

Advances in the development of a speech prosthesis are reported in this chapter. Two patients have been implanted: a locked-in, mute and paralyzed male in 2004 (ER) and a speaking male in 2014 (PK). ER’s data have been reported in various papers [10-14], including functional neural data at year nine (in press, 16). Building on ER’s data, motor and sensory relationships are described for subject PK, as well as the results of detecting phonemes, words and phrases during overt and covert speech. Different techniques are described, including using single unit data and single unit bursts, both from averaged and individual trials. A neural net Fitting program from Matlab using bursting units more reliably detects words and phrases than traditional single unit rate analysis. Once the Fitting program is trained there is minimal delay in detection of words or phrases, which is required for near-conversational speech. A roadmap is described that outlines the steps needed to provide speech to mute individuals.

* Corresponding Author Email: [email protected].

INTRODUCTION

The effort to interface the human brain to a speech prosthetic that will provide at least 100 useful words or phrases at a near conversational rate is advancing. The implantable Neurotrophic Electrode is based on the ingrowth of neuropil into its 1.5 mm glass tip, thereby securing the neural signals for long-term recording. Recent data show that the Neurotrophic electrode used for the interface provides long lasting recordings that were functional nine years after implantation [16]. This chapter primarily describes efforts to detect phonemes, words and phrases during overt and covert speech in a speaking individual. Brain to computer or machine interfaces are served by many approaches. External electrode placement as in electroencephalographic (EEG) recordings [17-19], brain surface recordings (Electrocorticography, ECOG) [20-23] or subsurface recordings such as with Tine type [24-27] or Neurotrophic electrodes [1-5, 9-13, 16] are available. External EEG approaches are emphasized because of their simplicity and safety. Subsurface techniques are advantageous when high resolution recordings are needed such as with fine control of paralyzed digits, fine control of robots or near-conversational rate speech. Invasive sub-cortical recordings are obviously more dangerous than external EEG electrodes, but not much more dangerous than ECOG, one difference being penetration 5 mm into the cortex. Furthermore, invasive electrodes are no more dangerous than placement of deep brain stimulation electrodes that are now in routine clinical use [28, 29]. The dangers of subcortical placement are hemorrhage, seizures, weakness and infection which are very rare in the hands of

qualified neurosurgeons. The Neurotrophic electrodes, being more bulky than tine type electrodes, will obviously damage more cortical neuropil. Since the electrodes are meant to be placed in non-functioning cortex of paralyzed and/or mute subjects, this damage may have little relevance unless it is extensive. Furthermore, the damage is required to generate growth of neuropil into the hollow 1.5 mm glass electrode tip. Neurotrophic factors are also important in enticing growth into the tip. Invasive type electrodes can record from single units whose firing patterns are thought to closely reflect the underlying cortical function, even dormant function that is present in paralyzed and mute people [9, 16]. While providing high resolution recordings, they may not be needed in situations such as communication with a computer where EEG or remnants of muscle activity (EMG) continue to provide a sufficient link [17-19]. In 2004, the invasive Neurotrophic electrode was implanted in a paralyzed male for studies on speech regeneration with recordings continuing for more than a decade [10-14, 16]. He remained paralyzed and mute until he died in 2017 due to his underlying deteriorating health. A major limitation during these studies was his mutism: It was not possible to discern the pattern of firing of units during overt speech since he had none. Using an upward eye movement, he did confirm whether or not he tried to speak covertly (in his head). Analysis of such covert recordings did yield useful data [10-14, 16]. However, the lack of overt and covert speech has long been recognized as a handicap during interpretation of these studies. Hypothetically, the pattern of firings during overt speech should map fairly closely to those during covert speech. To discern the relationship between brain activity and intended speech, it would be ideal to implant subjects prior to the time of speech loss. This, of course, would only be possible in patients with progressive diseases, such as amyotrophic lateral sclerosis (ALS) in whom the loss of communication abilities can be predicted. Such patients could be implanted, so that the single unit firing patterns, the link of such patterns to speech, and the stability of these links could be determined early, and be used as soon as needed. While early intervention in cases of diseases associated with speech decline may be advantageous,

the obvious risk of hastening the loss of speech by the electrode placement exists. In the absence of suitable volunteers in early stages of a progressive speech production disorder, we chose a speaking healthy volunteer (PK) to be implanted with Neurotrophic electrodes in the speech motor cortex, so that the relationship between brain activity and overt and covert speech could be more meaningfully explored. The data generated during this experiment add to those from subject ER, much of whose data have been published. Data from both subjects are presented here, demonstrating that phonemes, words and phrases can be detected when spoken overtly or covertly.

METHODS

Electrode

The electrode assembly and usage have been detailed elsewhere [9]. Briefly, 2 mil Teflon-insulated gold wires are coiled around a pipette and glued with methyl-methacrylate inside a glass cone. The cone is made by pulling a heated pipette and drawing the tip to the required dimensions, which are 1.5 mm in length, 25 microns at the deep end and a few hundred microns at the upper end to allow space for the inserted wires. The other end of each coiled gold wire is soldered into a connector that will plug into the implanted electronic component. This is diagrammatically shown in Figure 1A.

Implanted Electronics

We implanted three single channel amplifiers as we have done in previous studies. The amplifiers are assembled in-house. Bipolar amplifiers record from pairs of wires via the low impedance (50 to 500 kOhm) gold wires that are cut across the tip to provide the low impedance recordings. These

connect to an FM transmitter operating in the carrier 35 to 55 MHz range. The amplifier has a gain of 100x and is filtered between 5 and 5,000 Hz. During recording sessions, a power induction coil powers the device with the induced current passing through a regulator to provide +/- 3 volts. The electronics is insulated with a polymer (Elvax: Ethylene Vinyl Acetate Copolymer Resin, from DuPont, Wilmington, Delaware 19898) and further insulated (and protected against trauma) with Silastic (Med-6607, NuSil Silicone Technology, Carpinteria, CA). The gold pin connection to the electrodes is protected with acrylic cement (Medtronic Inc., St. Paul, MN). The whole implant is covered with scalp skin. A lateral X-ray of the skull is shown in Figure 1B indicating that three of the eight pairs of electrode wires were attached to three sets of connecting pins that are, in turn, attached to three electronic amplifiers and FM transmitters. Other wires were not connected as it was impossible to place more than three amplifier devices with power induction coils subcutaneously.

Figure 1. Methods of Implantation. A: Diagram of electrode with two wires. Two cones with a total of four wires were inserted in PK. As shown, neurons outside of the tip extend neurites into the implanted glass cone. The neurites become myelinated within three weeks [1]. B: Lateral skull X-ray of the subject PK, taken after the electrode implantation procedures. Four electrode tips were implanted 6 mm apart. Each electrode tip contained four wires. This view illustrates three pairs of wires (middle of X-ray) attached to three sets of connector pins. These pins are attached to three devices (too bulky to fit four or more devices) that are attached to three power induction coils (on right surface). C: Functional MRI axial slices showing the high blood flow (yellow areas) during jaw, tongue and lip articulations in subject PK. Note the bilateral increases and the dual representation of lip articulations. Each articulation was repeated 10 times and included jaw open and close, tongue out and back, and lip pout and grin. D: Surgical exposure shows the target above the Sylvian fissure and anterior to the central sulcus (upper panel). Nose is to the right, ear is above. The lower panel shows an electrode being implanted. All data in these figures are from subject PK.

Implantation Target Site

Prior to the implantation procedures, functional MRI was employed to localize areas of articulation in the subject (PK). Articulatory movements consisted of protrusion and retraction of the tongue, jaw closing and opening, and cheek grinning and pouting. These are bilaterally represented as shown in Figure 1C. The speech motor area is localized 3 cm medial to the Sylvian fissure in primary motor cortex. Note that operation of this prosthesis is a motor task since it is the neural signals associated with movement of the articulators that are recorded. This localization is similar to what Bouchard and Chang [30] found using ECOG recordings in humans. Their data also suggest that articulators are topographically laid out.

Surgery

At a left-sided craniotomy on June 21, 2014, four Neurotrophic electrodes were implanted 6 mm apart. The electronic components were subsequently implanted in October 2014. Electrodes and electronics had to be removed in January 2015, due to dehiscence of the incision. Side effects after the first surgery included transient motor aphasia (loss of speech and writing ability) that began on day 2 postop, started to recover at day 5, and had almost resolved three weeks post-surgery. In addition, there was brain swelling of the left hemisphere that resulted in mild weakness on the right side, with a mild ataxia due to the weakness; both recovered within a few weeks. A simple partial seizure of the opposite face and jaw occurred and was successfully treated with levetiracetam 500 mg bid followed by lacosamide 100 mg bid for a further 6 months. The subject also suffered mild anxiety but no depression. Following the October surgery there was atrophy of the left temporalis muscle secondary to compression of the first branch of the facial nerve over the zygomatic arch due to continuous retraction of the muscle during the surgery. The subject has since fully returned to normal function. Figure 1D shows the target above the

Sylvian fissure and anterior to the central sulcus (upper panel) and an electrode being implanted (lower panel).

Recording

Single channel FM transmitters were implanted at the time of the October 2014 second surgery. During recording sessions, a power coil and FM receiving coil were adhered securely to the scalp using C2 EEG paste (Natus Manufacturing, Gort, Ireland). Briefly, the implanted recording amplifiers have gains of 100x with a bandpass filter of 5 to 5,000 Hz [1, 7]. The FM receivers (WinRadio Inc, Oakleigh, Australia) are tuned to the FM frequencies in the range of 35 to 55 MHz. An external amplifier for each channel (BMA-200, CWE Inc. Ardmore, PA) has a gain of 100x, with a bandpass filter of 1 to 10,000 Hz. Continuous data streams are archived on a DDS digital tape recorder (20 kHz for the neural signal inputs, SCSI 16 channel, from Cygnus Inc., PA, USA). Data are subsequently sent to the Cheetah software (Neuralynx, Bozeman, MT) and separated into two streams: the single units are filtered between 300 and 6,000 Hz, while the continuous data are filtered between 5 and 6,000 Hz. Artifacts can occur when the induction coils slip off target, or when they are too close to the underlying power receiving coil. Such artifacts were detected and rejected as described previously [7, 9, 13]. The impedances of electrode wires were measured when the electronics were implanted at the second surgery in subject PK, and at replacement of electronics in subject ER.
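The band split just described (a spike band and a continuous band derived from the same archived 20 kHz stream) can be reproduced offline. The following MATLAB sketch shows one way to do it under stated assumptions: the filter order, the use of zero-phase filtering, and all variable names are illustrative and are not taken from the chapter.

    % Offline band-splitting of one archived channel, per the bands given above.
    % Assumptions: 20 kHz sampling, 3rd-order Butterworth filters, zero-phase filtering.
    fs  = 20000;                                 % sampling rate of the archived stream (Hz)
    raw = randn(1, 10*fs);                       % stand-in for one channel of archived data

    [bS, aS] = butter(3, [300 6000]/(fs/2), 'bandpass');   % single-unit (spike) band
    [bC, aC] = butter(3, [5 6000]/(fs/2),   'bandpass');   % continuous (low-frequency) band

    spikeBand      = filtfilt(bS, aS, raw);      % zero-phase filtering preserves spike timing
    continuousBand = filtfilt(bC, aC, raw);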

Paradigm

The paradigm was to speak, overtly or covertly, a phoneme, word or phrase while recording from one of the three implanted electrodes and occasionally from two electrodes simultaneously in PK. During covert speech an event marker was pressed with the left hand. The left hand was preferred to the right so as to avoid contaminating the data if the

contralateral right hand were to be used. An example paradigm is shown in Figure 2, lower panel.

Spike Sorting

Cheetah Spike Sorting software (Neuralynx Inc., Bozeman, MT) was used to sort the continuous data streams into single or multi-units, using a convex hull algorithm which uses parameters such as peak, valley, height or area under the curve of the presumptive spikes, as shown in Figure 2. The program first classifies units into upward or downward deflecting potentials, thus creating two channels of data, and then applies peak and valley parameters for unit separation. We have found that the potentials recorded with our system were stable from one recording session to the next, likely because they were recorded from the same neural elements with the same electrodes in the same topographic arrangement [16]. Once finalized, the spike sorting parameters are used across multiple recording sessions. The cluster parameters were only re-sorted when the recording system was disturbed (for instance, when electronic components were damaged and surgically replaced, which was required twice in ten years with subject ER). The gains of the implanted amplifiers vary from one amplifier to another. This merely increases or decreases the amplitudes of the signals, making re-identification of the signals relatively simple.
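The per-spike parameters that the convex-hull cluster cutting operates on (peak, valley, height, area under the curve) are simple to compute from threshold-detected waveform snippets. The sketch below is an illustrative reconstruction only: the detection threshold, window length, and variable names are assumptions, and the actual cluster cutting was performed interactively in the Cheetah software.

    % Per-spike features used for cluster cutting (illustrative reconstruction).
    fs        = 20000;                         % samples per second
    spikeBand = randn(1, 10*fs);               % stand-in for the 300-6,000 Hz filtered stream
    thr       = 4 * std(spikeBand);            % detection threshold (assumed ~4 x noise SD)
    win       = round(0.5e-3 * fs);            % 0.5 ms of waveform on either side of a crossing

    idx = find(abs(spikeBand(win+1:end-win)) > thr) + win;   % threshold crossings
    idx = idx([true, diff(idx) > win]);                      % keep the first sample of each event

    feat = zeros(numel(idx), 4);               % columns: peak, valley, height, area
    for k = 1:numel(idx)
        w = spikeBand(idx(k)-win : idx(k)+win);
        feat(k, :) = [max(w), min(w), max(w) - min(w), trapz(abs(w))];
    end

    % Upward- versus downward-deflecting events, as described in the text:
    upward = feat(:, 1) > abs(feat(:, 2));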

Figure 2. Recorded Data. A: Multi-unit activity from the Neurotrophic Electrode before cluster cutting. B: Two parameter layout of single units, for example, height versus width. By circling clouds of units, the Convex Hull technique excludes other units. C: Included units are displayed, obviously with more than one unit. Note the white cross where a cut was made. D: Units above and below the peak are excluded revealing a single unit. E: Multiple single units are displayed after cluster cutting. All data in these figures are from subject ER. Lower panel: The paradigm consisted of a continuous data stream recorded during overt or covert speech, together with an event marker. In the case of covert speech, the event marker was helpful in deciding when speech began. The presence of Beta peaks also helped (see text).

To determine if the detected units were single or multi-units, interspike interval histograms (ISIHs) were helpful. Given that axons (which are what is inside the cone, there being no neurons) have a refractory period of about 1 ms, a gap in the ISIH below about 1 ms suggested that the spikes came from a single axon.
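A minimal sketch of this ISIH check follows. The timestamps, bin width, and the programmatic 1 ms criterion are illustrative assumptions; in the chapter the histograms were judged by inspection.

    % Inter-spike interval histogram (ISIH) and refractory-gap check for one sorted unit.
    ts    = cumsum(0.002 + 0.02*rand(1, 500));   % stand-in spike timestamps (s), ~80 Hz unit
    isi   = diff(ts) * 1000;                     % inter-spike intervals in ms
    edges = 0:0.25:25;                           % 0.25 ms bins out to 25 ms
    n     = histcounts(isi, edges);              % the ISIH itself

    % A near-empty first ~1 ms (beyond the 0.264 ms hardware dead time noted later in
    % the chapter) is taken as evidence that the cluster reflects a single axon.
    shortISIs = sum(isi < 1) / numel(isi);       % fraction of ISIs shorter than 1 ms
    fprintf('ISIs under 1 ms: %.2f%%\n', 100 * shortISIs);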

Frequency Domain Analysis

Fast Fourier transforms of the continuous signals also contributed to the results. Beta peaks have already been described in subjects implanted intra-cortically and externally, and are defined as 12 to 20 Hz increases above baseline in a ratio of 100:1. In the present study, both the Beta peaks and the event markers are used to determine the onset of speech. Beta peak data have been published [14].
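As an illustration, the beta-peak criterion can be checked by comparing the 12 to 20 Hz magnitude of an FFT taken around an attempted utterance against a baseline (rest) segment. The window lengths, the choice of rest segment, and the variable names below are assumptions rather than details taken from the chapter.

    % Beta-peak check on the continuous data stream; all segment choices assumed.
    fs    = 20000;
    sig   = randn(1, 4*fs);                    % stand-in for one continuous data stream
    rest  = sig(1:fs);                         % assumed 1 s rest (baseline) segment
    trial = sig(2*fs + 1 : 3*fs);              % assumed 1 s segment around attempted speech

    f        = 0:fs-1;                         % 1 Hz bins for a 1 s window
    betaBand = f >= 12 & f <= 20;              % the 12-20 Hz band described above

    Rrest  = abs(fft(rest));
    Rtrial = abs(fft(trial));

    betaRest   = mean(Rrest(betaBand));
    betaTrial  = mean(Rtrial(betaBand));
    isBetaPeak = betaTrial > 100 * betaRest;   % the 100:1 increase over baseline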

Detecting Single Phonemes

Analysis focused on five phonemes that had the most single unit activity associated with them (Bead (‘eeh’), Chin (‘ch’), Go (‘guh’), Judge (‘juh’), Fine (‘feh’)). Single unit bursts (and consequent quiet periods) were assessed over the time in which the phoneme was being spoken overtly or covertly. The bursts were scored by visual examination, according to their peaks and troughs, as plus or minus, independent of their duration or amplitude. Scoring consisted of: 1 = an increase or decrease in firing modulation; 2 = an increase followed by a decrease; 3 = a decrease followed by an increase; 4 = an increase followed by a decrease followed by an increase; 5 = a decrease followed by an increase followed by a decrease. This analysis was completed for each of the 23 single units in electrode 3, for all 10 trials within each session and for each of the five phonemes (Bead, Chin, Go, Judge, Fine). The data were tabulated and the commonest scores (1 to 5) were noted for each single unit. These data formed a pattern of unique activity across all single units. A proprietary software program was then used to detect these patterns of single unit firing. The paradigm involved searching for average activity prior to and following the burst of activity. The amount of time allotted for this prior and post burst calculation was a time bin of 50, 20, 10, 5 or 1 ms. Results are described in the Results section below.
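A literal encoding of this scoring scheme (with 0 denoting no burst, as later defined in Table 3) is sketched below. This is an illustrative reconstruction, not the proprietary program; the function name and the representation of modulation directions are assumptions.

    % Illustrative mapping from a unit's sequence of modulation directions to the
    % burst score defined above (+1 = increase, -1 = decrease, in order of occurrence).
    function s = burstScore(dirs)
        if isempty(dirs)
            s = 0;                               % no burst / no modulation
        elseif isequal(dirs, 1) || isequal(dirs, -1)
            s = 1;                               % a single increase or decrease
        elseif isequal(dirs, [1 -1])
            s = 2;                               % increase followed by decrease
        elseif isequal(dirs, [-1 1])
            s = 3;                               % decrease followed by increase
        elseif isequal(dirs, [1 -1 1])
            s = 4;                               % increase, decrease, increase
        elseif isequal(dirs, [-1 1 -1])
            s = 5;                               % decrease, increase, decrease
        else
            s = NaN;                             % pattern outside the defined set
        end
    end
    % Example: burstScore([1 -1]) returns 2 (an increase followed by a decrease).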

Neural Net Analysis

A simple Fitting architecture was used to map from a set of inputs to a corresponding set of outputs (Fitting function in Matlab; The MathWorks, Natick, MA). In the present usage, the inputs are patterns of neural bursts and the outputs are known or estimated patterns of neural bursts. The standard neural network architecture for fitting problems is a multilayer perceptron. The Tansig transfer function is used in the hidden layers, each of which outputs to the next layer. The advantage of the Tansig function is that it is centered around zero and not always positive, since its values vary between -1 and 1. The alternative Logsig transfer function is always positive, and this would lose half the data because its values vary between 0 and 1. One or 2 hidden layers, and sometimes 10, are usually all that is required; the number of layers is chosen by trial and error. Thus, in our application, the initial step is to determine the likely single unit firing pattern by examining the burst patterns of the single units as described above. This typical pattern is designated as the target output. Standard patterns of firing for each word or phrase are then catalogued (which can be viewed as a ‘look up table’) and used for speech production as new neural firing inputs are transferred through the system. The algorithms used in the Fitting app were the standard Levenberg-Marquardt, Bayesian Regularization and Scaled Conjugate Gradient.
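For readers who prefer the command line to the Fitting app, the same multilayer-perceptron workflow can be set up with Matlab's fitnet (Deep Learning / Neural Network Toolbox). The sketch below uses synthetic stand-in data; the matrix sizes and variable names are assumptions, not values from the study.

    % Command-line counterpart of the Fitting-app workflow described above.
    inputs  = randi([0 5], 14, 100);              % stand-in: burst codes for 14 units x 100 observations
    targets = inputs + 0.1*randn(size(inputs));   % stand-in target burst patterns

    net = fitnet(2, 'trainlm');                   % 2 hidden neurons, Levenberg-Marquardt
                                                  % ('trainbr' = Bayesian Regularization,
                                                  %  'trainscg' = Scaled Conjugate Gradient)
    net.layers{1}.transferFcn = 'tansig';         % Tansig hidden layer, as in the text
    net = train(net, inputs, targets);

    outputs = net(inputs);                        % run the trained net on burst-pattern inputs
    Rm = corrcoef(targets(:), outputs(:));        % overall regression, analogous to the app's R
    R  = Rm(1, 2);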

RESULTS

Characteristics of Units

Over 36 sessions in an 11 week period before removal of the electrode wires and electronics, recordings were made from one or concurrently from two electrodes, revealing a total of 65 single units consisting of 21, 21 and 23 single units from electrodes 1, 2 and 3 respectively. The spike sort parameter file was not changed from one recording session to another. Nor was it changed during off-line analysis. Examples are shown in Figure 3A. To demonstrate that units were from single axons rather than multiple axons, interval histograms are used to indicate that single units will have a short gap between firings as shown in Figure 3B.

Figure 3. Characteristics of single unit data and the paradigm. A: Examples of single units cluster cut as just described. There are upper deflections and lower deflections from one channel of multi-unit data. In addition, the overall amplitudes are low. B: On day 1556 after implantation in patient ER, Inter Spike Interval Histograms (ISIHs) were constructed showing the ‘gap’ in firing rate close to zero time. This implies that the firing unit is single because two or more units would fill in this gap. This conclusion is valid for moderate to high firing units in the lower three rows, but not for slow firing units as shown in the two top rows. C: Examples of the gap in firing indicating that units are single from single axons. D: The figure shows 15 seconds of firing frequency with the large units firing the slowest and the smallest firing rapidly. Data in these Figures (3A-D) are from subject ER.

Note that the non-trigger (or dead) time is 264 microseconds during which time the system cannot trigger another timestamp. This is implemented to avoid spikes too close together that could not realistically be distinguished. This has significant implications for interpretation of Inter Spike Interval Histograms (ISIHs). With the re-trigger time (or dead time) being a quarter ms, there will be a necessary quarter ms initial gap in the ISIH. Thus any gap more than a quarter ms implies a single unit. Examples of several single units are shown in Figure 3C. Note that low firing units will have gaps due to the low rate, thus invalidating the analysis for low firing units. The above analysis suggests that 65 single units were detected and confirmed to be highly likely single units recorded from the three implanted electrodes. Another finding in all patients, including other humans and monkeys already reported [3, 4, 13], was the difference in firing rate from zero or one Hz for the largest amplitude units to rates near 100 Hz for small amplitude units, as shown in Figure 3D. Whereas the low amplitude of some units could be explained by their being far away from the recording wires within the cone, this would not explain their fast rates. Our interpretation is that the small, fast units are axons grown from interneurons and the large slow firing units are axons grown from Betz cells that are projecting down the corticospinal tract. Histological analysis has shown no neurons inside the cone, so these signals must come from visualized axons [1, 2].

Impedance

Measurements in both humans (ER, PK) during surgery to replace the electronics revealed impedances between the pairs of wires of 70 to 100 kOhms measured at 1 kHz. These are similar to measurements made previously in similarly implanted non-human primates [16].

Figure 4. Motor and sensory relationships to single units. A: Motor modulations. The wide range of movements is plotted against the number of units recorded from electrode 3. ‘Baseline’ indicates unit firings, if any, during no movements. B: Modulations to light touch. Light touch with cotton tip indicating areas touched and the number of related units. Examination performed by an assistant. C: Modulations to pin prick. Pin prick data indicating areas touched and the number of related units. Examination performed by an assistant. All data in these figures are from subject PK.

Single unit modulations during oropharyngeal movements: The movements consisted of jaw movements (up, down, right, left, in, and out), cheek movements (smile, pout, move right cheek, move left cheek), and tongue movements (in, out, roof front, roof back, floor front, floor back, touch outer gum, touch lower gum, curl inside, curl outside mouth, push into right cheek and push into left cheek). Laryngeal movements (down, up, and swallow) were also attempted but they were the least isolated movements. All articulations are shown in Figure 4A with respect to the number of units activated by those movements. Baseline recordings at rest consisting of no movements are also shown. These data indicate that we are most likely in the correct cortical area which is the articulatory motor area.

Single Unit Modulations to Sensory Stimulation

Single unit modulations to light touch and pin prick were also tested, with an assistant performing the testing. The results are shown in Figures 4B and 4C. These results confirm that the implant area target was correct. However, the responses to sharp and light touch involved the eye and ears, which seems incongruous but is explained in the discussion section.

Topographic Relationship between Single Units and Phonemes

There were differences between electrodes and related single units as shown in Figure 5A. The phonemes (listed in Table 1) are on the X axis with the vowels to the left and consonants to the right. The number of related single units is on the Y axis. Note the highest modulations with electrode 3, the most lateral electrode. In addition, electrode 1 has a preponderance of vowel relationships whereas electrode 3 has a preponderance of consonant relationships. Note also that, when testing for covertly ‘spoken’ phonemes, there were some units that appear to be modulated by both overt and covert ‘speech’ (‘Doubles’ as shown in Figure 5B).

Figure 5. Phoneme relationships to single units. A: This figure shows the number of single units per phoneme during overt speech. The 15 vowel phonemes are on the left of the X axis and the 24 consonants on the right. B: This figure shows the number of units related to phonemes during both overt and covert speech (number of doubles) for the three electrode recordings. All data in these figures are from subject PK.

Table 1. Consonant phonemes and vowel phonemes

Using Single Units to Detect Phonemes

Analysis was performed on the five phonemes with the most single unit activity during overt speech. The highest scoring parameters were in 10 ms time bins (see Methods). Baseline averaging was performed using 100 ms of data prior to and 100 ms after the onset of the burst. Different single units were modulated by overtly speaking different phonemes, as shown in Table 2. Note that the different phonemes (bead, go, chin, judge, fine) produced different patterns of single unit firings. The proprietary software program was successful in detecting these differences (shown next) for

each electrode even though the differences were slight, as can be seen in Table 1.

Table 2. To simplify the comparative display, single units are renamed from 1 to 14

Active Single Units (renamed)

Detecting phoneme Bead is exemplified in Figures 6A, B and C using units from electrodes 1, 2 and 3. Using our in-house software program (Unipho), single units from Electrode 1 identified Bead (‘eeh’ blue bars) in 9 of 10 trials at the 200 ms mark (and 7 out of 10 at the 100 ms mark). Electrode 2, however, identified Bead in only 5 out of 10 trials at the 200 ms mark, and Electrode 3 identified Bead in only 1 out of 10 trials at the 200 ms mark. Thus Bead was identified best using single units from electrode 1. Another example: single units from electrode 2 identified phoneme Fine (‘feh’ green bars) in 10 out of 10 trials at the 50 ms mark and 9 out of 10 trials at the 100 ms mark, as shown in Figure 6B. But Electrode 3 identified Fine in only 3 out of 10 trials at the 100 ms mark (Figure 6C). Thus, single units from different electrodes can reliably identify different phonemes. This is remarkable given that the electrode placements are closely spaced, only 6 mm apart. This identification alludes to the topographic arrangement of phoneme representation demonstrated in the ECOG data from Chang’s lab [30]. Most important is the finding that identification of the phonemes occurred within 100 to 200 ms after speech onset. This is fast enough for speech to be at a near-conversational rate, in other words, not slurred.

Figure 6. Phoneme identification using proprietary software program. A: Phoneme identification using single units from electrode 1. The Y axis shows number of trials and the X axis shows the time (ms) after onset of speech. B: Phoneme identification using single units from electrode 2. Ditto. C: Phoneme identification using single units from electrode 3. Ditto.

Overt and Covert Speech

Despite success with the Unipho software paradigm in identifying phonemes, the program has a serious drawback. The major problem is that even though the time interval required to perform the identification was within the range of normal speech, namely 200 ms, it required extensive and time-consuming off-line analysis, which renders it useless for online, near-real-time identification of phonemes, words and phrases. An alternative is the use of artificial neural nets. This approach was evaluated,

using the Neural Net Fitting program from Matlab as an example. This program was used with averaged burst data, individual burst data and single unit data. Burst data are defined above. This analysis suggests that individual burst data are the optimal form of input data.

Averaged Burst Data

The example used in the following analyses is “HELLO WORLD” spoken overtly and covertly. The average burst patterns of firing are shown and defined in Table 3. These averaged patterns (0 to 5 firing patterns) are designated the targets for the Fitting app. For testing purposes, the inputs were (i) randomly distributed numbers from the target pattern, (ii) randomly distributed numbers that are not identical to the target numbers (such as 2.5 instead of 2, for example), (iii) very different numbers randomly distributed and (iv) as the ultimate control, the same number repeated (such as 9, 9, 9, etc.). In all cases except (iv), after 1 or 2 hidden layers were used, the regression was R = 1 and the error histograms were also acceptable, as shown in Figure 7 for overt (left upper panel) and covert speech (right upper panel). The R = 1 (left lower panel) for covert data analysis. In the control case (iv), the R hovered around 0.84 to 0.86 even when 10 hidden layers were used (right lower panel). Note that there was no variation in the percentage of data presented for training (70%), validation (15%) and testing (15%). During the control run the same parameters were used; then, to improve the result, we used not only 10 hidden layers but also increased the training set to 80%, and decreased validation to 10% and testing to 10%. The R was still only 0.86. Even with other manipulations of these percentages no improvement was seen in the result. During covert speech, the Beta peak (12 to 20 Hz in the frequency domain) was used as the guide to the covert speech onset. The Levenberg-Marquardt results were essentially similar to those of the Bayesian Regularization and Scaled Conjugate Gradient software engines used in the Fitting program.
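The fixed 70/15/15 division (and the 80/10/10 control) can be set explicitly when the network is built at the command line rather than in the Fitting app. A minimal sketch, assuming the fitnet object from the earlier example:

    % Data division used above: 70% training, 15% validation, 15% testing.
    net = fitnet(2, 'trainlm');                % as in the earlier sketch
    net.divideFcn = 'dividerand';              % random division of observations
    net.divideParam.trainRatio = 0.70;
    net.divideParam.valRatio   = 0.15;
    net.divideParam.testRatio  = 0.15;

    % Control run with more training data, as described in the text:
    % net.divideParam.trainRatio = 0.80;
    % net.divideParam.valRatio   = 0.10;
    % net.divideParam.testRatio  = 0.10;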

Table 3. Burst patterns during overt and covert speech

Average burst patterns: 0 = no burst; 1 = an increase or decrease of firing from baseline; 2 = an increase followed by a decrease to baseline; 3 = a decrease to baseline followed by an increase; 4 = an increase followed by a decrease followed by an increase; 5 = a decrease followed by an increase followed by a decrease. The word ‘WORLD’ has a more similar burst pattern for overt and covert speech than ‘HEL’ or ‘LO’.

Figure 7. Error histograms and regression analyses of overt and covert speech using the Fitting app. Left upper panel: Error histogram for overt data. Data are for averaged bursts as defined in Table 3. Right upper panel: Error histogram for covert data. Note the different Y axis scales. Left lower panel: Example of regression analysis for Levenberg-Marquardt training on covert data showing R = 1. Right lower panel: Control data with R = 0.866 which is unacceptable compared to an R = 1 for other parameters.

Individual Burst Data

Averaged burst data is obviously not realistic when it comes to tracking covert speech at a near-conversational rate. Individual burst data in covert speech has to be analyzed. Data for the same phrase “HELLO WORLD” was used. To set up the Target data one covert trial of the phrase was used. There were 10 trials and whichever one seemed most typical and most consistent with the other trials was chosen as the Target data. The results for the Levenberg-Marquardt (LM) and Bayesian Regularization (BR) used in the Neural Net Fitting program are shown in Figures 8 A & B when 9 trials were individually compared to trial 2.

Figure 8. Regression analyses of covert speech trials using the Fitting app. Fitting neural net app using Levenberg-Marquardt and Bayesian Regularization paradigms to test the regression of the input to the target data. Data are for individual bursts of covert data.

Figure 9. A and B: Error histograms for control and test trials during covert speech. Error histograms are shown for covert phrase control in A (target versus target) and one trial 1 v 2 (data versus target) in B.

There were 27 bursts in each trial and beta peaks (detected by using FFT of the continuous data stream) were used to trigger the analysis starting point for the covert speech data. The R values vary as expected, with some reaching 0.8 to 0.9. The Scaled Conjugate Gradient paradigm produced lower R values so it is not shown. The values depicted are optimal values, since retraining can increase or decrease the R values. Also, the LM paradigm produced the most rapid analysis (once the neural net had been trained), with the program doing only 3 to 4 iterations and with timing measured at a few hundred ms (the system does not measure below 1 s so an external timing device was used). The BR paradigm was slower, on the order of about 1 second, though with retraining it became faster. As a control, the target was trained against itself (same input and target data) and produced R values close to 1. In Figure 9A and B, the error histograms are shown. A indicates the minimum error with the control (target versus target), and B shows the error when trial 1 is compared with the Target (trial 2). Thus, even for covert data, the phrase ‘Hello World’ can be detected using only single data streams that have been grouped into bursts according to Table 3.

Single Unit Data

The third step in determining the utility of data presentation to the Fitting program was to take raw single units, not bursts, and use trial 2 of covert data in HELLO WORLD as the target data (as above for the burst data analysis). Neural recording of HELLO WORLD took 2 seconds (followed by at least 1 sec of rest). So with the single units grouped into 5 ms time bins we used 400 bins for 2 seconds and used a beta peak from continuous data [14] as the starting point for the analysis of covert data. The target episode was first compared to itself as a control (R = 1 after one retraining episode, and the error histogram was optimized). Then it was compared to other episodes of covert speaking using the Fitting app. The same procedure was performed on overt data. The R values were in the low 0.4 to 0.5 range, which is not acceptable, as shown below.
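As an illustration of that binning step, the sketch below aligns one unit's spike timestamps to a beta-peak onset and counts spikes in 5 ms bins across the 2 s epoch (400 bins). The onset time and timestamps are placeholders; only the bin width, epoch length, and bin count come from the text.

    % Binning raw single-unit timestamps into 5 ms bins over the 2 s "HELLO WORLD" epoch.
    onset      = 1.30;                              % assumed beta-peak onset time (s)
    binWidth   = 0.005;                             % 5 ms bins
    edges      = onset : binWidth : onset + 2;      % 401 edges -> 400 bins spanning 2 s

    spikeTimes = onset + 2*rand(1, 150);            % placeholder timestamps for one unit (s)
    counts     = histcounts(spikeTimes, edges);     % 1 x 400 vector fed to the Fitting app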

Results of Individual Burst Data and Single Unit Data

The difference between regression values (R) of overt and covert speech for individual bursts was not significant (P