A subdural neuroprosthesis using deep-learning and natural-language models successfully decoded words and sentences directly from cortical activity in a patient with anarthria and limb paralysis, researchers reported.
Anarthria, or the loss of the ability to articulate speech, hinders communication with friends, family, and caregivers, and thus has a major impact on quality of life among patients with paralysis. While advances have been made in the field of typing-based brain–computer interfaces that allow patients to type out messages via neural activity, the letter-by-letter selection required by these devices is slow and requires a great deal of effort. But what if a brain–computer interface could pull whole words directly from the brain areas responsible for controlling speech?
In this single-patient analysis, David A. Moses, PhD, of the department of neurological surgery at the Weill Institute for Neuroscience, and colleagues assessed real-time decoding of words and sentences from the cortical activity of a patient with spastic quadriparesis and anarthria caused by a brain-stem stroke.
Their findings were published in The New England Journal of Medicine.
“We showed that high-density recordings of cortical activity in the speech-production area of the sensorimotor cortex of an anarthric and paralyzed person can be used to decode full words and sentences in real time,” they wrote. “Our deep-learning models were able to use the participant’s neural activity to detect and classify his attempts to produce words from a 50-word set, and we could use these models, together with language-modeling techniques, to decode a variety of meaningful sentences. Our models, enabled by the long-term stability of recordings from the implanted device, could use data accumulated throughout the 81-week study period to improve decoding performance when evaluating data recorded near the end of the study.”
Previous research found that the decoding models in most brain–computer interface applications require daily recalibration before use. In this analysis, by contrast, “decoding performance was maintained or improved by the accumulation of large quantities of training data over time without daily recalibration, which suggests that high-density electrocorticography may be suitable for long-term direct-speech neuroprosthetic applications,” the authors wrote.
This analysis was part of the BCI Restoration of Arm and Voice (BRAVO) study, a single-institution clinical study designed to “evaluate the potential of electrocorticography, a method for recording neural activity from the cerebral cortex with the use of electrodes placed on the surface of the cerebral hemisphere, and custom decoding techniques to enable communication and mobility,” Moses and colleagues explained.
Thus far, the investigational device assessed in this study, which combines a high-density electrocorticography electrode array and a percutaneous connector, has been implanted in only one patient: a 36-year-old man who had had limb paralysis and anarthria for 16 years before the study. The patient’s cognition was intact, with a Mini-Mental State Examination score of 26 (scores range from 0 to 30, with higher scores indicating better mental performance).
The neuroprosthetic device’s electrode array was surgically implanted on the pial surface of the brain in the subdural space; the percutaneous connector was placed extra-cranially on the contralateral skull convexity and anchored to the cranium. “This percutaneous connector conducts cortical signals from the implanted electrode array through externally accessible contacts to a detachable digital link and cable, enabling transmission of the acquired brain activity to a computer,” the study authors wrote.
The study consisted of 50 sessions conducted over 81 weeks, during which the participant engaged in an isolated-word task and a sentence task:
- In the isolated-word task, he attempted to produce individual words from a set of 50 English words; in each trial, the patient was presented with a word, and then, after a two-second delay, he attempted to produce that word when the text of the word on the screen turned green. Moses and colleagues collected 22 hours of data from 9,800 trials of the isolated-word task in the first 48 of the 50 study sessions.
- In the sentence task, the participant tried to produce word sequences from a set of 50 English sentences (consisting of words from the 50-word set); in each trial, he was presented with a target sentence and attempted to produce those words in that order at the fastest speed he could comfortably perform. Moses and colleagues collected data from 250 trials of the sentence task in seven of the final eight study sessions.
The study authors used neural activity data from these tasks to “train, fine-tune, and evaluate” custom speech-detection and word-classification models that used deep-learning techniques to make predictions. They also used a natural-language model and a Viterbi decoder to decode sentences from the participant’s brain activity in real-time during the sentence task.
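To illustrate how a Viterbi decoder can combine word-classifier output with a language model, here is a minimal sketch. It is not the study authors' implementation: the bigram language model, the four-word vocabulary, the function names, and every probability below are assumptions invented for illustration, not data from the study.

```python
import math


def viterbi_decode(emissions, bigram, vocab, start_token="<s>", floor=1e-6):
    """Return the most likely word sequence under classifier + bigram scores.

    emissions: list of dicts, one per detected word attempt,
               mapping word -> classifier probability (hypothetical values).
    bigram:    dict mapping (previous_word, word) -> language-model probability.
    floor:     small probability used for any pair or word not listed.
    """
    if not emissions:
        return []

    # best[w] = (log-probability of the best path ending in w, that path)
    best = {}
    for w in vocab:
        lm = bigram.get((start_token, w), floor)
        em = emissions[0].get(w, floor)
        best[w] = (math.log(lm) + math.log(em), [w])

    for frame in emissions[1:]:
        new_best = {}
        for w in vocab:
            em = math.log(frame.get(w, floor))
            # Pick the predecessor word that maximizes path score + transition.
            score, path = max(
                (
                    (prev_score + math.log(bigram.get((p, w), floor)), prev_path)
                    for p, (prev_score, prev_path) in best.items()
                ),
                key=lambda t: t[0],
            )
            new_best[w] = (score + em, path + [w])
        best = new_best

    return max(best.values(), key=lambda t: t[0])[1]


def greedy_decode(emissions):
    """Baseline: take the classifier's top word for each attempt independently."""
    return [max(frame, key=frame.get) for frame in emissions]


# Toy example (all numbers invented for illustration).
vocab = ["i", "am", "thirsty", "family"]
emissions = [
    {"i": 0.70, "am": 0.10, "thirsty": 0.10, "family": 0.10},
    {"i": 0.10, "am": 0.60, "thirsty": 0.10, "family": 0.20},
    {"i": 0.05, "am": 0.05, "thirsty": 0.40, "family": 0.50},
]
bigram = {
    ("<s>", "i"): 0.9,
    ("i", "am"): 0.8,
    ("am", "thirsty"): 0.6,
    ("am", "family"): 0.01,
}

print(greedy_decode(emissions))
print(viterbi_decode(emissions, bigram, vocab))
```

In this toy case the classifier alone prefers "family" for the third word, while the language model makes "thirsty" far more likely after "am", so the Viterbi path corrects the error. This is the kind of gain the authors attribute to integrating linguistic information.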
“We decoded sentences from the participant’s cortical activity in real time at a median rate of 15.2 words per minute, with a median word error rate of 25.6%,” Moses and colleagues reported. “In post hoc analyses, we detected 98% of the attempts by the participant to produce individual words, and we classified words with 47.1% accuracy using cortical signals that were stable throughout the 81-week study period.”
The study authors noted that incorporating language-modeling techniques in their study “reduced the median word error rate by 35 percentage points and enabled perfect decoding in more than half the sentence trials… These results show the benefit of integrating linguistic information when decoding speech from neural recordings. Speech-decoding approaches generally become usable at word error rates below 30%, which suggests that our approach may be applicable in other clinical settings.”
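For readers unfamiliar with the metric, word error rate is conventionally computed as the word-level edit distance (substitutions, insertions, and deletions) between the decoded and target sentences, divided by the number of words in the target. The sketch below assumes that standard definition; the function name and the example sentences are hypothetical and are not taken from the study.

```python
import statistics


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference sentence must contain at least one word")

    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(
                min(
                    prev[j] + 1,         # deletion of a reference word
                    curr[j - 1] + 1,     # insertion of an extra decoded word
                    prev[j - 1] + cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical trials: (target sentence, decoded sentence).
trials = [
    ("i am thirsty", "i am thirsty"),
    ("i am thirsty", "i am family"),
    ("bring my glasses please", "bring glasses please"),
]
rates = [word_error_rate(target, decoded) for target, decoded in trials]
print(rates)
print(statistics.median(rates))
```

Taking the median across trials, as in the last line, mirrors how the article reports its summary figure; a perfectly decoded sentence contributes a rate of zero.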
John McKenna, Associate Editor, BreakingMED™
Moses reported grants from Facebook, grants from NIH, grants from William K. Bowes, Jr. Foundation, grants from Howard Hughes Medical Institute, and grants from Shuri and Kay Curci Foundation during the conduct of the study; grants from Facebook, outside the submitted work; and a patent for a method of contextual speech decoding from the brain (Application PCT/US2020/043706) pending.