Listening to people with Parkinson’s disease made an automatic speech recognizer 30% more accurate, a recent study finds.
During his analysis of data for his latest research, Mark Hasegawa-Johnson stumbled upon a delightful surprise—a recipe for Eggs Florentine. He noted that sifting through hundreds of hours of speech recordings often yields unexpected treasures.
Hasegawa-Johnson is at the helm of the Speech Accessibility Project, a University of Illinois Urbana-Champaign initiative aimed at improving voice recognition technologies for individuals with speech disabilities.
In the project’s inaugural study, researchers had an automatic speech recognizer listen to 151 hours (more than six days) of recordings from people with speech disabilities related to Parkinson’s disease. The resulting model transcribed a new set of similar recordings 30% more accurately than a control model that had not been trained on speech from people with Parkinson’s.
This research has been published in the Journal of Speech, Language, and Hearing Research. The speech recordings gathered during the study are openly accessible to researchers, nonprofits, and tech companies seeking to enhance their voice recognition systems.
“Our findings indicate that a comprehensive database of atypical speech can markedly enhance speech technologies for people with disabilities,” stated Hasegawa-Johnson, who is also a professor of electrical and computer engineering at Illinois and a researcher at the university’s Beckman Institute for Advanced Science and Technology, where the initiative is based. “I am eager to see how other organizations will utilize this data to make voice recognition technology more inclusive.”
Devices such as smartphones and virtual assistants rely on automatic speech recognition to interpret spoken language, allowing users to create playlists, send hands-free messages, participate in virtual meetings, and communicate effectively with friends and family.
However, voice recognition technology does not work equally well for everyone. It often struggles with the speech of people who have neuromotor disorders like Parkinson’s, which can cause a range of strained, slurred, or uncoordinated speech patterns collectively known as dysarthria.
“This unfortunate reality means that many individuals who would benefit the most from voice-controlled devices may find them the hardest to use effectively,” explained Hasegawa-Johnson.
“Existing research shows that training an ASR on a specific person’s voice can lead to better understanding. We wondered: Could we train an automatic speech recognizer to decipher the speech of individuals with dysarthria caused by Parkinson’s by exposing it to a small group of similar speakers?”
Hasegawa-Johnson and his team recruited about 250 adults with varying degrees of dysarthria related to Parkinson’s disease. Before joining the study, prospective participants were assessed for eligibility by a speech-language pathologist.
“Many individuals who have faced a communication disorder for an extended period, particularly a progressive one, may withdraw from everyday conversation,” said Clarion Mendes, one of the speech-language pathologists involved in the project. “They might feel discouraged from sharing their thoughts and ideas, believing their ability to communicate effectively is too compromised.”
“Those are the very individuals we’re aiming to help,” she added.
The selected participants recorded their voices using personal computers and smartphones. Working at their own pace, with help from a caregiver if needed, they repeated common voice commands like “Set an alarm,” read excerpts from books, and responded to open-ended prompts such as “Describe how to make breakfast for four people.”
In one instance, a participant detailed the process for preparing Eggs Florentine, complete with Hollandaise sauce, while another suggested simply ordering takeout.
“We’ve heard from many participants who found the experience enjoyable and felt it boosted their confidence in communicating with family once more,” Mendes reported. “This initiative has sparked hope, excitement, and energy—qualities that are fundamentally human—in many of our participants and their loved ones.”
She mentioned that the team collaborated with experts in Parkinson’s disease and community members to craft relevant content for participants. The prompts were intended to reflect daily life: for instance, adding medication names to the training data could improve communication with pharmacies, and casual conversation prompts mirrored everyday chat.
“We inform participants that while they can clarify their speech by exerting considerable effort, they may be exhausted from trying to be understood for others’ sake. We encourage them to relax and communicate as if they were casually chatting with family on the couch,” Mendes said.
To evaluate how well the speech algorithm listened and learned, the researchers divided the recordings into three sets. The first set, recordings from 190 participants totaling 151 hours of audio, was used to train the model. As performance improved, the researchers confirmed that the model was genuinely learning (rather than just memorizing responses) by testing it on a second, smaller set of recordings. Once the model performed well on the second set, it was evaluated on a third, held-out test set.
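For readers curious how such a three-way split might look in practice, here is a minimal sketch in Python. It assumes the split is made by speaker, so that no participant's voice appears in more than one set. Only the 190-participant training set size comes from the study. The speaker IDs, the validation set size, and the function name `split_by_speaker` are hypothetical, chosen purely for illustration.

```python
import random


def split_by_speaker(speaker_ids, n_train, n_dev, seed=0):
    """Partition speakers into disjoint train / validation / test groups."""
    ids = sorted(set(speaker_ids))
    if n_train + n_dev >= len(ids):
        raise ValueError("need at least one speaker left for the test set")
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    rng.shuffle(ids)
    train = set(ids[:n_train])
    dev = set(ids[n_train:n_train + n_dev])
    test = set(ids[n_train + n_dev:])
    return train, dev, test


# Hypothetical speaker IDs; the article reports "around 250" participants.
speakers = [f"spk{i:03d}" for i in range(250)]

# n_train=190 matches the article; n_dev=30 is an assumed value.
train, dev, test = split_by_speaker(speakers, n_train=190, n_dev=30)
print(len(train), len(dev), len(test))
```

Splitting by speaker rather than by recording means the test set measures how well the model handles voices it has never heard, which is the question the researchers set out to answer.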
The research team manually transcribed approximately 400 recordings for each participant to ensure the model’s accuracy.
After training on the first set, the ASR system transcribed the test-set recordings with a word error rate of 23.69%. By comparison, a model trained on recordings from people without Parkinson’s disease had a word error rate of 36.3%, making it roughly 30% less accurate.
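Word error rate is conventionally computed as the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the human reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition and one common way of comparing two error rates. It is not the study's own scoring code, the example sentences and function names are made up, and the researchers may have derived their 30% figure differently.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, improved):
    """Fraction by which `improved` lowers the `baseline` error rate."""
    return (baseline - improved) / baseline


# Made-up example: one dropped word and one wrong word out of five.
example_wer = word_error_rate("set an alarm for seven", "set alarm for eleven")
print(f"example WER: {example_wer:.0%}")

# The two error rates reported in the article.
reduction = relative_reduction(36.3, 23.69)
print(f"relative reduction: {reduction:.1%}")
```

By this measure, the drop from 36.3% to 23.69% is a relative reduction of about 35%, in the same range as the article's rounded figure of roughly 30%.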
Error rates decreased for nearly all individuals in the test set. Even those with atypical speech patterns associated with Parkinson’s, such as unusually fast speech or stuttering, showed slight improvements.
“I was thrilled to observe such a substantial benefit,” Hasegawa-Johnson commented.
His excitement is further supported by feedback from participants:
“I spoke with a participant who was enthusiastic about the future of this technology,” he noted. “That’s the remarkable aspect of this project: witnessing how hopeful individuals become about the potential that their smart devices and smartphones will understand them better. That’s truly our goal.”