Researchers examine the significance of clear pronunciation when utilizing speech-to-text technology in medical contexts.
Speech-to-text technology is increasingly used for everyday tasks such as hands-free dictation, assisting people with visual impairments, and generating transcriptions for people who are hard of hearing. Bożena Kostek, a researcher at Gdańsk University of Technology, is investigating how speech-to-text (STT) can be used more effectively in healthcare. By analyzing how clear articulation affects STT accuracy, she aims to make the technology more useful for medical professionals.
“Automating the process of taking notes for patient information is essential for physicians and radiologists because it enables them to spend more quality time with patients and enhances data gathering,” explains Kostek.
She also describes the obstacles the research faces.
“STT systems often encounter difficulties with medical terminology, particularly in Polish, as many models have been mainly trained using English data. Furthermore, most resources cater to basic language usage and not specialized medical jargon. Noisy environments in hospitals compound the issue, as healthcare workers might not articulate clearly due to stress or various distractions,” she added.
To address these challenges, a comprehensive audio dataset was developed, featuring Polish medical terminology articulated by doctors and specialists across fields such as cardiology and pulmonology. This dataset underwent analysis with an Automatic Speech Recognition (ASR) model, which transforms spoken language into written text. Various metrics, including Word Error Rate and Character Error Rate, were employed to assess the efficacy of the speech recognition process. This analysis provides insights into how the clarity and style of speech influence STT accuracy.
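As a rough illustration of the two metrics named above, the sketch below computes Word Error Rate and Character Error Rate from an edit distance between a reference transcript and an ASR output. It is a minimal sketch, not the researchers' evaluation pipeline: the function names and the Polish example phrase are invented for illustration, and the study's text normalization (casing, punctuation, abbreviations) is not described in the source, so none is applied here.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences.

    Counts the minimum number of substitutions, insertions, and
    deletions needed to turn `ref` into `hyp`.
    """
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    # Hypothetical example: the recognizer drops one letter in the last word.
    reference = "zapalenie płuc i niewydolność serca"
    hypothesis = "zapalenie płuc i niewydolność sera"
    print("WER:", word_error_rate(reference, hypothesis))
    print("CER:", character_error_rate(reference, hypothesis))
```

In this made-up example one of the five reference words is wrong, so the WER is 0.2, while the CER is far lower because only a single character differs. That gap is one reason to report both: a small slip in a long medical term counts as a whole word error under WER but as a minor error under CER.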
Kostek is scheduled to share these findings on Thursday, Nov. 21, at 3:25 p.m. ET during the virtual 187th Meeting of the Acoustical Society of America, taking place from Nov. 18-22, 2024.
“Medical terminology can be challenging, particularly with abbreviations that vary between different specializations. This task becomes even more complicated when considering realistic hospital scenarios, where the acoustics of the room are not optimized,” Kostek noted.
The research currently focuses on Polish, but the team plans to extend the study to other languages, including Czech. Partnerships are being formed with the University Hospital in Brno to create resources for medical terminology, with the goal of improving the use of STT technology in healthcare.
“While artificial intelligence is beneficial in numerous circumstances, it is essential to investigate many issues analytically rather than taking a broad approach, emphasizing the division of the overarching picture into distinct elements,” Kostek said.