Voice-To-Text Technology for Patients with Hearing Loss

Kader, Sara Esther AuD, CCC-A; Eckert, Anne M. AuD, MBA, CCC-A; Gural-Toth, Virginia AuD, CCC-A

doi: 10.1097/01.HJ.0000734212.09840.d7

Audiologists serve in several roles as health care providers, one of which is staying current with the ever-changing landscape of hearing aid technology. Consumer technology is also ever-changing, and interfacing these changes with hearing aid technology is an additional responsibility of audiologists. Integrating new consumer technologies can help patients decrease their activity limitations and promote their participation in various areas of life. One such technology is voice-to-text, which transcribes speech into written words. Let’s examine the benefits of voice-to-text technology, its historical context, and important clinical implications for its use, including sensitivity and privacy issues.


Voice-to-text technology emerged in 1952 and continues to evolve (see sidebar timeline). It was initially used for simple, daily tasks like verbal responses on telephone trees and dictation services. Early systems recognized only a limited vocabulary, usually spoken by a limited set of voices. As systems developed, speech prediction models improved their accuracy and speed; however, they still could not follow speech spoken at a natural rate, and many errors persisted.

In 2020, this technology is widely used by individuals working away from the traditional desktop who want to dictate notes or ideas. Even those working at a desk can find the technology helpful if they think or talk faster than they write. Most recently, Apple and Google have capitalized on cloud-based data to improve accuracy and speed, increasing user efficiency. Apps often save the transcription in cloud-based storage for easy access via multiple platforms. Some feature considerations include accuracy, shortcuts, languages available, and technical jargon capabilities. Among the many programs available, Dragon excels in transcription accuracy and is often chosen for professional applications. Google Assistant allows Google users to use voice commands and transcribe voice to text. Speechnotes is an Android transcription application (app) that allows for pauses to think and for punctuation. For shorter notes, Voice Notes is another option. Apple offers iTranslate Converse, which translates conversations in real time and recognizes 38 languages. Users looking to minimize keyboard use can try Evernote or Braina. These apps convert voice to text for any software program. Braina does particularly well with jargon and can be used on both Apple and Android devices. Call Recording by NoNotes can be used to record and transcribe phone calls. The technology has been primarily marketed for workplace use but also shows significant potential as a communication tool for individuals with hearing loss.

Specifically, several voice-to-text apps have emerged as particularly helpful to individuals with hearing loss. These apps essentially provide live captioning for conversations. Examples are Ava for iOS, Google’s Live Transcribe for Android, and Otter AI, which works with both platforms. In early 2020, a Live Transcribe app was also released for iOS; it is not a Google product but offers many of the same features. These apps enable hard-of-hearing individuals to access conversations by reading text provided by the app on their smartphones. Each app has minor advantages in display, accuracy, and ease of use. Consumers may benefit from comparing app features to determine which one works best for them.


Voice-to-text technology can be invaluable for individuals with hearing loss. Advances in hearing technology and cochlear implantation (CI) have provided aural/oral communication opportunities for patients who previously would have relied solely on visual modes of communication. Today, many patients, even those with unaided audiograms showing severe to profound hearing loss or poor word recognition scores, have conversations using audition. These patients may understand part or most of the conversation using audition and might use context clues or other auditory closure skills to improve understanding. However, other patients continue to rely on visual inputs to supplement hearing.

There are many patient profiles for whom voice-to-text technology could be helpful. Examples include patients who would qualify for a CI but are not yet implanted and patients who qualify for a CI but are not pursuing implantation due to contraindications or personal choice. Other patients generally hear “well enough” but want to be certain not to miss any words of speech in critical conversations.

Environmental factors can conspire to confound the patient who generally hears “well enough” but is having difficulty in a specific situation. Noisy situations like crowded restaurants leave many gaps that the patient must close. Darkness is another environmental factor that may prevent access to speechreading cues. Voice-to-text technology could be a valuable tool in these situations.

Historically, visual inputs, including sign language, speechreading cues, closed captioning, and Communication Access Realtime Translation (CART), have been used to supplement or replace hearing. However, each of these methods also comes with challenges. Some patients have difficulty hearing the television but are unable to use closed captioning because the captions may be too small or too far away. Voice-to-text technology could allow these patients to access captions that they can enlarge.

Patients who use a manual form of communication have the right to an interpreter in many situations. Nonetheless, there are times when there is no interpreter available and patients need to communicate with someone who does not sign. Using a voice-to-text app instead of an interpreter would work particularly well for short exchanges, for example in a retail establishment.

Individuals relying on speechreading have long reported their frustration in medical environments like surgical areas when masks prevented visualization of the speaker’s face. Amid the COVID-19 pandemic, the widespread use of masks in the general community presents a new obstacle for individuals relying on speechreading. This technology can serve an important role in improving communication at a time when access to speechreading is limited or, in many cases, eliminated.


Audiologists can incorporate these technologies into clinical practice in several ways. First, empower your patients. Technologies are constantly emerging. Patients may have tried a voice-to-text app previously and been disappointed. Let your patients know that it may be time to try again. As with many technologies, some patients need to get used to the idea of something new and accept that what has previously worked may no longer be effective. Demonstrating the product often speaks louder (pun intended) than an abstract discussion. Sometimes, it is the caregiver who needs the information or who is savvier in installing and starting to use the app. Adding this information to your toolbox of assistive listening technologies will enhance your image as a technology provider aware of the latest benefits available.


In counseling patients about this technology, careful consideration of the patient’s perspective on communication is critical. For individuals who are congenitally deaf and learned language through an auditory/aural approach, suggesting an app that relies on visual cues may be insulting, as their early training focused on avoiding visual cues. A discussion of areas of communication difficulty will set the stage for providing a solution that would be acceptable to the patient.

This technology certainly has many benefits, but there are some important factors to consider. First, consider patient privacy. Does the app save conversations in the cloud, and could patient health information be unwittingly shared? If information is being stored in the cloud, HIPAA requires a Business Associate Agreement (BAA) between the healthcare organization and the cloud service provider that stipulates compliance with HIPAA rules. A BAA provides legal protection for a patient’s private information. Is the data encrypted to protect a patient’s privacy? Some cloud services do not provide BAAs, and others don’t encrypt data.

What if the patient is using his or her own device to run the app? Some organizations have policies preventing patients from recording interactions with providers. Would this be considered a recording? Furthermore, how would the use of this tool during a visit be documented?

While it is beyond the scope of this article and beyond the expertise of these authors to provide legal advice, consultation with legal counsel before using this type of technology in the office should be strongly considered.

Voice-to-text technology has come of age. The technology is reliable, convenient, and inexpensive, allowing many hearing aid users access to conversations that were previously inaccessible. While some patients will use this type of technology regularly, others may use it only in specific situations. Audiologists are well poised to educate patients about the benefits and limitations of this new technology, but consultation with legal advisors is suggested to ensure compliance with privacy laws before using the technology during patient visits.

Copyright © 2021 Wolters Kluwer Health, Inc. All rights reserved.