Pixel’s Gboard Now Has On-Device Voice Dictation

Published 2019-03-13

Gboard is a Google-powered keyboard for Android devices. Most of its features depend on machine learning, artificial intelligence and neural networks. One of those features is speech-to-text, or voice dictation. Until now, when you spoke to your device, it sent your voice to the cloud, where it was processed and converted into text, and the text was then sent back to your device. Now Google is changing that method.

What Has Changed?

Google has decided to switch to on-device machine learning because it offers constant availability and low latency. The company has announced that Gboard, its virtual keyboard app available on multiple platforms, will now use an end-to-end recognizer to power American English speech input on Google Pixel smartphones.

Image Source: VentureBeat

The tech giant disclosed that the new Gboard will have an all-neural, on-device speech recognizer that promises to improve speech recognition. The new recognizer uses an RNN-T (recurrent neural network transducer) that is compact enough to live on your phone. This means no network latency, because the transcription never goes over the network. And since it resides on your phone, the recognizer works without an internet connection as well.

The RNN-T was trained on second-generation TPUs (tensor processing units) in Google Cloud and can handle real-time transcription. Google also says that, thanks to the training technique it developed, the model is 5% less prone to mistaking words during transcription.
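How often a recognizer gets words wrong is commonly measured as a word error rate: the number of word substitutions, insertions and deletions needed to turn the recognizer's output into the correct sentence, divided by the length of the correct sentence. The Python sketch below shows that calculation. It is only an illustration, not Google's evaluation code; the function names, the sample sentences and the error rates in the last lines are made up for the example, and reading the 5% figure as a relative change is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance between the two sentences, divided by reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edits needed to turn the first j hypothesis words
    # into the reference words processed so far.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word missing from output
                            curr[j - 1] + 1,      # extra word in output
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(old_rate: float, new_rate: float) -> float:
    """Fraction by which the error rate dropped, relative to the old rate."""
    return (old_rate - new_rate) / old_rate


# One wrong word out of five (made-up sentences).
print(word_error_rate("turn on the kitchen lights", "turn on the chicken lights"))

# Made-up rates: going from 10% to 9.5% errors is a 5% relative improvement.
print(relative_improvement(0.10, 0.095))
```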

Google also mentioned that its enhanced Gboard voice recognition works at the character level. In other words, as you speak, the output appears character by character, as if someone were typing your words in real time.
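To picture what character-by-character output looks like, here is a toy Python sketch. It does not run a speech model: the list of symbols stands in for what a recognizer might emit for each slice of audio, and the empty string plays the role of a "nothing new yet" step, loosely modelled on the blank symbol that RNN-T models use. The function name and the sample input are hypothetical.

```python
from typing import Iterable, Iterator


def stream_characters(symbols: Iterable[str]) -> Iterator[str]:
    """Yield the transcript so far each time a new character arrives.

    An empty string means no new character was recognized for that
    slice of audio, so nothing is shown for it.
    """
    transcript = ""
    for symbol in symbols:
        if symbol == "":
            continue  # blank step: keep listening, display nothing new
        transcript += symbol
        yield transcript


# Made-up per-slice output for someone saying "hi you".
slices = ["h", "", "i", "", " ", "y", "o", "", "u"]
for partial in stream_characters(slices):
    print(partial)
```

Each printed line is one character longer than the last, which is the effect the keyboard shows while you dictate.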

What Does Google Have to Say?

Johan Schalkwyk, a Google Fellow on the company’s Speech Team, wrote in a blog post, “This means no more network latency or spottiness — the new recognizer is always available, even when you are offline. The model works at the character level, so that as you speak, it outputs words character-by-character, just as if someone was typing out what you say in real-time, and exactly as you’d expect from a keyboard dictation system.”

Schalkwyk also said, “Given the trends in the industry, with the convergence of specialized hardware and algorithmic improvements, we are hopeful that the techniques presented here can soon be adopted in more languages and across broader domains of application.”

When Is It Coming?

Google has announced that the all-neural, on-device Gboard speech recognizer is coming to all Pixel phones, in American English only for now. The company has also said it hopes to extend it to other languages soon.

What Are the Advantages?

The new technique eliminates the need for an internet connection. Earlier, you needed mobile data or Wi-Fi to send your voice to the cloud and get the text back. Offline processing also reduces latency and transcription errors.