Amazon’s Alexa voice assistant faces a massive challenge: it has to work not only as a multilingual product, but also understand every regional variant of the languages it supports.
To accomplish that, Alexa has been retrained entirely for every variant needed, a time- and resource-heavy activity. But a new machine learning-based method for training speech recognition, created by Alexa’s AI team, could mean far less rework when building models for new variants of existing languages.
In a paper presented to the North American Chapter of the Association for Computational Linguistics, Amazon Alexa AI Senior Applied Science Manager Young-Bum Kim and his colleagues laid out a new system that demonstrated accuracy improvements of 18 percent, 43 percent, 115 percent and 57 percent, respectively, on the four variants of English (from the U.S., the U.K., India and Canada) used in the trial.
The team managed this by giving its learning algorithm a way to adjust where it focuses. The algorithm leans more heavily on a locale-specific model when it knows in advance that answers to requests in a given domain are highly region-specific (e.g., when a user asks for a good nearby restaurant), and less heavily when the results will be relatively similar regardless of where the request is made.
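As a rough illustration of that weighting idea, the toy Python sketch below turns a "this domain is region-dependent" signal into attention weights over a locale-specific model and a shared one. It is not the team's implementation: the function names, the `locale_sensitive` flag and the `bias` value are all assumptions made for this example, and a real system would learn such weights rather than hard-code them.

```python
import math


def softmax(scores):
    """Convert raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encoder_weights(locale_sensitive, bias=2.0):
    """Return (locale_specific_weight, shared_weight).

    `locale_sensitive` and `bias` are hypothetical stand-ins for whatever
    signal a real system would use to mark a domain as region-dependent.
    """
    specific_score = bias if locale_sensitive else 0.0
    shared_score = 0.0
    specific, shared = softmax([specific_score, shared_score])
    return specific, shared


# A restaurant request is treated as region-dependent, a generic one is not.
restaurant = encoder_weights(locale_sensitive=True)
generic = encoder_weights(locale_sensitive=False)
```

In this sketch the restaurant request puts most of its weight on the locale-specific model, while the generic request splits its weight evenly between the two.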
Alexa’s team then combined its locale-specific models into a single model, added the location-independent model for the language, and found the improvements cited above.
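One way to picture that combination is a single object that holds every locale-specific representation alongside one shared representation and blends the two per request. The Python sketch below is only a toy under stated assumptions: `CombinedModel`, `mix`, the locale codes and the two-number vectors are invented for illustration, whereas the real models are trained networks whose details are not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List

Vector = List[float]


def mix(specific: Vector, shared: Vector, weight: float) -> Vector:
    """Blend two equal-length vectors; `weight` goes to the locale-specific one."""
    return [weight * s + (1.0 - weight) * g for s, g in zip(specific, shared)]


@dataclass
class CombinedModel:
    """Hypothetical container: one shared vector plus one vector per locale."""
    shared: Vector
    per_locale: Dict[str, Vector] = field(default_factory=dict)

    def represent(self, locale: str, specific_weight: float) -> Vector:
        # Fall back to the shared vector for a locale with no model of its own.
        specific = self.per_locale.get(locale, self.shared)
        return mix(specific, self.shared, specific_weight)


model = CombinedModel(
    shared=[1.0, 1.0],
    per_locale={"en-US": [3.0, 0.0], "en-IN": [0.0, 3.0]},
)
india_half = model.represent("en-IN", specific_weight=0.5)
unknown_locale = model.represent("en-AU", specific_weight=0.25)
```

The fallback for an unseen locale reflects the article's point about reuse: a new variant starts from the common base and only needs its own model where regional differences matter.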
Basically, this means the team can save work by building on a common base and adding differentiation only where the answers Alexa should give change significantly from region to region, which should make Alexa smarter, faster and more linguistically flexible over time.