
Microsoft Hits a Speech Recognition Milestone


The company's technology now equals human transcribers, and that could make your devices understand you better.

Daniel B. Kline


Aug 22, 2017 at 8:43AM

One of the biggest problems facing voice assistants like Amazon (NASDAQ: AMZN) Echo, Apple (NASDAQ: AAPL) Siri, and Microsoft (NASDAQ: MSFT) Cortana is their struggle with conversational speech recognition. It's one thing to speak slowly into your phone, device, or computer to ask something simple. Speaking conversationally, as you normally do, and having your artificial intelligence (AI) voice assistant help is something else entirely.
Microsoft has been one of the companies trying to improve the language ability of AI-powered devices. Last year, it matched the error rate of human transcribers on a standard benchmark known as Switchboard. Now the company has improved its speech recognition further, matching human performance under an even tougher comparison with professional transcribers.

Improving the machines

Last year, Microsoft reported that its transcription system had reached the 5.9% word error rate it measured for human transcribers. A second group of researchers, using "a more involved multi-transcriber process," measured a 5.1% error rate for humans, according to an Aug. 20 blog post by Microsoft Technical Fellow Xuedong Huang.
Microsoft's software has now equaled those results, according to the company. Huang gave some background:
We reduced our error rate by about 12% compared to last year's accuracy level, using a series of improvements to our neural net-based acoustic and language models. We introduced an additional CNN-BLSTM (convolutional neural network combined with bidirectional long-short-term memory) model for improved acoustic modeling. Additionally, our approach to combine predictions from multiple acoustic models now does so at both the frame/senone and word levels.
Microsoft used Switchboard, a collection of recorded telephone conversations that has been used by the speech recognition community for more than two decades, as its benchmark. Tests using the system involve transcribing conversations between strangers discussing topics such as sports and politics.
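The word error rate figures quoted above are the standard metric for benchmarks like Switchboard: the word-level edit distance (insertions, deletions, and substitutions) between a system's transcript and a reference transcript, divided by the number of reference words. (Note that the "about 12%" improvement is relative: 5.9% × (1 − 0.12) ≈ 5.2%, close to the reported 5.1%.) As a minimal sketch, not Microsoft's scoring pipeline, the metric can be computed like this:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: edit distance between word sequences
    divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = minimum edits to turn the first i reference words
    # into the first j hypothesis words (classic Levenshtein DP).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

# One deleted word out of six reference words -> WER of 1/6.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```

A 5.1% WER thus means roughly one word in twenty is wrong relative to the human reference transcript, which is why tying the human transcribers' own error rate is treated as a milestone.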
