Microsoft Hits a Speech Recognition Milestone
The company's technology now equals human transcribers and that might make your devices understand you better.
Daniel B. Kline
(TMFDankline)
Aug 22, 2017 at 8:43AM
One of the biggest problems facing voice assistants like Amazon's (NASDAQ: AMZN) Echo, Apple's (NASDAQ: AAPL) Siri, and Microsoft's (NASDAQ: MSFT) Cortana is their struggle with conversational speech recognition. It's one thing to speak slowly into your phone, device, or computer to ask something simple. Speaking conversationally, as you normally would, and having your artificial intelligence (AI) voice assistant help is something else entirely.
Microsoft has been one of the companies working to improve the language abilities of AI-powered devices. Last year, it achieved the same error rate as human transcribers on a standard benchmark known as Switchboard. Now the company has improved its speech recognition further, matching human performance under an even tougher method of comparison with professional transcribers.
Improving the machines
Last year, Microsoft reported that its transcription system reached a 5.9% word error rate, the level the company had determined human transcribers achieve. A second group of researchers, using "a more involved multi-transcriber process," measured a 5.1% error rate for humans, according to an Aug. 20 blog post by Microsoft Technical Fellow Xuedong Huang.
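For context, "word error rate" is the standard yardstick in speech recognition: the number of word substitutions, deletions, and insertions needed to turn the system's transcript into the reference transcript, divided by the number of words in the reference. A minimal sketch of how it is typically computed (the function name is illustrative; production scoring tools also handle text normalization, which is omitted here):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed via word-level Levenshtein (edit) distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn hyp[:j] into ref[:i]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost)  # match or substitution
    return dp[len(ref)][len(hyp)] / len(ref)

# One word dropped out of six: WER = 1/6, or about 16.7%
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

A 5.1% rate thus means roughly one word in twenty is wrong, which is about what careful human transcribers score on the same recordings.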
Microsoft's software has now equaled those results, according to the company. Huang gave some background:
We reduced our error rate by about 12% compared to last year's accuracy level, using a series of improvements to our neural net-based acoustic and language models. We introduced an additional CNN-BLSTM (convolutional neural network combined with bidirectional long-short-term memory) model for improved acoustic modeling. Additionally, our approach to combine predictions from multiple acoustic models now does so at both the frame/senone and word levels.
Microsoft used Switchboard, a collection of recorded telephone conversations that has been used by the speech recognition community for more than two decades, as its benchmark. Tests using the system involve transcribing conversations between strangers discussing topics such as sports and politics.
https://www.fool.com/investing/2017/08/22/microsoft-hits-a-speech-recognition-milestone.aspx