satyamkapoor

I work at ValueFirst Digital Media Private Ltd as a Product Marketer on the Surbo team. Surbo is a chatbot generator platform owned by ValueFirst.


Artificial Intelligence will soon have a voice like humans

Jan 4, 2018 | 5862 Views

DeepMind, the AI lab owned by Google's parent company Alphabet, developed a synthetic speech system called WaveNet back in 2016. The system runs on an artificial neural network and generates speech samples that sound far better than those produced by earlier technologies; the AI voice is becoming more human-like. WaveNet has improved since then and is now good enough to power Google Assistant across all platforms.
According to a Google paper that is still under peer review, WaveNet is being paired with a new text-to-speech system called Tacotron 2, effectively the second generation of Google's synthetic speech AI. The new system combines the deep neural networks of Tacotron 2 and WaveNet.

First, Tacotron 2 translates text into a spectrogram, a visual representation of audio frequencies over time. The spectrogram is then fed into WaveNet, which reads it and synthesizes the corresponding audio waveform.
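The two-stage flow described above can be sketched as follows. This is not Google's implementation; both stages are stubs with illustrative constants, included only to show the shapes of the data handed from the text-to-spectrogram network to the WaveNet vocoder.

```python
import numpy as np

N_MELS = 80          # mel frequency bands per spectrogram frame (Tacotron 2 uses 80)
FRAMES_PER_CHAR = 5  # rough spectrogram frames per input character (illustrative)
HOP = 256            # waveform samples synthesized per frame (illustrative)

def text_to_spectrogram(text: str) -> np.ndarray:
    """Stub for the Tacotron 2 sequence model: text -> mel spectrogram.
    The real model is a deep neural network; here we only produce an
    appropriately shaped array of frames x mel bands."""
    n_frames = len(text) * FRAMES_PER_CHAR
    return np.zeros((n_frames, N_MELS))

def wavenet_vocoder(spectrogram: np.ndarray) -> np.ndarray:
    """Stub for the WaveNet vocoder: mel spectrogram -> audio waveform.
    Each spectrogram frame conditions the generation of HOP audio samples."""
    n_frames = spectrogram.shape[0]
    return np.zeros(n_frames * HOP)

spec = text_to_spectrogram("hello world")   # 11 chars -> 55 frames x 80 bands
audio = wavenet_vocoder(spec)               # 55 frames -> 14080 samples
print(spec.shape, audio.shape)              # (55, 80) (14080,)
```

The key design point the sketch illustrates is the clean interface: the spectrogram is the only thing passed between the two networks, which is what lets WaveNet act as a drop-in vocoder.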

According to the study, the "model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech." Simply put, it sounds very much like a person speaking.
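For context, a mean opinion score is simply the arithmetic mean of subjective listener ratings on a 1 (bad) to 5 (excellent) scale. A minimal sketch, with ratings that are made up purely for illustration:

```python
# Hypothetical listener ratings on the standard 1-5 opinion scale.
# These numbers are fabricated for illustration, not from the study.
ratings = [5, 4, 5, 5, 4, 5, 4, 5, 4, 4]

# The MOS is just the average rating across listeners and samples.
mos = sum(ratings) / len(ratings)
print(mos)  # 4.5
```

A score like Tacotron 2's 4.53 therefore means listeners, on average, rated its speech almost as highly as professional recordings (4.58).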

In fact, Google put recordings of a human and its new AI side by side, and it is difficult to tell which is the person and which is the machine.
AI systems have steadily gotten better at blurring the line between human and machine. There are now AIs capable of generating images of people who do not exist but look real, and another AI can even fabricate fake videos. Nor can one discount the fact that some AIs are getting better at storytelling and at making art.

Mimicking human speech has always been a challenge for AI networks. Now, DeepMind's WaveNet and Tacotron 2 seem to be changing that, and at quite an impressive rate. Not only does the AI pronounce words clearly, it also seems to handle difficult-to-pronounce words and names, and to place emphasis on the appropriate words based on punctuation.
This does not mean the new system is perfect. One should keep in mind that the current iteration has been trained on only one voice, recorded from a woman Google hired; for the system to work with other voices, it will have to be retrained.
Besides its immediate application in Google Assistant, once Tacotron 2 is perfected the technology could be applied to other areas. It may also take over certain jobs, as other applications of AI are doing.

Source: HOB