Google AI can focus on individual speakers in a crowd

Published on:

13 Apr 2018, 6:30 pm

San Francisco, April 13: Just as most smartphone cameras now allow users to focus on a single object among many, it may soon be possible to pick out individual voices in a crowd by suppressing all other sounds, thanks to a new Artificial Intelligence (AI) system developed by Google researchers. This is an important development as computers as not as good as humans at focusing their attention on a particular person in a noisy environment. Known as the cocktail party effect, the capability to mentally “mute” all other voices and sounds comes natural to us humans.

In a new paper, the researchers presented a deep learning audio-visual model for isolating a single speech signal from a mixture of sounds such as other voices and background noise. The method works on ordinary videos with a single audio track, and all that is required from the user is to select the face of the person in the video they want to hear, or to have such a person be selected algorithmically based on context. The researchers believe this capability can have a wide range of applications, from speech enhancement and recognition in videos, through video conferencing, to improved hearing aids, especially in situations where there are multiple people speaking. (IANS)

Top News

No stories found.