

San Francisco: Google has announced “Translatotron”, its first direct speech-to-speech translation system, which can convert speech from one language to another while preserving the speaker’s voice and tempo. Translatotron is based on a sequence-to-sequence network that takes source spectrograms (visual representations of how a signal’s frequency content changes over time) as input and generates spectrograms of the translated content in the target language. The model also uses two separately trained components: a neural vocoder that converts the output spectrograms to time-domain waveforms, and a speaker encoder that can be used to maintain the character of the source speaker’s voice in the synthesised translated speech. (IANS)
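
For readers unfamiliar with spectrograms, the sketch below shows one common way to compute one from raw audio with NumPy: slice the signal into overlapping frames, window each frame, and take the magnitude of its Fourier transform. It is only an illustration of the input representation described above, not Google's actual preprocessing; the function name `spectrogram`, the frame length, the hop size, and the synthetic 1 kHz test tone are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram: rows are time frames, columns are frequency bins.

    frame_len and hop are illustrative values, not Translatotron's settings.
    """
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Cut the signal into overlapping frames and taper each one with the window.
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # The real FFT of each frame gives its frequency content at that moment.
    return np.abs(np.fft.rfft(frames, axis=1))

# Hypothetical input: one second of a pure 1 kHz tone sampled at 8 kHz.
sample_rate = 8000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)

spec = spectrogram(tone)
peak_bin = int(spec.mean(axis=0).argmax())   # strongest frequency bin overall
peak_hz = peak_bin * sample_rate / 256       # convert bin index to hertz
print(spec.shape, peak_hz)
```

Each row of `spec` is a snapshot of the sound's frequency content at one moment. A system of the kind described would consume a sequence of such rows for the source speech and emit a new sequence for the translated speech, which a vocoder then turns back into a waveform.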