Real Time Communications Featured Article

Real Time Translation Next Value-Add to Real Time Communications

November 04, 2014

Microsoft has opened up its Skype Translator feature for a "preview release." I don't know if that means beta or "Free until we start charging," but either way it is a hot move. Real-time translation might be the next big feature for Real Time Communications (RTC), so we should all start paying closer attention.

The preview program is free, has a limited number of spots, and will initially be available only on Windows 8.1 computers and tablets, with a "limited" set of languages, according to Microsoft. Your speech in a Skype video or voice call will be translated into the other participant's language in real time and vice versa, with an on-screen transcript of the call displayed. Instant message chats can be translated in 45 languages.

"It'll have a few rough edges but the more conversations it translates, the better it'll get," states the Skype Translator preview signup.  Opening up the "preview" gives Microsoft more speech data to crunch, further improving the system as you get more "machine learning" with larger data sets to work with.

Microsoft demonstrated Skype Translator at several events earlier this year. The feature builds on research the company has conducted in speech recognition, translation, and language processing. Hypervoice advocates will recognize both the difficulty and the value of being able to interpret speech in real time and then deliver a specific result.

Simply delivering real-time transcription from speech to text in the flow of an audio or voice call is pretty heavy duty.  Skype has an advantage because it uses the Opus superwideband voice codec to capture and deliver more audio information than a traditional narrowband or basic HD voice call, giving Microsoft's back-end cloud services more data to work with in the process of speech interpretation.
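To put rough numbers on "more audio information," here is a minimal sketch that counts the audio samples in one frame at different Opus bandwidth modes. The sampling rates come from the Opus specification (RFC 6716); the 20 ms frame length and the mapping of "basic HD voice" to the wideband mode are assumptions made for illustration, not details of how Skype configures its calls.

```python
# Effective sampling rates (Hz) for three Opus bandwidth modes, per RFC 6716.
OPUS_SAMPLE_RATES = {
    "narrowband": 8000,       # traditional telephone-quality audio
    "wideband": 16000,        # assumed here to stand in for basic HD voice
    "superwideband": 24000,   # the mode the article attributes to Skype
}


def samples_per_frame(sample_rate_hz: int, frame_ms: int = 20) -> int:
    """Return the number of audio samples in one frame of the given length."""
    return sample_rate_hz * frame_ms // 1000


for mode, rate in OPUS_SAMPLE_RATES.items():
    print(f"{mode}: {samples_per_frame(rate)} samples per 20 ms frame")
```

Under these assumptions, a superwideband frame carries three times as many samples as a narrowband frame, which is the extra raw material a speech recognizer gets to work with.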

Speech-to-text transcription has been an optional feature with some conference call services. For example, one such service provides transcription and Hypervoice keyword indexing through VoiceBase's Keywords service. Indexing is free, but a transcript is a value-added (i.e., pay) service.

Anyone working with Smart Voice services has to be wondering when, and how easily, third parties will be able to access Microsoft's cloud-based translation engine. The technology is a building block for an array of value-added services that can be put on top of generic voice cloud offerings, ranging from simple call recording and transcription to Hypervoice, voice analytics, and real-time processing to trigger events and actions.

Imagine being able to have order fulfillment occur in real time as a call center agent is talking to a customer without the agent having to type in a single thing.  The agent can focus on checking a customer order for accuracy and provide guidance on additional features/products that might be appropriate to add to the basic order.
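The call center scenario can be sketched in a few lines. This is only an illustration, assuming the transcription engine hands over finished text segments one at a time; the phrase pattern, the `OrderDraft` class, and the `process_segment` function are all hypothetical names invented for this example and do not correspond to any Microsoft or Skype API.

```python
import re
from dataclasses import dataclass, field

# Hypothetical trigger phrase: "order <quantity> <product>", e.g. "order 2 headsets".
ORDER_PATTERN = re.compile(r"\border (\d+) ([a-z]+)\b", re.IGNORECASE)


@dataclass
class OrderDraft:
    """Order that fills itself in while the agent and customer talk."""
    items: dict = field(default_factory=dict)

    def add(self, product: str, quantity: int) -> None:
        self.items[product] = self.items.get(product, 0) + quantity


def process_segment(text: str, draft: OrderDraft) -> None:
    """Scan one transcript segment and record any order phrases found."""
    for quantity, product in ORDER_PATTERN.findall(text):
        draft.add(product.lower(), int(quantity))


# Stand-in for segments arriving from a real-time transcription service.
segments = [
    "Hi, I would like to order 2 headsets please.",
    "Actually, also order 1 webcam.",
]

draft = OrderDraft()
for segment in segments:
    process_segment(segment, draft)

print(draft.items)  # {'headsets': 2, 'webcam': 1}
```

A production service would need far more robust language understanding than a single regular expression, but the shape is the same: transcript text arrives, a rule or model fires, and the order is updated while the agent reviews it.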

Splicing real-time transcription and translation into WebRTC applications is likely to be very lucrative for Microsoft in the future, given the many ways such a fundamental building block can be used to construct new voice-based services.

Edited by Stefania Viscusi
