• United States
Senior U.S. Correspondent

Microsoft releases Speech Server beta

Jul 10, 2003

Microsoft on Wednesday moved toward the integration of call centers and the Web with the release of the first public beta of its Speech Server and a new beta version of its Speech Application Software Development Kit (SDK).

The software platform is designed to host voice-based services in much the same way that Web servers host a company’s Web site, and to support “multimodal” applications that take advantage of both voice and Web interfaces. It is based on Speech Application Language Tags (SALT), an extension of current markup languages such as HTML and XML.

Companies that need call centers can cut costs by automating them on the server, said Xuedong Huang, general manager for Microsoft speech technologies. Among other things, the server can interpret callers’ requests and provide recorded or synthesized responses. Developers also can integrate the voice-based services with Web-based applications that can continue to run on a Web server as they do now. For example, a caller could ask for a stock quote verbally and have it displayed on a handheld device, he said.

The beta version of the server can deliver voice-only services to a wired phone and multimodal services to any device with a screen that uses either a wired or an IEEE 802.11 wireless LAN connection to the server. Other wireless technologies will be supported later, Huang said. The software includes a speech recognition engine for handling users’ speech inputs and a prompt engine that retrieves prerecorded prompts from a database to play for users. It also has a text-to-speech engine that can synthesize audible prompts from a text string when a prerecorded prompt is not available. In addition, it has a SALT interpreter and other components to support services to callers.
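
The division of labor between the prompt engine and the text-to-speech engine can be illustrated with a minimal Python sketch. It is not Microsoft’s code or API: the `PromptResult` type, the `resolve_prompt` function, the dictionary standing in for the prompt database and the `.wav` path are all hypothetical, chosen only to show the “use the recording if one exists, otherwise synthesize” behavior described above.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class PromptResult:
    source: str   # "recorded" or "synthesized"
    payload: str  # recording path, or the text handed to text-to-speech


def resolve_prompt(text: str, recorded: Dict[str, str]) -> PromptResult:
    """Prefer a prerecorded prompt; fall back to synthesis if none exists.

    `recorded` is a hypothetical stand-in for the prompt database,
    mapping normalized prompt text to a recording path.
    """
    key = text.strip().lower()
    if key in recorded:
        return PromptResult("recorded", recorded[key])
    # No recording available: the text itself goes to the TTS engine.
    return PromptResult("synthesized", text)


# Hypothetical prompt database with a single recording.
recorded_prompts = {"welcome to the help desk.": "prompts/welcome.wav"}

first = resolve_prompt("Welcome to the help desk.", recorded_prompts)
second = resolve_prompt("Your balance is 42 dollars.", recorded_prompts)
```

In this sketch the fixed greeting resolves to a recording, while the balance message, which contains caller-specific data, falls through to synthesis.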

The SDK, a set of tools and controls based on SALT, lets developers build telephony and multimodal applications. Microsoft released it Wednesday in its third beta version. The SDK is designed to make it easy for developers to incorporate speech functionality into Web applications and to build speech applications using Visual Studio .NET 2003, according to a statement by Microsoft. New features in the third beta include Pocket Internet Explorer Bits for Pocket PC access to Microsoft Speech Server applications, a simulation of the Speech Server, and preset controls for managing responses containing digits and letters, such as credit card numbers.

Voice is one user interface that could be used with any type of device, Huang said. Not everyone has a PC but most people have phones, and speech may be the best way to interact with small devices, he said.

The SALT Forum has submitted SALT 1.0 as a specification to the World Wide Web Consortium (W3C). The forum has more than 70 members, including founding members Microsoft, Cisco, Intel, Koninklijke Philips Electronics, SpeechWorks International and Comverse, Huang said.

SALT is a more lightweight extension of current markup languages than is VoiceXML, a specification being used by many voice-based services developers today, according to Mark Plakias, an analyst at Zelos Group in San Francisco. As a result, it allows companies to draw upon a larger pool of developers than does VoiceXML, which is more familiar to developers of traditional interactive voice response (IVR) systems, he said.

“There are a whole lot more Web monkeys out there than there are IVR jocks,” Plakias said. That is not much of a concern for carriers, which have enough experienced IVR developers, but would make a difference to enterprises in some cases, he added.

Both Plakias and Microsoft’s Huang expect the two specifications to merge eventually under the W3C. Plakias said that could happen as soon as the end of 2004. Huang was less specific.

“We want to find a way to converge with VoiceXML, but how we’re going to do that, I don’t know,” Huang said.

Microsoft expects to ship the first production versions of the server and SDK in the first quarter of 2004, Huang said. Pricing information was not yet available.