
XML gives voice to new speech apps



Speech technology is evolving to the point where an exchange of information between a person and a computer is becoming more like a real conversation. Many factors are responsible for this, ranging from an exponential increase in computing power to a general advancement of basic speech technology and user interface design.

Speech-based applications deployed to date have been based on code created by a few speech software vendors. VoiceXML will likely change this landscape by virtue of its promised vendor independence in creating speech applications.

VoiceXML is the emerging standard for speech-enabled applications. It defines how a dialog is constructed and executed between a caller and a computer running speech recognition and/or text-to-speech software.

VoiceXML incorporates the flexibility to create speech-enabled Web-based content or to build telephony-based speech recognition call center applications.

Specifically, VoiceXML defines a common markup language for programming a speech application. As in HTML, its building blocks are tags. Each tag denotes an action in the dialog between a human caller and a speech recognition system.
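
To make the idea of tags concrete, here is a minimal sketch: a small VoiceXML-style page held in a Python string and parsed with the standard library. The element names (vxml, form, field, prompt, grammar, filled, value) follow VoiceXML 2.0. The prompt wording, the grammar file name cities.grxml and the helper list_tags are assumptions made for illustration and do not come from the article.

```python
import xml.etree.ElementTree as ET

# Illustrative VoiceXML 2.0-style page. The prompt wording and the grammar
# file name "cities.grxml" are assumptions made for this sketch.
VXML_PAGE = """<?xml version="1.0"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="flight_info">
    <field name="city">
      <prompt>Which city are you leaving from?</prompt>
      <grammar src="cities.grxml" type="application/srgs+xml"/>
      <filled>
        <prompt>Looking up flights from <value expr="city"/>.</prompt>
      </filled>
    </field>
  </form>
</vxml>
"""


def list_tags(page: str) -> list[str]:
    """Return the tag names in document order, without the XML namespace."""
    root = ET.fromstring(page)
    return [element.tag.split("}")[-1] for element in root.iter()]


if __name__ == "__main__":
    print(list_tags(VXML_PAGE))
```

Each tag plays a dialog role: prompt queues audio for the caller, field collects one piece of spoken input, grammar names what may be said, and filled says what to do once the input is recognized.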


How it works

One example of a VoiceXML tag is an instruction to queue audio output. The main components of a VoiceXML-based service are the tags, forms and rules that define the content, and a speech browser that interprets that content and presents it as audio.

Vocabularies and grammars are the key components that define the input to a speech-enabled page. The vocabulary consists of the words to be recognized by the speech recognition engine. For example, a vocabulary for a flight information system might consist of city names and travel-related words such as "leaving" and "fly." Grammars provide the structure to identify meaningful phrases. A speech-enabled application combines a vocabulary and a grammar to limit what the recognizer has to listen for, which keeps recognition efficient for both the caller and the speech recognition processor.
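
The split between vocabulary and grammar can be sketched in a few lines of Python. This is a toy model working on text rather than audio, and is not how a production recognizer is built. The city list, the phrase patterns and the function name interpret are all hypothetical.

```python
import re

# Hypothetical vocabulary for a flight information system:
# city names plus travel-related words.
CITIES = {"boston", "chicago", "dallas"}
TRAVEL_WORDS = {"leaving", "fly", "from", "to", "i", "am", "want"}
VOCABULARY = CITIES | TRAVEL_WORDS

# Hypothetical grammar: the phrase structures that count as meaningful.
_CITY = "(" + "|".join(sorted(CITIES)) + ")"
GRAMMAR = [
    re.compile(rf"^(i am )?leaving (from )?(?P<origin>{_CITY})$"),
    re.compile(rf"^(i want to )?fly to (?P<destination>{_CITY})$"),
]


def interpret(utterance: str):
    """Return the slots found in an utterance, or None if it is rejected."""
    words = utterance.lower().split()
    # Vocabulary check: every word must be one the recognizer knows.
    if any(word not in VOCABULARY for word in words):
        return None
    text = " ".join(words)
    # Grammar check: the words must form one of the allowed phrases.
    for pattern in GRAMMAR:
        match = pattern.match(text)
        if match:
            return {k: v for k, v in match.groupdict().items() if v is not None}
    return None


if __name__ == "__main__":
    for phrase in ["I am leaving from Boston", "fly to Dallas", "Boston leaving fly"]:
        print(phrase, "->", interpret(phrase))
```

The last phrase uses only known words but is still rejected, which is the job the grammar does on top of the vocabulary.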

Designing a speech application includes presenting data for delivery over the phone, constructing a call flow and enabling prompts and grammars. VoiceXML provides a common set of rules as a flexible foundation, but it's up to the designer to create the appropriate flow and personality for a speech system.

Just as HTML content is interpreted by a browser and presented visually over the Web, so must VoiceXML be understood or interpreted for presentation over the telephone by a speech, or voice, browser. The speech browser serves as a gateway between a call and an Internet connection. It interprets VoiceXML code and manages dialog between callers and VoiceXML content located at a Web site.
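
The interpreting role of a speech browser can be illustrated with a deliberately simplified sketch. It visits the fields of a form in order, plays each prompt and stores the caller's reply. A real speech browser does far more, including telephony, recognition and error handling. The form content and the names run_form, hear and say are assumptions for illustration, and the XML namespace is left out to keep the example short.

```python
import xml.etree.ElementTree as ET

# Illustrative form with two fields; the wording is an assumption.
FORM = """<vxml version="2.0">
  <form id="trip">
    <field name="origin"><prompt>Where are you leaving from?</prompt></field>
    <field name="destination"><prompt>Where are you flying to?</prompt></field>
  </form>
</vxml>"""


def run_form(page, hear, say):
    """Visit each field in order: play its prompt, then store the reply.

    `say` stands in for audio output and `hear` for speech recognition.
    """
    answers = {}
    for field in ET.fromstring(page).iter("field"):
        prompt = field.find("prompt")
        say(prompt.text)
        answers[field.get("name")] = hear()
    return answers


if __name__ == "__main__":
    replies = iter(["Boston", "Dallas"])  # simulated caller
    spoken = []                           # simulated audio output
    result = run_form(FORM, hear=lambda: next(replies), say=spoken.append)
    print(spoken)
    print(result)
```

Passing hear and say in as functions keeps the dialog logic separate from the telephony side, much as the article separates VoiceXML content from the browser that presents it.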

Speech browser software also maintains the calls, presents voice prompts that act as the spoken equivalent of links to URLs, and downloads pages for audio interaction.
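
One way to picture a voice prompt standing in for a URL is the VoiceXML menu element, in which each choice carries a next attribute naming the page to fetch. The sketch below resolves a spoken choice to its URL. The example.com addresses, the menu wording and the helper next_page are hypothetical, and the namespace is again left out for brevity.

```python
import xml.etree.ElementTree as ET

# Illustrative menu; the URLs are placeholders, not real services.
MENU = """<vxml version="2.0">
  <menu>
    <prompt>Say arrivals or departures.</prompt>
    <choice next="http://www.example.com/arrivals.vxml">arrivals</choice>
    <choice next="http://www.example.com/departures.vxml">departures</choice>
  </menu>
</vxml>"""


def next_page(page, spoken):
    """Return the URL linked to the spoken choice, or None if nothing matches."""
    wanted = spoken.strip().lower()
    for choice in ET.fromstring(page).iter("choice"):
        if (choice.text or "").strip().lower() == wanted:
            return choice.get("next")
    return None


if __name__ == "__main__":
    print(next_page(MENU, "Departures"))
```

A browser would then download the returned page and continue the dialog there, the audio counterpart of following a hyperlink.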

A VoiceXML-based application using a speech browser provides flexibility, benefiting callers and content providers alike. A caller could use a rotary telephone or the newest wireless model and receive the same service. Content providers have a choice of locating a speech browser at their facilities or outsourcing to an application service provider, carrier or service bureau. As with current visual Web models, trade-offs have to be weighed between ease of implementation, flexibility, cost and other factors.

Today, companies are building businesses on speech-based Web content by providing telephony access and presenting data in interactive audio formats. These businesses host speech applications and take on scalability, maintenance and support, letting content providers focus on their core business.

A number of obvious and subtle factors are converging to bring the Web model of VoiceXML to prominence. Many consider the broad industry support of VoiceXML its most apparent strength. Other factors such as recent improvements in text-to-speech quality mean information can be immediately presented in audio format without the time and expense of recording a voice. Looking at the evolution of the Web, it's clear the adoption of a common format for content presentation - HTML - fueled the growth of the Web as we know it today. The VoiceXML standard holds similar promise for speech.




Chambers is vice president of marketing at SpeechWorks. He can be reached at steve.chambers@speechworks.com.
