This Week in NW
Speech technology grows up
Speech applications can cut call center costs, and the technology is moving into more advanced uses.
Advanced speech technology has moved to the mainstream - just ask Bob DuPont of Thrifty Car Rental, who expects to save $1 million next year using voice recognition software.
Voice recognition technology is gaining popularity for a couple of important reasons. First, it can save money at call centers. A recent study by AMR Research says call costs range from $1 to $10 each depending on call length and an agent's skill level. Speech recognition software can reduce that cost to as low as 20 cents per call. Second, the technology is advancing rapidly, which lets users deploy it in new wireless and business applications.
In the coming months, voice technology will only get better, observers say. Industry experts and vendors expect two trends to flourish: support for VoiceXML, a specification that would make speech-based applications and online information accessible by phone and voice, and the spread of speech recognition to wireless devices such as cell phones and PDAs.
"Speech technology overall has matured quite a bit," says Dennis Gaughan, an analyst at AMR. The software has become more speaker-independent, picking up on speech nuances and dialects. Before, users had to sit at their desktops and train their computers - often several times - to recognize their voices and speech patterns.
Thrifty has deployed SpeechWorks' interactive speech recognition software to handle customer requests for car rental quotes.
Customers who call Thrifty's reservation number are prompted for dates, times, car size, city and airport, and then receive reservation information. When a customer wants to book a reservation, the call is transferred to a sales agent, who receives it along with the customer's request details on a computer screen.
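The call flow described above can be sketched as a simple slot-filling loop. This is an illustrative sketch only, not SpeechWorks' software or Thrifty's actual system: the slot names come from the article, while `REQUIRED_SLOTS`, `next_step` and the step labels are hypothetical names chosen for the example.

```python
# Hypothetical sketch of the quote dialog: ask for each missing item,
# then either read back a quote or hand the call to a sales agent.
REQUIRED_SLOTS = ["dates", "times", "car_size", "city", "airport"]


def next_step(collected, wants_to_book=False):
    """Return the next action for a call, given the answers heard so far.

    collected     -- dict mapping slot name to the caller's recognized answer
    wants_to_book -- True once the caller asks to make a reservation
    """
    # Prompt for the first piece of information still missing.
    for slot in REQUIRED_SLOTS:
        if slot not in collected:
            return ("prompt", slot)
    # Everything is known: booking requests go to a live agent, and the
    # collected answers travel with the call so the agent sees them.
    if wants_to_book:
        return ("transfer_to_agent", dict(collected))
    return ("read_quote", dict(collected))
```

In this sketch the automated system never books anything itself. It only gathers details and passes them along, which matches the article's description of agents seeing the caller's request on screen.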
The car agency has handled more than 200,000 calls through the system so far, and it plans to route more calls to it by summer's end. Thrifty receives 4 million calls per year, with 30% to 40% coming from customers checking rates and availability, according to DuPont, staff vice president of reservations.
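A back-of-the-envelope calculation shows how call volumes like these translate into savings. The sketch below combines Thrifty's call figures with the AMR Research per-call costs cited earlier in the article. Pairing the two sources is an assumption made here for illustration, as is the idea that every rate-check call is fully automated; the function name `annual_savings_cents` is hypothetical. Amounts are kept in whole cents to avoid floating-point rounding.

```python
def annual_savings_cents(calls_per_year, automated_percent,
                         agent_cost_cents, speech_cost_cents):
    """Estimate yearly savings, in cents, from automating a share of calls.

    Assumes each automated call costs speech_cost_cents instead of
    agent_cost_cents, and that no automated call falls back to an agent.
    """
    automated_calls = calls_per_year * automated_percent // 100
    return automated_calls * (agent_cost_cents - speech_cost_cents)


# Illustrative inputs: 4 million calls a year, 30% to 40% of them rate checks,
# $1 per agent-handled call (low end of AMR's range), 20 cents per speech call.
low = annual_savings_cents(4_000_000, 30, 100, 20)
high = annual_savings_cents(4_000_000, 40, 100, 20)
print(f"${low // 100:,} to ${high // 100:,} per year")
```

Under these assumptions the estimate comes to between $960,000 and $1,280,000 a year, the same order of magnitude as the $1 million DuPont expects to save. Higher per-call agent costs would raise the figure.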
"Touch-tone [calls] are difficult, but with speech technology we're presented with a viable option" to help customers, DuPont says. The software has helped Thrifty respond faster to customer calls, but the company is still concerned about alienating customers.
"We're trying to figure out [ways customers] won't hang up without talking to a live voice," he says. Thrifty is working on incentives to encourage would-be customers to use the automated system. "So we're not at risk of losing them, we're still offering a live voice for reservations."
United Airlines and T. Rowe Price have also recently implemented interactive speech systems. Speech technology is expected to spread into areas such as inventory tracking and salesforce automation, according to industry experts. For example, salespeople could ask for information about their contacts and calendars over the phone.
The biggest noise for speech technology is coming from wireless customers, who are clamoring for easy-to-use features.
"Inputting information in a cell phone is terrible - the screen is small and so too is the touch-tone pad," Gaughan says.
Vendors such as Motorola, IBM, SpeechWorks and Nuance Communications are developing hands-free software that would let end users speak into a cell phone for tasks such as defining contact information or surfing the Internet. For instance, customers could look at their cell phones, ask for directions and receive information in the form of a map on their cell phone screens.
One of the main drivers of speech technology in the coming months will be the adoption of VoiceXML, which defines a common way to program speech applications. With VoiceXML, businesses would need to build an application only once and could then run it on multiple vendors' platforms.
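To give a sense of what "build once" means in practice, a VoiceXML dialog is an ordinary XML document that any compliant platform can interpret. The sketch below uses Python's standard library to generate a minimal form with one field per question. The element names (`vxml`, `form`, `field`, `prompt`) come from the VoiceXML specification. The form id, field names, prompt wording and `build_quote_form` helper are hypothetical, the version value is an assumption, and recognition grammars are omitted, so this is not a complete deployable application.

```python
import xml.etree.ElementTree as ET


def build_quote_form(fields, version="2.0"):
    """Return a minimal VoiceXML document as a string.

    fields  -- list of (field_name, prompt_text) pairs, asked in order
    version -- value for the vxml version attribute (assumed; adjust to
               whatever the target platform supports)
    """
    root = ET.Element("vxml", {"version": version})
    form = ET.SubElement(root, "form", {"id": "rate_quote"})
    for name, prompt_text in fields:
        # Each field pairs a spoken prompt with a variable to hold the answer.
        field = ET.SubElement(form, "field", {"name": name})
        prompt = ET.SubElement(field, "prompt")
        prompt.text = prompt_text
    return ET.tostring(root, encoding="unicode")


document = build_quote_form([
    ("city", "Which city are you renting in?"),
    ("car_size", "What size car would you like?"),
])
print(document)
```

Because the output is plain markup rather than calls to one vendor's API, the same document could in principle be served to any platform that interprets VoiceXML.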
VoiceXML is the brainchild of IBM, AT&T, Lucent and Motorola, and is currently supported by more than 500 companies, including Nokia, Sprint PCS, Nuance and SpeechWorks. SpeechWorks recently rolled out its VoiceXML-based speech recognition engine, OpenSpeech Recognizer 1.0; Nuance, Lucent, IBM and others have built VoiceXML support into their products.
"The industry has been waiting a long time for open standards," says Nigel Beck, director of voice systems at IBM. "Developers are required to know three areas - speech; how to deal with phone infrastructures; and write business applications and link them up to existing systems."
Industry experts say developers of speech technology will rally around VoiceXML, but that, like other XML-based languages, it still must gain wider industry acceptance.
John Nallin, vice president of information systems at United Parcel Service (UPS), says VoiceXML will ease the burden of accessing text-to-speech and speech-to-text applications. Speech technology products need to interpret words better than they do now, he says. For example, businesses have to write APIs to access databases, and if the speech technology is proprietary, "then it's one more that you have to worry about," Nallin says.
UPS customers currently track orders online through the package carrier's Automated Tracking System, an interactive voice response application that uses Nuance's speech recognition and natural-language understanding server. Nuance develops voice interface software platforms.
UPS, which has used the server for almost five years, handles more than 100,000 tracking requests per day. The system is designed to respond with package status information in less than two seconds and to cut the length of a tracking call to about 65 seconds.