Companies have rapidly deployed VoIP technology to reduce operational costs and increase productivity by integrating telecom services with other enterprise workflows. However, the codecs most widely used today that let speech be transmitted over IP data networks do not reproduce speech faithfully for a variety of dialects.
Typical human speech, while consisting of a wide range of frequencies, is reproduced very well by capturing frequencies up to 10KHz. Unfortunately, most VoIP codecs capture well under half of this frequency spectrum. The G.729 voice codec, one of the most common codecs used in VoIP systems, is a narrowband codec that covers a frequency range from about 200Hz to just under 4KHz. As a result, much of the frequency spectrum of a typical human voice is not captured.
To minimize the amount of data needed to encode the sampled speech, and with it the network bandwidth required, algorithms such as G.729 use a variety of encoding techniques that make certain assumptions about speech.
In many cases these assumptions have been based on a hypothetical Western male voice. These predictive encoding techniques, coupled with the limited frequency sampling, result in a voice codec that poorly recreates the speech of Asian speakers because of the higher frequency content of many Asian dialects. These narrowband codecs also do a poor job of encoding and transmitting music.
In an attempt to solve these problems, other codecs were defined that double the audio-frequency spectrum encoded. Wideband codecs sample frequencies up to 8KHz. By doubling the range of frequencies that are encoded, a truer representation of the speech or audio can be encoded and recreated on the remote end. Because most speech can be represented with frequencies up to 10KHz, these wideband codecs are able to reproduce the original speech more faithfully than narrowband codecs. Music transmission also becomes possible using wideband codecs.
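The coverage figures above can be checked with a little arithmetic. The Python sketch below takes the 10KHz speech band and the 4KHz and 8KHz upper limits quoted in this article as given, ignores the low-frequency cutoff (about 200Hz for G.729), and applies the Nyquist criterion that a signal must be sampled at no less than twice its highest frequency. The function and variable names are illustrative only.

```python
# Rough coverage arithmetic using the figures quoted in the article.
# Assumption: the lower cutoff (about 200 Hz for G.729) is ignored here.

SPEECH_BAND_HZ = 10_000  # band the article says reproduces speech very well

# Upper frequency limits as stated in the article (illustrative labels).
CODEC_UPPER_LIMIT_HZ = {
    "narrowband (e.g. G.729)": 4_000,
    "wideband (e.g. G.722.2)": 8_000,
}


def coverage(upper_limit_hz, speech_band_hz=SPEECH_BAND_HZ):
    """Fraction of the speech band that lies below the codec's upper limit."""
    return min(upper_limit_hz, speech_band_hz) / speech_band_hz


def min_sampling_rate_hz(upper_limit_hz):
    """Nyquist criterion: sample at no less than twice the highest frequency."""
    return 2 * upper_limit_hz


if __name__ == "__main__":
    for name, limit in CODEC_UPPER_LIMIT_HZ.items():
        print(
            f"{name}: covers {coverage(limit):.0%} of the speech band, "
            f"needs a sampling rate of at least {min_sampling_rate_hz(limit)} Hz"
        )
```

Under these assumptions a narrowband codec covers about 40% of the speech band, which matches the "well under half" figure, while a wideband codec covers about 80%.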
The G.722.2 wideband codec, also known as Adaptive Multi-Rate Wideband (AMR-WB), was defined initially by the European Telecommunications Standards Institute/Third Generation Partnership Project as Wideband AMR for use in cellular and mobile applications, and was then ratified by the International Telecommunication Union's Telecommunication Standardization Sector as G.722.2 for use in VoIP and other applications. G.722.2 supports nine bit rates from 6.6Kbps to 23.85Kbps. Bit rates as low as 12.65Kbps can still deliver acceptable voice quality.
Real vs. perceived difference
G.729 requires only 8Kbps to encode voice, but this is before any additional data for Real-time Transport Protocol (RTP), User Datagram Protocol (UDP) and IP packet headers. When the RTP, UDP and IP packet headers are taken into account, G.729 requires 29.6Kbps of effective network bandwidth and G.722.2 requires 28.2Kbps to 45.45Kbps, depending on the rate used. Even at the highest G.722.2 data rate, just over 50% more network bandwidth than G.729 is required to achieve twice the transmitted frequency range and the associated audio fidelity benefits.
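The bandwidth figures above can be reproduced with simple addition. The Python sketch below assumes a fixed per-stream header overhead of 21.6Kbps, which is not stated in the article but is implied by its G.729 numbers (29.6Kbps minus 8Kbps). The 54-byte, 50-packets-per-second combination in the helper is a hypothetical example that happens to yield the same overhead, not a figure from the article. The nine rates listed are the standard AMR-WB modes.

```python
# Effective bandwidth = codec bit rate + packet header overhead.
# Assumption: overhead is constant per stream and equals the 21.6 kbps
# implied by the article's G.729 figures (29.6 kbps - 8 kbps).

OVERHEAD_KBPS = 21.6
G729_KBPS = 8.0

# The nine AMR-WB (G.722.2) codec bit rates, in kbps.
AMR_WB_MODES_KBPS = [6.60, 8.85, 12.65, 14.25, 15.85, 18.25, 19.85, 23.05, 23.85]


def header_overhead_kbps(header_bytes, packets_per_second):
    """Overhead for a given header size and packet rate (hypothetical inputs)."""
    return header_bytes * 8 * packets_per_second / 1000


def effective_kbps(codec_kbps, overhead_kbps=OVERHEAD_KBPS):
    """Network bandwidth needed once header overhead is added."""
    return codec_kbps + overhead_kbps


def extra_bandwidth_vs_g729(codec_kbps):
    """Fractional increase in effective bandwidth relative to G.729."""
    return effective_kbps(codec_kbps) / effective_kbps(G729_KBPS) - 1


if __name__ == "__main__":
    print(f"G.729: {effective_kbps(G729_KBPS):.2f} kbps")
    for rate in AMR_WB_MODES_KBPS:
        print(
            f"G.722.2 @ {rate:.2f} kbps: {effective_kbps(rate):.2f} kbps "
            f"({extra_bandwidth_vs_g729(rate):+.1%} vs. G.729)"
        )
```

Note that at its lowest rate G.722.2 needs slightly less effective bandwidth than G.729, because the header overhead dominates both totals.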
To deliver the superior audio quality of wideband codecs, the algorithms used to implement these codecs typically are more computationally intensive and thus require more processing capacity for the VoIP endpoints.
As such, many existing VoIP devices may not be capable of supporting wideband codecs. Fortunately, as processor capacity continues to increase, the benefits of wideband VoIP are becoming an option in an increasing number of devices at lower prices.
Ward, director of product line management at Trinity Convergence, can be reached at firstname.lastname@example.org.