Getting to the vaunted five nines

These guidelines will help you create an ultra-reliable IP telephony infrastructure.

If you had to choose, what could you live without: dial tone or e-mail? That's not a choice network executives want to make. But forced into it, most probably would pick a failed e-mail system as the lesser of two evils. They know a phone network gone dead on their watch is the quickest route to the unemployment line.

That fear has been the pall hanging over VoIP for years. "The network is down" is not an acceptable explanation when it comes to phones. For that reason, many companies have been reluctant to bet their telecom infrastructure on commodity servers, IP WANs and phones plugged into Ethernet switches.

But as companies evolve to the new data center model of computing, the benefits of replacing disparate PBX and key telephone system hardware throughout a corporation with a centralized cluster of IP PBXs are getting harder to ignore. Hosted and managed from the glass house, voice can be treated just like any other application. Plus, these days, reliability doesn't have to be an issue, experts and experienced users say.

Achieving Ma-Bell-like reliability with VoIP simply means building a network with redundant call-processing hardware and gateways, providing ubiquitous power backup, and implementing best practices in security patch management and virus protection.

Architecting five nines

First, understand your bandwidth requirements, says Ray Ortega, voice and video infrastructure consultant with ThruPoint, a New York integrator that has installed IP voice and data networks for many large companies. IP PBXs, network gear and IP phones all can be up and running, but poorly engineered bandwidth can lead to congestion and make the VoIP network as useless as if an IP PBX or router had crashed.

Ensuring that doesn't happen starts by selecting the right codec, or compression method, for encoding and decoding packetized voice. The ITU-standard G.711 codec, which encodes voice at 64K bit/sec, makes sense on LANs, while the G.729 codec, which compresses voice to 8K bit/sec, is suited for lower-bandwidth T-1 or broadband shared WAN links, Ortega says. Some vendors promote the use of other ITU codecs - such as G.722, which supports higher-frequency voice - but G.711 and G.729 are the most widely deployed, he adds.
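To see what the codec choice means on the wire, a rough per-call calculation helps. The sketch below uses the nominal ITU payload rates for G.711 (64K bit/sec) and G.729 (8K bit/sec). Everything else is an assumption made for illustration: a 20-millisec packet interval, 40 bytes of uncompressed IP/UDP/RTP headers per packet, no Layer 2 overhead, and 75% of a T-1's 1.536M bit/sec payload reserved for voice.

```python
# Per-call bandwidth sketch. Codec payload rates are the nominal ITU figures;
# the packet interval, header size and voice share are illustrative assumptions.

IP_UDP_RTP_HEADER_BYTES = 40  # 20 (IPv4) + 8 (UDP) + 12 (RTP), no header compression

CODEC_PAYLOAD_BPS = {
    "G.711": 64_000,
    "G.729": 8_000,
}


def call_bandwidth_bps(codec: str, packet_ms: int = 20) -> int:
    """One-way IP-layer bandwidth for a single call, in bit/sec."""
    payload_bytes = CODEC_PAYLOAD_BPS[codec] * packet_ms // 1000 // 8
    packets_per_sec = 1000 // packet_ms
    return (payload_bytes + IP_UDP_RTP_HEADER_BYTES) * 8 * packets_per_sec


def max_calls(link_bps: int, codec: str, voice_share: float = 0.75) -> int:
    """Simultaneous calls that fit in the share of a link reserved for voice."""
    return int(link_bps * voice_share) // call_bandwidth_bps(codec)


if __name__ == "__main__":
    t1_payload_bps = 1_536_000  # usable T-1 payload, assumed for the example
    for codec in CODEC_PAYLOAD_BPS:
        print(codec, call_bandwidth_bps(codec), max_calls(t1_payload_bps, codec))
```

Under these assumptions a G.711 call needs about 80K bit/sec in each direction and a G.729 call about 24K bit/sec, which is why the header overhead matters as much as the codec on a narrow WAN link.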

"It comes down to determining what quality a customer wants," Ortega says.

Redundancy of switches, routers and call processors should be the next consideration in your VoIP blueprint.

"We try to split the load across the two active servers," Ortega says of the converged networks ThruPoint has architected for companies such as Deutsche Bank, Merrill Lynch and Morgan Stanley. Load-balanced IP PBXs, available from vendors such as Avaya and Cisco, can run in one data center or in separate data centers, in case a primary site is cut off. When choosing the latter, Ortega adds, you must take network latency into account. WAN links must be measured for delay and jitter; delay greater than 100 millisec could cause a problem with voice quality.
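One way to turn WAN measurements into a go/no-go check is sketched below. It takes a list of one-way transit times in milliseconds, computes a running jitter estimate in the style of the RTP specification (RFC 3550), and compares the results with thresholds. The 100-millisec delay limit comes from Ortega's guidance above; the 30-millisec jitter limit and the function names are assumptions for illustration, not figures from ThruPoint.

```python
# Delay and jitter check for a candidate WAN link (illustrative sketch).

def interarrival_jitter(transit_ms):
    """Running jitter estimate in the style of RFC 3550: J += (|D| - J) / 16."""
    jitter = 0.0
    for prev, cur in zip(transit_ms, transit_ms[1:]):
        delta = abs(cur - prev)          # change in transit time between packets
        jitter += (delta - jitter) / 16  # smoothed with a gain of 1/16
    return jitter


def link_ok_for_voice(transit_ms, max_delay_ms=100.0, max_jitter_ms=30.0):
    """True if the worst delay and the jitter estimate are within limits.

    max_jitter_ms is an assumed threshold; tune it to your own equipment.
    """
    worst_delay = max(transit_ms)
    return (worst_delay <= max_delay_ms
            and interarrival_jitter(transit_ms) <= max_jitter_ms)


if __name__ == "__main__":
    samples = [40.0, 45.0, 42.0, 48.0]  # made-up measurements
    print(interarrival_jitter(samples), link_ok_for_voice(samples))
```

Checking the worst sample rather than the average is a deliberately conservative choice: a link that is fine on average can still produce audible gaps during bursts.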

For IP telephony, getting to 99.999% reliability also means making sure power to the VoIP network isn't lost. Traditional PBXs supply power to phones, requiring only the phone switch to be on a back-up power supply. But with IP telephony, you need to think about power backup for the servers, as well as the LAN switches and WAN routers. Many of the latest IP phones can be powered via power-over-Ethernet switches, but earlier models might need to run off of AC adapters with battery backups.

Uninterruptible power supplies (UPS) - basically giant batteries - are available for all components of a VoIP network. Coverage can range from 15 minutes of back-up power to many hours depending on the types of devices used. "If businesses want to sustain hours of phone service through a blackout, they have to plan differently than if they're just trying to survive a quick glitch," Ortega says.

Planning for at least one hour of power backup is a good idea, Ortega suggests. Longer-running battery backup is available but can be overkill. He notes, however, that for hospitals, public safety organizations or government offices that cannot go offline, generators usually are needed.
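Sizing for a target such as one hour can start with back-of-the-envelope arithmetic. The sketch below is a first approximation only: it assumes runtime scales linearly with usable battery energy and a fixed inverter efficiency of 90%, which real batteries do not strictly obey, so a vendor's runtime chart should have the final word. The load figure in the example is made up.

```python
# Rough UPS sizing sketch. Linear model and 90% efficiency are assumptions.

def runtime_minutes(battery_wh: float, load_w: float, efficiency: float = 0.9) -> float:
    """Approximate minutes of backup for a given battery energy and load."""
    return battery_wh * efficiency / load_w * 60


def required_battery_wh(load_w: float, minutes: float, efficiency: float = 0.9) -> float:
    """Approximate battery energy needed to carry a load for the given time."""
    return load_w * (minutes / 60) / efficiency


if __name__ == "__main__":
    closet_load_w = 450.0  # hypothetical: one PoE switch plus the phones it powers
    print(required_battery_wh(closet_load_w, minutes=60))
```

Note that with power-over-Ethernet, the phones' draw lands on the wiring-closet UPS, so the load figure has to include them.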

Putting plans into practice

Use of such best architectural practices has kept the Nevada County, Calif., VoIP network running for the last two years with only 5 minutes of downtime. And the downtime, for system maintenance, was planned, says Bill Miller, desktop services manager for the county.
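For a sense of scale, the downtime allowed by an availability target is simple arithmetic. The sketch below converts between the two, assuming a 365-day year; the function names are illustrative.

```python
# Availability arithmetic. Assumes a 365-day year (ignores leap years).

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600


def downtime_budget_minutes(availability_pct: float, years: float = 1.0) -> float:
    """Minutes of downtime permitted at a given availability percentage."""
    return (1 - availability_pct / 100) * MINUTES_PER_YEAR * years


def availability_pct(downtime_minutes: float, years: float = 1.0) -> float:
    """Availability percentage implied by observed downtime."""
    return 100 * (1 - downtime_minutes / (MINUTES_PER_YEAR * years))


if __name__ == "__main__":
    print(downtime_budget_minutes(99.999))           # five nines, one year
    print(downtime_budget_minutes(99.999, years=2))  # five nines, two years
    print(availability_pct(5, years=2))              # 5 minutes in two years
```

Five nines allows a little more than 5 minutes of downtime per year, or roughly 10.5 minutes over two years, so 5 minutes in two years is inside the budget even if planned maintenance is counted.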

For starters, Miller uses virtual LAN (VLAN) technology to make sure voice does not contend with data for bandwidth. In the data center, the server farm hosting e-mail and office applications plugs into a 3Com Switch 4007 Layer 3 switch. Another Switch 4007 connects redundant 3Com SuperStack NBX 750 IP PBXs, which provide voice service to 600 county workers. These redundant NBXs sit on separate subnets. The live one is accessible to the network, and the backup is on its own VLAN.

"If someone can't get through to the schools or city hall because the phones are out, I'm the one who gets kicked around," Miller says.

If the primary NBX were to fail, Miller would receive an alert on his pager and by e-mail. He then would change the IP address on the back-up NBX to the same number as the primary one that failed. He also would switch the VLAN of the backup to the main voice subnet.
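The detection half of that procedure can be sketched in a few lines. The example below is not how the NBX or the county's alerting actually works; it is a generic watchdog that probes a server and signals once a set number of consecutive probes fail, so a single dropped packet doesn't trigger a failover. The class name, threshold, host and port are all hypothetical.

```python
# Generic failover watchdog sketch (hypothetical names; not an NBX feature).
import socket


def tcp_probe(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


class FailoverWatch:
    """Signals once when `max_failures` consecutive probes have failed."""

    def __init__(self, probe, max_failures: int = 3):
        self.probe = probe              # any zero-argument callable returning bool
        self.max_failures = max_failures
        self.failures = 0

    def check(self) -> bool:
        """Run one probe; True only on the probe that reaches the threshold."""
        if self.probe():
            self.failures = 0           # a success resets the count
            return False
        self.failures += 1
        return self.failures == self.max_failures


if __name__ == "__main__":
    # Hypothetical address and port for the primary call processor.
    watch = FailoverWatch(lambda: tcp_probe("192.0.2.10", 80))
    if watch.check():
        print("primary unreachable: page the administrator")
```

Called on a timer, `check()` returns True exactly once per outage, which is the point at which a page or e-mail would go out and the manual address and VLAN changes would begin.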

Five rules for five nines

VoIP network designers share this advice for building a fail-proof infrastructure for IP telephony.
Test the network: Determine IP traffic bandwidth availability, jitter and delay before deploying.
Duplicate: Run redundant call-processing servers and separate them geographically if possible.
Duplicate, again: Make sure call-processing servers themselves have redundant processors, network interface cards and disk drives.
Patch often: IP PBXs are servers, so keep them updated with the latest operating software fixes.
Batteries included: Plan on back-up power supplies for call processors, routers, switches and phones.
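The case for the two "duplicate" rules can be put in numbers. The sketch below uses the standard formulas for components in series and in parallel, and it assumes failures are independent, which shared power, shared software bugs or a shared site can easily violate. The availability figures in the example are made up.

```python
# Series/parallel availability sketch. Assumes independent failures.
from math import prod


def series(availabilities):
    """Every component must work: availabilities multiply."""
    return prod(availabilities)


def parallel(availabilities):
    """Any one component suffices: unavailabilities multiply."""
    return 1 - prod(1 - a for a in availabilities)


if __name__ == "__main__":
    one_server = 0.999                          # hypothetical "three nines" server
    pair = parallel([one_server, one_server])   # redundant pair
    path = series([pair, 0.9999, 0.9999])       # pair behind a switch and a router
    print(pair, path)
```

Two three-nines servers in parallel come out at six nines on paper, but the chain is only as good as its weakest non-redundant link, which is why the advice extends to switches, routers and power.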

Miller closely monitors the voice network using VoIP monitoring and management appliances from start-up Qovia. Should he ever need to use the back-up NBX, for instance, he wants the phones to find it on their own. "We are currently working jointly with 3Com and Qovia to go out and tickle all my IP phones to do a reboot," he says. "When they would reboot, they would look for the [live NBX]. I'd be back up within 20 minutes without having to leave my house."

Using the Qovia tools, he also can set alerts on traffic activity and error messages on IP PBX and WAN equipment, and monitor T-1 cards on the county's voice gateways. The Qovia devices send e-mails to the IT staff if the equipment has an unusual number of error messages - usually the warning just before a crash. SNMP-enabled UPS hardware from APC also lets Miller tap into the health of his power back-up equipment.
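The idea of alerting on an "unusual number" of errors can be illustrated with a simple baseline check. The sketch below is not Qovia's method; it is one common approach, flagging a polling interval whose error count exceeds the recent mean by several standard deviations. The window size, the three-sigma rule and the minimum count are assumptions to be tuned.

```python
# Error-count anomaly sketch (generic technique; not a vendor's algorithm).
from collections import deque
from statistics import mean, pstdev


class ErrorRateAlarm:
    """Flags error counts well above the recent baseline."""

    def __init__(self, window: int = 12, sigmas: float = 3.0, min_count: int = 5):
        self.history = deque(maxlen=window)  # most recent per-interval counts
        self.sigmas = sigmas
        self.min_count = min_count           # ignore tiny absolute counts

    def observe(self, count: int) -> bool:
        """Record one interval's error count; return True if it looks unusual."""
        alarm = False
        if len(self.history) >= 3:           # need a few samples for a baseline
            threshold = mean(self.history) + self.sigmas * pstdev(self.history)
            alarm = count >= self.min_count and count > threshold
        self.history.append(count)
        return alarm


if __name__ == "__main__":
    alarm = ErrorRateAlarm()
    for errors in [1, 2, 1, 2, 1, 40]:       # made-up per-interval error counts
        if alarm.observe(errors):
            print("unusual error count:", errors)
```

In practice the counts would come from SNMP counters or log files, and an alarm would trigger an e-mail to the IT staff rather than a print statement.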

"We're tweaking and adjusting the [VoIP network] to a point where it almost takes care of itself," he says.

Servers at the core

Of course, ensuring reliability of the VoIP network is only half the story. The other half deals with the operating system for the VoIP server.

"Suppose you have an IP PBX with triple redundancy in a nuclear shelter. If it's running on an unpatched version of Windows NT, there's a huge vulnerability," says Bob Rosky, senior security consultant at ThruPoint.

Rosky has several recommendations for making sure the server operating system doesn't cause reliability or security problems for VoIP. First, make sure your VoIP server runs the absolute minimum number of services.

As to the type of operating system, "it's like asking if a Ford is safer than a GM," Rosky says. "It's how you drive it. Clearly, there are more vulnerabilities in Windows than in an AIX-type of [operating system]. But that's because there are a hundred times more [Windows] systems out there. The [operating system] should not be the No. 1 factor in deciding on an [IP PBX], but it can be a huge caveat if not implemented correctly."

Cisco, for example, ships its Windows 2000-based Media Convergence Server (MCS) platform for the CallManager IP PBX software with a custom-built Windows image that minimizes the services, applications and background software of the operating system.

Plus, when Microsoft issues patches for the servers, Cisco tests the patches and issues its own version of the software fixes on the MCS. "We tell our customers not to apply Microsoft's patches. Not all modules are on our systems, and some of the patches from Microsoft could cause more problems than they solve," says Bill King, the vendor's technical marketing manager.

Cisco also has hardened the MCS platform, which runs the CallManager IP PBX software, to make it as reliable as big-iron PBXs, King says. The hardware has built-in redundancy throughout, with dual Intel Xeon processors, memory, network interface cards, power supplies and disk drives with RAID configurations.

"When you add the software that's been pre-tested and pre-certified, with all the patches, it makes for a highly available combination," King says.

Still, for those IT executives who don't trust Windows for voice reliability, Cisco has plans to port CallManager to Linux later this year while continuing to support and enhance CallManager on Windows.

Inside the box

Vendors of legacy PBX gear also are embracing the commodity hardware and software architectures used on IP PBXs, but taking the same cautious route they took with the old, proprietary, monster phone switches.

"Our high-grade telephony systems have redundant capabilities throughout," says Mark Bissell, product manager for IP telephony at Nortel. Even the latest versions of Nortel's TDM-based Meridian 1 PBX use Intel processors, commodity disk drives and other off-the-shelf components. Nortel offers IP PBXs that run on Intel-based hardware, with software ranging from embedded Unix to Windows and Linux on various systems.

Nortel would not have thought to put its telephony applications on Intel servers a few years ago, but the component landscape has changed, he says.

"The newer generation of PC-based hardware is becoming extremely reliable," Bissell says. "When you combine that with redundant architectures, we're finding that we can make them as reliable as the proprietary systems."

Copyright © 2004 IDG Communications, Inc.