WAN Virtualization technology: 'RAID for WANs'

Diverse WAN connections, continuous monitoring and sub-second response enable a RAID-like revolution in enterprise WAN architecture

By Andy Gottlieb on Fri, 06/15/12 - 12:29pm.

Last time, we looked at the analogy between WAN Virtualization and RAID at the business benefits level. Here, we examine the parallels from the technical point of view.

In this post, I will refer mostly to the WAN Virtualization technology of the company I founded, Talari Networks, as that is obviously the one with which I am most familiar. Ipanema Technologies' implementation is similar in many regards, while also including WAN Optimization capabilities. Mushroom Networks' solution shares some of these characteristics, but not others. As with any newer technology going through rapid innovation, it's important to check with prospective vendors about the capabilities of their solution.

Wrapping hardware and intelligent software around core enabling technology

Seagate 5.25-inch hard disk technology, originally targeted at the nascent personal computer market, was not nearly as reliable as the existing mainframe/minicomputer disks, having nowhere near the MTBF (Mean Time Between Failures) or seek times. But RAID - Redundant Array of Inexpensive Disks - took advantage of the hugely better price/bit of the Seagate hard disk to revolutionize the enterprise storage market. By combining a layer of hardware and intelligent software with multiple inexpensive disks, RAID delivered a storage system with higher capacity, lower cost, competitive - and ultimately superior - access times, and greater reliability than the older-generation storage solutions.

For WAN Virtualization, the analogous enabling technology is the public Internet. By wrapping a two-ended system of appliance-based hardware running intelligent software around multiple WAN connections - most or all of which are Internet connections, though they can include existing expensive MPLS connections as well - WAN Virtualization creates an enterprise WAN which is lower-cost, far better in cost/bit, higher-capacity, and more reliable than the best single-vendor MPLS WAN.

Redundancy to deliver application continuity

The basic idea behind RAID - and behind WAN Virtualization as well - is that while two devices (or network connections), each with 99% reliability, operating in series will deliver a system with only 0.99 * 0.99 = 98.01% reliability, a properly designed system with the same two devices (connections) operating in parallel will deliver 1 - (1 - 0.99) * (1 - 0.99) = 99.99% reliability.
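To make the arithmetic concrete, here is a minimal Python sketch of the series and parallel calculations. The function names are illustrative only, and the parallel formula assumes the connections fail independently of one another, which is exactly what diverse WAN connections from different providers are meant to approximate.

```python
from math import prod


def series_availability(availabilities):
    """Every component must work, so the individual availabilities multiply."""
    return prod(availabilities)


def parallel_availability(availabilities):
    """Only one component must work: 1 minus the chance that all fail at once.

    Assumes failures are independent (an idealization).
    """
    return 1 - prod(1 - a for a in availabilities)


links = [0.99, 0.99]
print(f"series:   {series_availability(links):.2%}")    # 98.01%
print(f"parallel: {parallel_availability(links):.2%}")  # 99.99%
```

Passing a longer list, such as three 99% connections, shows how quickly each additional independent path shrinks the combined failure probability.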

The key phrase, of course, is proper design.

The first premise of RAID is that when any single disk is lost, not only is no data lost, but the application - data reads and writes - also continues to function normally, without meaningful performance degradation. This is essentially what RAID Level 1 delivered.

For WANs, traditional routed networks with appropriate link and device redundancy at each location provide network availability in the face of any single hard link failure or router failure. But this "no loss of network connectivity" is merely the equivalent of "no data loss" in the storage world. Even a no-single-point-of-failure routed network does not provide application continuity in all cases, as router convergence can at times take upwards of 30 seconds after a link failure or, especially, a router failure. More importantly, routing does not handle the case where packet loss or excessive latency causes significant problems with application performance. Yet these "soft failures," due to congestion on shared IP networks, especially shared WANs, occur with far greater frequency than hard link or device failures.

In fact, it is precisely because of congestion-based packet loss and jitter, which occur most frequently at Internet peering points between ISPs, that the public Internet has earned its "works pretty well most of the time" reputation. Of course, "pretty well" isn't good enough for most WAN managers, and "most of the time" isn't good enough for anyone.

Handling "soft failures" - and doing it quickly!

WAN Virtualization, by leveraging multiple paths across the network between locations, ensures that no single hard or soft failure of a network link, device or peering point in the middle of the Internet will cause a loss of connectivity or application performance predictability.

WAN Virtualization does its equivalent of RAID Level 1 by continuous measurement of network path performance (loss, jitter, latency, bandwidth) combined with sub-second reaction to problems with any network path. The best WAN Virtualization implementations can move traffic off of a path experiencing high loss or excessive jitter typically within 3 round-trip times (RTTs) of the problem's onset.
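The measure-and-react idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not any vendor's actual algorithm: the `PathStats` type, the thresholds and the selection rule (lowest latency among healthy paths) are all hypothetical, and a real appliance would refresh the measurements continuously and also weigh available bandwidth.

```python
from dataclasses import dataclass


@dataclass
class PathStats:
    """Most recent measurements for one WAN path (hypothetical structure)."""
    name: str
    loss: float        # fraction of packets lost in the last window, 0.0 to 1.0
    jitter_ms: float
    latency_ms: float


# Hypothetical thresholds; real values would be tuned per application.
MAX_LOSS = 0.01
MAX_JITTER_MS = 30.0


def is_healthy(path):
    """A path counts as healthy if loss and jitter are both within bounds."""
    return path.loss <= MAX_LOSS and path.jitter_ms <= MAX_JITTER_MS


def choose_path(paths):
    """Prefer the lowest-latency healthy path; if none is healthy, take the
    path with the least loss rather than dropping connectivity."""
    healthy = [p for p in paths if is_healthy(p)]
    if healthy:
        return min(healthy, key=lambda p: p.latency_ms)
    return min(paths, key=lambda p: p.loss)


paths = [
    PathStats("mpls", loss=0.0, jitter_ms=2.0, latency_ms=40.0),
    PathStats("dsl", loss=0.0, jitter_ms=5.0, latency_ms=25.0),
    PathStats("cable", loss=0.05, jitter_ms=60.0, latency_ms=15.0),
]
print(choose_path(paths).name)  # dsl: the fastest path that is not degraded

paths[1].loss = 0.08            # a soft failure appears on the DSL path
print(choose_path(paths).name)  # mpls: traffic moves off the lossy path
```

Note that the cable path has the lowest latency but is never selected while it is lossy, which is the distinction between a soft failure and a hard one: the link is up, yet unfit for traffic.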

For TCP applications, some WAN Virtualization solutions also deliver further application performance predictability by buffering packets from flows and retransmitting them in the face of loss, making the WAN look to the applications like a zero-loss network with occasional bouts of jitter.
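As a rough illustration of the buffering idea, the sketch below keeps a copy of each sent packet until the far-end appliance acknowledges it, so a lost packet can be resent between the appliances before the application's TCP stack ever notices the loss. The class and method names are hypothetical, and the acknowledgement and loss-detection signalling between appliances is assumed rather than shown.

```python
from collections import OrderedDict


class RetransmitBuffer:
    """Holds copies of sent packets until acknowledged (hypothetical sketch)."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self._sent = OrderedDict()  # sequence number -> payload, oldest first

    def record_sent(self, seq, payload):
        """Remember a packet as it goes out, evicting the oldest if full."""
        self._sent[seq] = payload
        while len(self._sent) > self.capacity:
            self._sent.popitem(last=False)

    def acknowledge(self, seq):
        """The far end received this packet, so its copy can be dropped."""
        self._sent.pop(seq, None)

    def retransmit(self, seq):
        """Return the payload to resend, or None if it is no longer held."""
        return self._sent.get(seq)


buf = RetransmitBuffer(capacity=2)
buf.record_sent(1, b"first")
buf.record_sent(2, b"second")
buf.acknowledge(1)          # packet 1 arrived
print(buf.retransmit(2))    # b'second': packet 2 was lost, so resend it
```

The cost of this approach is the one named above: a resent packet arrives later than its neighbors, so the application sees occasional jitter in place of loss.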