Addressing WAN packet loss – go where the loss isn’t

WAN Virtualization enables reliable, predictable high performance even when a network path is experiencing packet loss.

We continue our discussion of what can be done to address the impact of packet loss on application performance over the WAN. In our previous three columns, we covered five of the six possibilities listed in the first column of this arc. Today, we'll cover the last technique: avoiding the additional loss that often follows a burst of loss.

We saw previously that many core WAN Optimization features have a significant positive performance impact in the face of packet loss. Application-specific technologies like replicated file service, local web caching, or CIFS proxies, as well as data deduplication, drastically reduce the number of packets that traverse the WAN. We saw that Forward Error Correction (FEC) can sometimes be of value, though usually not. TCP termination technology, in conjunction with a different approach than standard TCP (Transmission Control Protocol) for communicating between the two locations over a private/dedicated WAN, can help a great deal, as can having a private, undersubscribed core to avoid much loss in the first place.

And for very long-distance communication, or when trying to connect to public cloud-based services, we saw last time that it helps to let end stations react more quickly to loss. Having globally distributed Points of Presence (POPs) at colocation facilities close to end-user locations, together with a multi-segment TCP optimization approach, can have a significant positive impact on TCP application performance in the face of first-mile/last-mile packet loss.

Avoiding the additional loss that often follows a burst of packet loss is a feature unique to certain WAN Virtualization implementations. Most of the time there is very little packet loss on most networks. Given the nature of TCP (which we looked at in some length previously) and shared IP networks, though, it turns out that packet loss is not uniformly distributed, even though that is how simple WAN emulators model it and what most quoted performance numbers assume. Worse, it is completely unpredictable how long an episode of packet loss will endure. Much of the time it is gone "immediately"; much of the rest of the time it is gone soon after that; much of the remaining time it is gone soon after that; and so on. A small portion of the time, packet loss remains high for an extremely long time. Since there is no way of knowing how long a "loss episode" will go on, if one wants predictable application performance, it's simply not safe to use a network path that is exhibiting a run of packet loss (as opposed to an isolated packet or two lost every so often) until that path has demonstrated that the loss has gone away.
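
To make the difference between uniformly distributed loss and loss that arrives in episodes concrete, here is a minimal sketch. It is an illustration under stated assumptions, not a measurement of any real network: the bursty case uses a simple two-state Markov chain (in the spirit of the Gilbert model from the loss-modeling literature), the function names are hypothetical, and the parameters are chosen only so that both traces average roughly 1% loss. Real networks may well have longer-tailed episodes than this model produces.

```python
import random

def independent_loss(n, loss_rate, rng):
    """Each packet is lost independently with the same probability."""
    return [rng.random() < loss_rate for _ in range(n)]

def bursty_loss(n, p_enter, p_exit, rng):
    """Two-state model: the path is either clean or in a loss episode.

    Assumption for simplicity: every packet sent during an episode is lost.
    """
    trace, in_episode = [], False
    for _ in range(n):
        if in_episode:
            if rng.random() < p_exit:
                in_episode = False
        elif rng.random() < p_enter:
            in_episode = True
        trace.append(in_episode)
    return trace

def loss_runs(trace):
    """Return the length of each run of consecutive lost packets."""
    runs, current = [], 0
    for lost in trace:
        if lost:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

if __name__ == "__main__":
    rng = random.Random(2013)  # fixed seed so repeated runs match
    n = 200_000
    traces = {
        "independent": independent_loss(n, 0.01, rng),
        "bursty": bursty_loss(n, 0.001, 0.1, rng),
    }
    for label, trace in traces.items():
        runs = loss_runs(trace)
        print(label,
              "loss rate: %.4f" % (sum(trace) / n),
              "mean run: %.2f" % (sum(runs) / len(runs)),
              "longest run: %d" % max(runs))
```

Both traces have a similar average loss rate, but in the bursty one the lost packets cluster into runs. An application, or a path-selection mechanism, sees those two situations very differently, which is why an average loss percentage alone says little about predictability.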

The idea is to leverage the fact that in a multi-link aggregation solution, there are multiple paths between any two WAN locations. A WAN Virtualization implementation that continually measures packet loss in real time and can react quickly enough can switch away from a network path exhibiting a run of loss. Just as importantly, it can start to use that path again for user traffic only after the path has demonstrated (using heartbeat packets) that it is no longer experiencing packet loss.
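
The switch-away-and-return logic described above can be sketched in a few lines. This is a hypothetical illustration, not how any particular WAN Virtualization product works: the class and function names, the window size, and both thresholds are assumptions made for the example. The point it shows is the asymmetry: a run of loss takes a path out of service quickly, while only a streak of clean heartbeat results brings it back.

```python
from collections import deque

class PathMonitor:
    """Tracks recent delivery results for one WAN path (hypothetical design)."""

    def __init__(self, name, window=20, bad_losses=3, clean_probes=10):
        self.name = name
        self.recent = deque(maxlen=window)  # True means the packet was lost
        self.bad_losses = bad_losses        # losses in window that mark a run
        self.clean_probes = clean_probes    # clean results needed to return
        self.clean_streak = 0
        self.usable = True

    def record(self, lost):
        """Record one result, from user traffic or from a heartbeat probe."""
        self.recent.append(lost)
        self.clean_streak = 0 if lost else self.clean_streak + 1
        if self.usable:
            # An isolated loss is tolerated; a run of loss is not.
            if sum(self.recent) >= self.bad_losses:
                self.usable = False
        elif self.clean_streak >= self.clean_probes:
            # The path has demonstrated that the loss has gone away.
            self.usable = True
            self.recent.clear()

def choose_path(paths):
    """Return the first usable path, else the one with the least recent loss."""
    for path in paths:
        if path.usable:
            return path
    return min(paths, key=lambda p: sum(p.recent))

if __name__ == "__main__":
    link_a, link_b = PathMonitor("link_a"), PathMonitor("link_b")
    for _ in range(3):
        link_a.record(True)       # a run of loss on link_a
    print(choose_path([link_a, link_b]).name)
    for _ in range(10):
        link_a.record(False)      # clean heartbeats while out of service
    print(choose_path([link_a, link_b]).name)
```

While a path is out of service, the results fed to `record` would come from heartbeat packets only, since no user traffic is being sent on it. A real implementation would also balance traffic across all usable paths rather than simply preferring the first one.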

Routing protocols don't measure packet loss. Network monitoring tools that do measure loss can't really do anything about it. And while traditional WAN Optimization solutions do their best to minimize and mitigate the effects of loss, they can't address the core problem: the loss itself. With WAN Virtualization, you can do more than get a weather map and a real-time weather status; for the first time you can actually do something about the weather! This is why WAN Virtualization enables the use of inexpensive Internet links that would otherwise be relatively high-loss (and so less predictable), and can still deliver reliability and application performance predictability that exceed those of private Multiprotocol Label Switching (MPLS) WANs. The key is combining mitigation of the effects of loss with avoidance of those long periods of loss.

This concludes our look at techniques that address the performance problems caused by WAN packet loss. As you might imagine, a good Next-generation Enterprise WAN (NEW) architecture is likely to incorporate many of these techniques in order to deliver reliable, predictable, scalable, high-performance WANs at the lowest possible cost.

A twenty-five-year data networking veteran, Andy founded Talari Networks, a pioneer in WAN Virtualization technology, and served as its first CEO. He is now leading product management at Aryaka Networks. Andy is the author of an upcoming book on Next-generation Enterprise WANs.

Copyright © 2013 IDG Communications, Inc.