How to address application performance issues from WAN latency variability

Certain techniques work better than others at dealing with the effects of congestion-based increases in latency on application performance over the WAN

We continue the broad topic of which of the various technologies, including those that are part of the Next-generation Enterprise WAN (NEW) architecture, like WAN Optimization, WAN Virtualization and Network-as-a-Service, as well as other, older technologies, best address the different issues impacting application performance over the WAN.

As you saw last time, there are a number of techniques that can address the "fixed" component of WAN latency in the quest to improve application performance. This time we'll cover those techniques that address the more difficult problem of jitter, the variable, queuing congestion-based component of latency.

Note that we typically talk of the problems of "jitter" when the application is a real-time one, like VoIP or videoconferencing. This is because too much variability itself can be a problem in people communicating live with each other, even if the absolute latency number is not a problem. How to properly support real-time applications deserves a column all its own; we'll cover this next time.

With TCP applications, jitter – again, latency variability – per se isn't a problem when the jitter is of a low to moderate amount. It does become a problem, however, when the jitter is very high. Since the size of buffers in the typical WAN router is between 100 and 200 milliseconds, additional latency of 80 to 400 milliseconds or more is quite possible when running any application over a wide area network, especially over long distances. And so where a user may find application performance perfectly acceptable with an average round-trip latency of 80 milliseconds and typical jitter of 10 milliseconds, during those times when latency spikes by 80 to 400 milliseconds, suddenly application performance is not so acceptable. [And this doesn't even take into account the effects of packet loss, which can be even greater. But we'll cover packet loss in more detail in future columns.]
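To see why a latency spike hurts a TCP application so disproportionately, consider a chatty transaction that requires many sequential round trips. A minimal back-of-the-envelope sketch in Python, where the round-trip count and latency figures are illustrative assumptions rather than measurements of any particular application:

```python
# Back-of-the-envelope sketch: a chatty TCP transaction in which each
# step must wait for a full round trip before the next can proceed.
# The numbers (50 round trips, 80 ms baseline, 400 ms spike) are
# illustrative assumptions, not measured values.

def transaction_time_ms(round_trips: int, rtt_ms: float) -> float:
    """Total wall-clock time when every step waits one full round trip."""
    return round_trips * rtt_ms

baseline = transaction_time_ms(50, 80)        # 50 round trips at 80 ms RTT
spike = transaction_time_ms(50, 80 + 400)     # same work during a 400 ms jitter spike

print(baseline)  # 4000 ms -> 4 seconds: acceptable
print(spike)     # 24000 ms -> 24 seconds: users start calling the help desk
```

The transaction itself hasn't changed; only the queuing delay has, yet completion time balloons six-fold, which is exactly the "suddenly not so acceptable" experience described above.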

Of course, before covering any other techniques, it's worth noting that it is essential to implement QoS properly on your WAN to ensure that the performance of your real-time and interactive applications isn't adversely affected by your own other applications' use of limited last-mile bandwidth. QoS was probably the first technology used to address this issue. I won't attempt to detail here how QoS should be implemented, as there are many sources for this on the Internet. The proper use of QoS – mostly involving putting different applications into different classes, with differing weights and/or bandwidth limits – in your routers and other WAN edge middleboxes is both essential and compatible with pretty much all of the techniques I'll cover here.
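The class-and-weight idea at the heart of router QoS can be sketched in a few lines. The following toy weighted round-robin scheduler is only an illustration of the concept; the class names and weights are assumptions, not any vendor's configuration syntax, and a real scheduler interleaves packets continuously rather than draining queues in one shot:

```python
# Toy sketch of class-based weighted scheduling, the idea behind WAN QoS:
# packets are sorted into classes, and each class's weight governs its
# relative share of the scarce last-mile bandwidth. Class names and
# weights here are illustrative assumptions.
from collections import deque

classes = {
    "realtime":    {"weight": 4, "queue": deque()},   # VoIP, video
    "interactive": {"weight": 2, "queue": deque()},   # VDI, terminal apps
    "bulk":        {"weight": 1, "queue": deque()},   # file transfers, backup
}

def enqueue(cls: str, packet: str) -> None:
    classes[cls]["queue"].append(packet)

def drain() -> list:
    """Weighted round robin: each pass, a class may send up to `weight` packets.

    A real scheduler runs continuously as bandwidth permits; draining
    everything at once here just makes the ordering easy to inspect.
    """
    sent = []
    while any(c["queue"] for c in classes.values()):
        for c in classes.values():
            for _ in range(c["weight"]):
                if c["queue"]:
                    sent.append(c["queue"].popleft())
    return sent
```

With two packets queued per class, the real-time and interactive packets go out ahead of the bulk traffic, which is exactly the behavior you want on a congested last mile.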

Then there are the other basics: having more WAN bandwidth will minimize the amount of time during which your own traffic, or traffic of the same class or kind as the application(s) you most care about, impacts application performance. This is again well known, and as with almost all of these interrelated issues that affect application performance, we'll cover what you can do about bandwidth limitations in... you guessed it: a future column!

All of this is designed to get us around to the truly "hard" problem of addressing congestion caused by other traffic which is not your own when using shared WANs. As noted last time, you could just buy point-to-point links between your locations to avoid this problem, but for most enterprises most of the time, that simply is not a cost-effective option.

You can, of course, buy MPLS to connect all of your locations together as the way to avoid congestion-based WAN latency. This expensive solution actually does address this problem for domestic connections the overwhelming majority of the time, since domestic MPLS networks tend to be overengineered sufficiently to avoid most congestion. It doesn't always address the problem for overseas connectivity, however, as those links tend not to be overengineered, since doing so is too expensive for the telecom SP. And, again, it's a very expensive solution that offers relatively little bandwidth, and so for most companies it's not a panacea.

So what else can you do? Turns out there are a number of things.

Two application-layer solutions we covered last column when addressing fixed latency also apply to improving performance under variable latency. Replicated file service avoids WAN latency in accessing files, delivering LAN-speed performance because all client access to the data is done locally. "Static" caching of objects, such as with local web caches or Content Delivery Networks (CDNs), applies here as well. 

Data deduplication techniques offered by WAN Optimization vendors essentially do "dynamic" caching of data locally; while these require at least one round trip across the WAN, they will involve far fewer such round-trip transactions than when the data is not stored locally. For the very chatty Microsoft CIFS protocol, this capability is usually combined with an application-specific proxy that reduces round-trip requests still further. So while WAN Optimization doesn't do anything about the increased latency itself, it reduces the impact that latency has on application performance, often to a very significant degree.
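The core dedup idea can be sketched simply: each side of the WAN keeps a store of chunks it has already seen, keyed by hash, so a repeated chunk crosses the wire as a short reference instead of the full data. This is a deliberately simplified illustration, assuming fixed-size chunks and SHA-256; real appliances use variable-size chunking, compression, and synchronized stores on both ends:

```python
# Toy sketch of the data-deduplication idea behind WAN Optimization.
# Fixed-size chunks and SHA-256 are illustrative simplifications; real
# appliances use variable-size chunking and keep both ends' chunk
# stores synchronized.
import hashlib

CHUNK = 64  # bytes per chunk, chosen small for the example

def dedup_encode(data: bytes, store: dict) -> list:
    """Sender side: replace already-seen chunks with ('ref', digest) tokens."""
    tokens = []
    for i in range(0, len(data), CHUNK):
        chunk = data[i:i + CHUNK]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in store:
            tokens.append(("ref", digest))   # tiny reference crosses the WAN
        else:
            store[digest] = chunk
            tokens.append(("raw", chunk))    # first sight: send the bytes
    return tokens

def dedup_decode(tokens: list, store: dict) -> bytes:
    """Receiver side: expand references from its own chunk store."""
    out = bytearray()
    for kind, value in tokens:
        if kind == "ref":
            out += store[value]
        else:
            store[hashlib.sha256(value).hexdigest()] = value
            out += value
    return bytes(out)
```

Encoding a payload whose third chunk repeats its first sends only two chunks of real data plus one short reference, which is how repeated transfers end up needing far less bandwidth and far fewer round trips.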

Network-as-a-Service addresses the problems of congestion-based latency in international connections, particularly those across oceans and using Internet connections. Connections between locations across the Internet frequently experience congestion because of the number of peering points they traverse. This is due to the economics of the Internet and "hot potato" routing, which causes ISPs to have traffic exit their networks as quickly as possible when the final destination is not on their own network, even though that means the traffic might traverse a large number of peering points – and these peering points are where the chance of congestion is greatest. As noted last time, a Network-as-a-Service solution with a dedicated core network and colocation-based Points of Presence (PoPs) close to end-user locations solves this "middle mile" congestion problem by avoiding these routing issues, delivering stable, low-latency connectivity between even far-flung locations. It is an ideal solution to congestion-based latency, at a fraction of the cost of MPLS.

Finally, WAN Virtualization addresses congestion-based latency most directly – at the time it occurs. Because it is continuously measuring the one-way latency across all of the possible paths between any two locations, when it detects significant congestion-based latency on a path, it will quickly move latency- or jitter-sensitive traffic off that path onto a better performing path, limiting use of the now slower, congested path to things like file transfers, which consume bandwidth but are not otherwise sensitive to higher latency. WAN Virtualization can be used in conjunction with most of the other techniques described above to both speed up performance in the "average" case and, as importantly, avoid the "worst case" latency increase scenarios which make applications unusable, and can make users and senior management irate.
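The steering policy just described can be sketched as a simple function of the latest per-path latency measurements. The path names, latency figures, and decision rule below are illustrative assumptions for the concept only, not any vendor's actual algorithm:

```python
# Illustrative sketch of the WAN Virtualization policy described above:
# one-way latency is measured continuously per path, latency-sensitive
# traffic is steered to the best-performing path, and bulk transfers are
# kept off it when another usable path exists. Path names, latencies and
# the policy itself are assumptions for illustration.

paths = {"mpls": 40.0, "inet_a": 35.0, "inet_b": 90.0}  # latest one-way ms

def pick_path(traffic_class: str, latencies: dict) -> str:
    best = min(latencies, key=latencies.get)
    if traffic_class in ("realtime", "interactive"):
        return best                       # sensitive traffic: lowest latency now
    # Bulk transfers tolerate latency; steer them to another path when one
    # exists, preserving headroom on the best path for sensitive flows.
    others = [p for p in latencies if p != best]
    return min(others, key=latencies.get) if others else best

print(pick_path("interactive", paths))    # interactive traffic rides inet_a
paths["inet_a"] = 300.0                   # congestion spike detected on inet_a
print(pick_path("interactive", paths))    # sensitive traffic moves to mpls
print(pick_path("bulk", paths))           # bulk can keep using a slower path
```

When the measured latency on a path spikes, the next selection for sensitive traffic lands on a better path, while file transfers continue to soak up the slower path's bandwidth, which is the behavior the paragraph above describes.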

Assuming QoS has been implemented properly, Network-as-a-Service and WAN Virtualization are really the only approaches that can do much for VDI (Virtual Desktop Infrastructure) deployments that are having trouble running successfully over Internet connections or across overseas connections. Each offers valuable techniques that traditional WAN Optimization appliances cannot, for what are usually already highly optimized protocols, in an application where predictable interactive performance is critical for user productivity.

So there are many approaches that can help address latency variability for TCP application performance improvement and predictability. Next time, we'll look at the techniques for dealing with latency variability (jitter) for real-time applications running over the WAN.

A twenty-five year data networking veteran, Andy founded Talari Networks, a pioneer in WAN Virtualization technology, and served as its first CEO, and is now leading product management at Aryaka Networks. Andy is the author of an upcoming book on Next-generation Enterprise WANs.


Copyright © 2012 IDG Communications, Inc.
