Addressing WAN packet loss, Part 2

Handling loss differently, mitigating the effects and hiding the loss from end stations can improve WAN application performance.

By Andy Gottlieb on Mon, 12/03/12 - 12:17pm.

Last time, we began our discussion of what can be done to address the impact of packet loss on application performance over the WAN. We listed six different possibilities, and went through how one of them can significantly improve application performance in the face of packet loss. Today we'll cover two more techniques: mitigating and hiding the effect of loss from the end station, and reacting differently to observed packet loss.

We saw that drastically reducing the number of packets that traverse the WAN, using one or more of a replicated file service, local web (HTTP) object caching, WAN Optimization's data deduplication and CIFS-specific application proxy technologies, will greatly improve application performance in the face of WAN packet loss. But this is by no means the only method, and in fact those methods are either very application-specific or work only when the data in question has already traversed the WAN once. The techniques we'll cover today and in our next column will work for all TCP (Transmission Control Protocol) applications, and some will work for real-time applications as well.

The first technique to mitigate the effects of packet loss is to use Forward Error Correction (FEC). FEC sends redundant data along with the packet stream, at the cost of additional bandwidth, so that the receiver can correct errors in the data stream without requiring retransmissions. Silver Peak is a WAN Optimization vendor that promotes its use of FEC.

FEC works well when there is consistent, uniformly distributed low-to-moderate packet loss. This can happen if there are bit errors on a faulty last mile DSL line, for example, although such faults are far less frequent than they used to be. But packet loss in the WAN is almost always caused by congestion: routers (or other forwarding devices) along the path between locations drop packets when their queues fill. Congestion-based packet loss is typically not uniformly distributed; it is bursty. And its duration is unpredictable. Many loss events are very brief and a few are very long, and there is no way to tell in advance how long a given congestion event will last. No reasonable FEC overhead will successfully reconstruct the data when two consecutive large packets are lost, for example. Because of this, FEC is almost always ineffective in practice, and that includes "adaptive" FEC, which attempts to use more forward redundancy when loss rates seem higher. It uses additional bandwidth for the error correction, and yet will almost never be able to handle the runs of high packet loss that have the greatest impact on application performance.
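To make the burst-loss limitation concrete, here is a minimal Python sketch of the simplest form of packet-level FEC: one XOR parity packet per group of data packets. This is an illustration only, not how any particular vendor implements FEC; the function names (xor_bytes, make_parity, recover), the group size of four and the equal-length packets are all assumptions made for the example.

```python
from functools import reduce
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(packets: List[bytes]) -> bytes:
    """Build one parity packet as the XOR of all data packets in a group."""
    return reduce(xor_bytes, packets)


def recover(received: List[Optional[bytes]], parity: bytes) -> Optional[List[bytes]]:
    """Rebuild the group if at most one packet is missing (None); else return None."""
    missing = [i for i, p in enumerate(received) if p is None]
    if not missing:
        return list(received)
    if len(missing) > 1:
        # A single parity packet cannot repair two or more losses in a group.
        return None
    survivors = [p for p in received if p is not None]
    # XORing the parity with every surviving packet leaves the missing one.
    rebuilt = reduce(xor_bytes, survivors, parity)
    repaired = list(received)
    repaired[missing[0]] = rebuilt
    return repaired


# Hypothetical group of four equal-length packets: 25% bandwidth overhead.
group = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = make_parity(group)

single_loss = recover([group[0], None, group[2], group[3]], parity)
burst_loss = recover([group[0], None, None, group[3]], parity)
```

In this sketch an isolated loss is repaired without a retransmission, but two consecutive losses in the same group are not, even though a quarter of the bandwidth is already being spent on redundancy. Surviving longer bursts would take more parity per group or interleaving across groups, which costs still more bandwidth or delay.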

Another technique, used by most WAN Optimization solutions to mitigate the effect of packet loss, is to terminate TCP at each WAN Optimization appliance and to use something other than standard TCP for communicating between the two appliances. In this way, the packet loss is hidden from the end stations, so they don't cut back their offered TCP window size. (The WAN Optimization appliances will buffer traffic as needed.) While TCP termination is primarily done in order to make the most effective use of techniques like compression and data deduplication, under certain circumstances it can also improve application performance in the face of packet loss.

Most commonly, each WAN Optimization device will run either a proprietary version of "high-speed TCP" or perhaps an RFC 3649-compliant implementation. When attempting to fill a high-bandwidth WAN connection with fairly large latency, even an occasional single packet loss under ordinary TCP can drastically reduce the amount of bandwidth utilized, because TCP is designed to cut back the window size by half in the face of a single lost packet, and to grow the window size relatively slowly as acknowledgements are received. High-speed TCP implementations fix this problem and work well under low loss (i.e., packet loss rates much less than 1% over any useful timeframe).
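A rough back-of-the-envelope calculation shows why a single loss hurts so much on a long, fat link. The Python sketch below assumes textbook congestion avoidance for standard TCP (the window is halved on a loss, then grows by about one packet per round trip) and an illustrative 1 Gbps path with a 100 ms round-trip time and 1,500-byte packets. Those figures and the function names are assumptions chosen for the example, not measurements.

```python
import math


def window_packets(bandwidth_bps: int, rtt_ms: int, packet_bytes: int = 1500) -> int:
    """Packets that must be in flight to fill the link (bandwidth-delay product)."""
    bits_in_flight = bandwidth_bps * rtt_ms / 1000
    return math.ceil(bits_in_flight / (8 * packet_bytes))


def recovery_after_single_loss(bandwidth_bps: int, rtt_ms: int, packet_bytes: int = 1500):
    """Round trips and seconds needed to regrow the window after one halving.

    Assumes the window was exactly filling the link before the loss and
    grows by one packet per round trip afterwards.
    """
    full = window_packets(bandwidth_bps, rtt_ms, packet_bytes)
    halved = full // 2
    round_trips = full - halved
    return round_trips, round_trips * rtt_ms / 1000


full_window = window_packets(1_000_000_000, 100)
round_trips, seconds = recovery_after_single_loss(1_000_000_000, 100)
print(f"window to fill link: {full_window} packets")
print(f"recovery after one loss: {round_trips} round trips, about {seconds:.0f} seconds")
```

Under these assumptions the window needed to fill the link is 8,334 packets, and getting back to it after one halving takes 4,167 round trips, or roughly seven minutes. During that time the connection is running below link capacity, which is the behavior that high-speed TCP variants change by growing the window more aggressively when it is large.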