How WAN deduplication can aid WAN acceleration

WAN deduplication supports better application performance

Wide Area Networking Alert By Steve Taylor and Jim Metzler, Network World
August 28, 2008 12:02 AM ET

Insightful analysis by consultants Steve Taylor and Jim Metzler, plus links to the latest WAN news headlines

Jim participated in a recent Network World chat on application performance management and WAN acceleration. This is the third in a series of newsletters that expand upon the questions raised during that chat. Today's topic: WAN deduplication.

WAN deduplication has the same goal as compression and caching: to send less data over the WAN. The first time a file is sent across the WAN, deduplication algorithms identify data patterns in the file and store those patterns in the WAN optimization appliances at each end of the WAN link. On subsequent passes, the deduplication algorithms in the appliances recognize these patterns and replace large pieces of data with notably smaller pieces of data that refer to the stored patterns.
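The mechanism described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any vendor's product: the `Appliance` class, the fixed 64-byte chunk size, and the 8-byte truncated SHA-256 fingerprint are all hypothetical choices made for the example.

```python
import hashlib

CHUNK_SIZE = 64        # hypothetical fixed chunk size, in bytes
FINGERPRINT_SIZE = 8   # hypothetical; truncation keeps references small but
                       # a real system would have to handle collisions


def fingerprint(chunk: bytes) -> bytes:
    return hashlib.sha256(chunk).digest()[:FINGERPRINT_SIZE]


class Appliance:
    """Hypothetical appliance holding a store of previously seen patterns."""

    def __init__(self):
        self.store = {}  # fingerprint -> chunk bytes

    def encode(self, data: bytes):
        """Sending side: replace already-known chunks with short references."""
        messages = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            fp = fingerprint(chunk)
            if fp in self.store:
                messages.append(("ref", fp))     # send the reference only
            else:
                self.store[fp] = chunk           # learn the pattern
                messages.append(("raw", chunk))  # send the full chunk once
        return messages

    def decode(self, messages) -> bytes:
        """Receiving side: learn raw chunks, expand references."""
        out = bytearray()
        for kind, payload in messages:
            if kind == "raw":
                self.store[fingerprint(payload)] = payload
                out += payload
            else:
                out += self.store[payload]
        return bytes(out)


def wire_bytes(messages) -> int:
    """Payload bytes that would cross the WAN (message framing ignored)."""
    return sum(len(payload) for _, payload in messages)


# A 1,024-byte "file" made of 16 distinct 64-byte chunks.
data = b"".join(hashlib.sha256(bytes([i])).digest() for i in range(32))

sender, receiver = Appliance(), Appliance()

first_pass = sender.encode(data)
assert receiver.decode(first_pass) == data

second_pass = sender.encode(data)
assert receiver.decode(second_pass) == data

first_cost = wire_bytes(first_pass)    # 1024: every chunk sent in full
second_cost = wire_bytes(second_pass)  # 128: sixteen 8-byte references
```

On the first pass both stores are empty, so the whole file crosses the link. On the second pass the sender finds every chunk in its store and sends only references, which the receiver expands from its own copy of the patterns.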

However, the way that WAN deduplication is implemented affects how effective the solution will be. One implementation option is the layer of the OSI stack at which deduplication is applied. For example, a WAN deduplication solution that works at the TCP layer (often referred to as Layer 4 or the transport layer) optimizes only TCP traffic. Applications that use UDP, proprietary protocols, or encapsulated protocols (e.g., GRE, or IPv6 tunneled inside IPv4) can be optimized only by a deduplication solution that works at the IP layer.
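The layer distinction can be expressed as a simple eligibility check on the protocol field of the IPv4 header. The protocol numbers below are the standard IANA assignments; the `can_optimize` function and its `dedup_layer` argument are hypothetical names used only for this sketch.

```python
# IANA-assigned IPv4 protocol numbers for the traffic types mentioned above.
IP_PROTOCOL = {
    "TCP": 6,
    "UDP": 17,
    "IPv6-in-IPv4": 41,
    "GRE": 47,
}


def can_optimize(dedup_layer: str, ip_protocol: int) -> bool:
    """Return True if a solution working at `dedup_layer` sees this traffic.

    `dedup_layer` is a hypothetical setting: "tcp" for a Layer 4 solution,
    "ip" for a solution that works on IP packets directly.
    """
    if dedup_layer == "ip":
        return True  # every IP packet is a candidate, whatever it carries
    if dedup_layer == "tcp":
        return ip_protocol == IP_PROTOCOL["TCP"]
    raise ValueError(f"unknown dedup layer: {dedup_layer!r}")


tcp_layer_coverage = {
    name: can_optimize("tcp", number) for name, number in IP_PROTOCOL.items()
}
ip_layer_coverage = {
    name: can_optimize("ip", number) for name, number in IP_PROTOCOL.items()
}
```

Here `tcp_layer_coverage` is true only for the TCP entry, while `ip_layer_coverage` is true for all four.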

Another option is whether the solution is disk-based or RAM-based. Disk-based systems typically can store as much as 1,000 times the volume of patterns in their dictionaries as RAM-based systems, and those dictionaries can persist across power failures. The data is slower to access than in a typical RAM-based implementation, but the performance gains that come from the much larger dictionary of a disk-based system are likely to more than compensate for this extra delay. While disks are more cost-effective than RAM on a per-byte basis, given the size of these systems they do add to the overall cost and introduce additional points of failure to a solution. Standard techniques such as RAID can mitigate the risk associated with these points of failure.

Yet another option is the approach to deduplication itself: some optimization solutions are token-based, while others are instruction-based. A token-based approach uses a token to represent each chunk of data. In contrast, an instruction-based approach uses specific start-stop instructions to indicate where duplicate data can be found and retrieved.
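One way to picture the two approaches is to look at what the receiving appliance has to do with each kind of message. The sketch below is an interpretation, not a vendor specification: the message formats, the `token_table`, and the shared `history` buffer are assumptions made for illustration, and the idea that start-stop instructions can name arbitrary byte ranges is likewise an assumption.

```python
def decode_tokens(messages, token_table):
    """Token-based: each ("tok", token_id) stands for one whole stored chunk."""
    out = bytearray()
    for kind, value in messages:
        if kind == "tok":
            out += token_table[value]
        else:  # "raw": literal bytes sent in full
            out += value
    return bytes(out)


def decode_instructions(messages, history):
    """Instruction-based: each ("copy", (start, stop)) names a byte range
    in a history buffer that both appliances already hold."""
    out = bytearray()
    for kind, value in messages:
        if kind == "copy":
            start, stop = value
            out += history[start:stop]
        else:  # "raw": literal bytes sent in full
            out += value
    return bytes(out)


# Data both appliances have seen before (hypothetical).
history = b"The quick brown fox jumps over the lazy dog"
token_table = {0: b"The quick ", 1: b"brown fox "}

target = b"The quick red fox"

# Token-based: only the first chunk matches a stored token exactly.
token_messages = [("tok", 0), ("raw", b"red fox")]

# Instruction-based: two ranges of the history are reused around a literal.
instruction_messages = [("copy", (0, 10)), ("raw", b"red"), ("copy", (15, 19))]

from_tokens = decode_tokens(token_messages, token_table)
from_instructions = decode_instructions(instruction_messages, history)
```

Both decoders rebuild the same `target`. In this toy case the token scheme must send `red fox` literally because no stored chunk matches it, whereas the instruction scheme can point at the existing ` fox` bytes in the history.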

The difference between these two approaches becomes most apparent when supporting applications such as AutoCAD and Microsoft Excel that have dynamic data. These applications scramble the way that data is stored when a file is opened and saved. As a result, even a relatively minor edit can produce small changes throughout the file. While optimizing applications that have dynamic data presents a challenge to both instruction-based and token-based solutions, some recent test results indicate that dynamic data degrades the performance of instruction-based solutions less than it degrades the performance of token-based solutions. For more information on which optimization techniques are the most effective at ensuring the performance of key traffic categories, see this brief on Webtorials.
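A toy experiment suggests why small changes scattered through a file might hurt one approach more than the other. It does not reproduce the test results mentioned above and does not model any real product. It assumes a token scheme with fixed 16-byte chunks that must match exactly, and it uses Python's `difflib.SequenceMatcher` as a stand-in for an instruction-style matcher that can reuse arbitrary byte ranges.

```python
import difflib

CHUNK = 16  # hypothetical fixed token chunk size, in bytes


def token_hit_bytes(old: bytes, new: bytes) -> int:
    """Bytes of `new` covered by whole chunks that also appear in `old`."""
    known = {old[i:i + CHUNK] for i in range(0, len(old), CHUNK)}
    hits = 0
    for i in range(0, len(new), CHUNK):
        chunk = new[i:i + CHUNK]
        if chunk in known:
            hits += len(chunk)
    return hits


def instruction_hit_bytes(old: bytes, new: bytes) -> int:
    """Bytes of `new` that can be copied from matching ranges of `old`."""
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return sum(block.size for block in matcher.get_matching_blocks())


# The previously transferred version of a 256-byte "file".
old_version = bytes(range(256))

# The re-saved version: one byte altered in every 16-byte stretch,
# mimicking small changes scattered throughout the file.
edited = bytearray(old_version)
for i in range(0, len(edited), CHUNK):
    edited[i] ^= 0xFF
new_version = bytes(edited)

token_hits = token_hit_bytes(old_version, new_version)
instruction_hits = instruction_hit_bytes(old_version, new_version)
```

Under these assumptions every fixed chunk contains a changed byte, so `token_hits` is 0, while the range matcher still reuses the 15 unchanged bytes in each stretch, for 240 of 256 bytes. Real token-based products may use variable or content-defined chunk boundaries, which would narrow this gap.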

The next newsletter will deal with the myth that the application delivery market is consolidating. In the meantime, we would like to hear from you. What has been your experience when dealing with the suppliers of application performance management and WAN acceleration solutions? Have they been straightforward with their answers, or more elusive?

Editor's note: Check out Network World's Buyer's Guide to compare Application Acceleration and WAN Traffic Optimization products.

Steve Taylor is president of Distributed Networking Associates and publisher/editor-in-chief of Webtorials. Jim Metzler is vice president of Ashton, Metzler & Associates.
