Data storage gains relief from deduplication technology

With storage space so tight, a new data-pruning technique has become all the rage.

Battling armies of cloned files that bog down enterprise storage operations, new data-deduplication techniques rid systems of extraneous versions of the same information, a powerful promise that is causing a stir among enterprise IT buyers.

Deployed by backup clients or software agents onto servers, desktops or laptops, data-deduplication features use algorithms or object-oriented processes to home in on redundant data segments.
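How products home in on redundant segments varies by vendor, and no single method is spelled out here. As a rough illustration only, the Python sketch below assumes the simplest approach: cut each backup stream into fixed-size chunks, fingerprint every chunk with SHA-256, and keep one copy per fingerprint. The names `deduplicate`, `restore` and `CHUNK_SIZE` are hypothetical and are not taken from any product mentioned in this story.

```python
import hashlib

CHUNK_SIZE = 4096  # assumed segment size in bytes; real products pick their own


def deduplicate(streams, chunk_size=CHUNK_SIZE):
    """Keep one copy of each distinct chunk across all streams.

    Returns (store, recipes): store maps a fingerprint to the chunk bytes,
    and recipes holds, per stream, the ordered fingerprints needed to
    rebuild that stream.
    """
    store = {}
    recipes = []
    for data in streams:
        recipe = []
        for offset in range(0, len(data), chunk_size):
            chunk = data[offset:offset + chunk_size]
            digest = hashlib.sha256(chunk).hexdigest()
            store.setdefault(digest, chunk)  # first copy wins, repeats are skipped
            recipe.append(digest)
        recipes.append(recipe)
    return store, recipes


def restore(store, recipe):
    """Rebuild one stream from its recipe of fingerprints."""
    return b"".join(store[digest] for digest in recipe)


if __name__ == "__main__":
    # Two made-up nightly backups that share their first 8 KB.
    monday = b"A" * 8192 + b"B" * 4096
    tuesday = b"A" * 8192 + b"C" * 4096
    store, recipes = deduplicate([monday, tuesday])
    raw_bytes = len(monday) + len(tuesday)
    kept_bytes = sum(len(chunk) for chunk in store.values())
    print("raw bytes:", raw_bytes, "stored bytes:", kept_bytes)
    print("restores cleanly:", restore(store, recipes[1]) == tuesday)
```

In this toy run the two backups reference six chunks in total, but only three distinct chunks need to be stored. Fixed-size chunking is chosen here for brevity; it is an assumption, not a description of how any particular vendor segments data.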

Buzz Box

Data deduplication and you

Before you decide if this storage management technology is right for your enterprise …


Is it important to deduplicate data immediately or could the process be completed at a later time?

Does my organization want to pursue a disk-to-disk strategy or other means of augmenting tape?

Do I have back-end capacity constraints or am I lacking enough disk to house all of my data?


Where does deduplication take place: at the client before sending the data, at the disk device after the data is sent or on the virtual tape library as a process?

Can your solution handle all of my backup streams?

Does your system support standards such as the IETF’s Session Initiation Protocol?

If deduplication will be performed at the client side, can your product ingest data fast enough to keep up with high-transaction systems?

What methods are used to identify duplicate information and how does the product guarantee it will not falsely label the data?

Source: David Russell, Gartner vice president
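The last question in the list, on falsely labeled data, comes down to what happens when two different segments yield the same fingerprint. One possible safeguard, sketched here in Python as an assumption and not as any vendor's documented method, is to confirm every fingerprint match byte for byte before treating a segment as a duplicate. `VerifyingStore`, `sha256_hex` and the `fingerprint` parameter are hypothetical names.

```python
import hashlib


def sha256_hex(chunk: bytes) -> str:
    """Default fingerprint: SHA-256 of the chunk, as a hex string."""
    return hashlib.sha256(chunk).hexdigest()


class VerifyingStore:
    """Toy chunk store that never trusts a fingerprint match on its own."""

    def __init__(self, fingerprint=sha256_hex):
        self._fingerprint = fingerprint
        self._buckets = {}  # fingerprint -> list of distinct chunks sharing it

    def put(self, chunk: bytes):
        """Store a chunk; return (reference, was_duplicate)."""
        key = self._fingerprint(chunk)
        bucket = self._buckets.setdefault(key, [])
        for index, existing in enumerate(bucket):
            if existing == chunk:  # byte-for-byte confirmation
                return (key, index), True
        # Either a new fingerprint or a collision: keep the data, never drop it.
        bucket.append(chunk)
        return (key, len(bucket) - 1), False

    def get(self, ref):
        """Fetch the exact bytes behind a reference returned by put()."""
        key, index = ref
        return self._buckets[key][index]
```

Passing a deliberately weak fingerprint, such as `VerifyingStore(fingerprint=len)`, forces collisions and shows that different chunks of equal length are still stored separately. The trade-off is an extra comparison on every match, which is one reason to ask vendors how they handle this case.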

Repeated copies of data, which can bloat total storage volume to 10 to 20 times what is necessary, are stomped out, freeing up gobs of storage space.

Data deduplication's staying power seems virtually guaranteed, because storage space is at an all-time premium. Lawyers, government regulators and corporate leaders are breathing down the necks of IT managers, who are loath to scrap any information for fear the move will haunt them in a future lawsuit or audit.

Shrinking the size of the stored data volumes seems to be one of the few options left.

"There really is a lot of buzz around deduplication. At the same time, it is a technology that is here to stay, because the benefits are so powerful," says David Russell, Gartner vice president, storage and strategies.

Data deduping for SOX

Data deduplication proved more than a buzzword for Vaalco Energy, a Houston company that harvests and processes crude oil and natural gas. "The technology satisfied a real-world need for us," says Dereck Stubbs, Vaalco IT specialist.

Vaalco's very real need for data deduplication centered on a Sarbanes-Oxley Act (SOX) financial audit that came barreling at the company last year. Vaalco had to prove quickly that its backup and recovery procedures met the statute's stringent requirements. It turned to Asigra, which packs data-deduplication functionality into its Televaulting software.

"We needed a solution in days, since the audit was going to come up in a couple of weeks," recalls Robert Walston, Vaalco IT and purchasing supervisor. "The e-mail requirements were especially tricky, since we had retention requirements to hold e-mail 'X' amount of years. There is some duplication when you are going back that far," he says.

Although Vaalco initially hit on deduplication in its scramble to satisfy e-mail retention mandates tied to SOX, the company quickly found greater benefits. It reduced data volumes to the point where formal off-site storage became unnecessary, and that gave the company peace of mind, Stubbs says.

Deduplication's role in enterprise efforts to avoid tape backup and off-site storage has many companies interested in the technology, says Heidi Biggar, an analyst at Enterprise Strategy Group. "If you free up more storage capacity, you could choose to keep data in-line. It is a powerful technology," she says.

Data dedupe cracks the case

The power of deduplication for law firm Winthrop & Weinstine was in the new storage avenues the technology afforded. "By reducing backup data set volume as much as 20 times, deduplication makes disk-based backup cost effective and [opens] an entirely new set of options," says Craig Wilson, IS manager.
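Wilson's "as much as 20 times" figure can be turned into a space saving with simple arithmetic: a reduction of N times leaves 1/N of the data on disk. The short Python sketch below shows that conversion. The function names and the 10 TB input are illustrative assumptions, not figures from the firm.

```python
def fraction_saved(reduction_factor: float) -> float:
    """Share of space freed when data shrinks by the given factor."""
    if reduction_factor < 1:
        raise ValueError("reduction factor must be at least 1")
    return 1.0 - 1.0 / reduction_factor


def stored_size(logical_size: float, reduction_factor: float) -> float:
    """Size left on disk after reduction, in the same unit as logical_size."""
    if reduction_factor < 1:
        raise ValueError("reduction factor must be at least 1")
    return logical_size / reduction_factor


if __name__ == "__main__":
    hypothetical_backup_tb = 10.0  # made-up backup set size, for illustration
    for factor in (10, 20):
        saved = fraction_saved(factor)
        kept = stored_size(hypothetical_backup_tb, factor)
        print(f"{factor}x reduction: {saved:.0%} saved, {kept:.1f} TB kept")
```

A 20-fold reduction works out to a 95 percent saving, while a 10-fold reduction saves 90 percent, so the second doubling of the ratio frees only five more percentage points of space.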

Using backup and recovery appliances from Data Domain, the Minneapolis law firm replicated its deduplicated backup data to remote sites, something that full-size backup sets had made impractical. "Backup data, by its sheer size, is immobile. It can't be sent via secure WAN to remote sites for disaster-recovery purposes," Wilson says. Beyond the disaster-recovery improvements, the Data Domain deduplication features produced other savings. For example, the firm reduced costs and liabilities associated with third-party handling of backup tapes.

Data deduplication also solves many remote-office storage problems, says Curtis Damhof, a network manager at St. Peter's Hospital in Albany, N.Y., which uses the data-deduplication features in Avamar Technologies' Axion software.

"We currently back up our remote sites to our main office due to the efficiency provided by the deduplication technology. Another place we have been looking at using the product is in the backup of all our desktops and mobile users," Damhof says.

Other vendors offering data-deduplication features in their product sets include Diligent Technologies, Exagrid Systems, FalconStor Software and Sepaton. Larger vendors such as Network Appliance and Symantec also are jumping into the mix, proving that deduplication has won a place in the storage market, Gartner's Russell adds. Pricing varies by vendor. For example, the Avamar software costs about $9,000 per terabyte, and Data Domain's appliances and gateways are priced from $19,000 to $105,000, he says.

"Interest in data deduplication has really heated up this year, especially over the summer. There has been a bit of an educational process underway, but the technology is really reaching critical mass," Russell says. "It is no longer something on the fringe, since there are enough deployments for enterprise users to now have a higher level of confidence."

McAdams is a freelance writer in Vienna, Va.


Copyright © 2006 IDG Communications, Inc.