Saving space with de-duplication
The newest alternative for saving space
Storage Alert
By Mike Karp, Network World, 11/02/2006
Storage analyst Deni Connor focuses on storage, application and infrastructure management in this twice-weekly newsletter.
With the steadily increasing amounts of data that reside on enterprise systems, it’s little wonder that we look for any help
we can get to tame the data beast.
For years now, when moving old data off to tape, or when transmitting data across a network, we have been able to take advantage
of various compression techniques that reduce the number of bits required to save or move the data. Also, for almost 20 years
now we have been able to compress files with tools such as ZIP and ARC.
Data compression of course has at least one drawback - the compression and decompression necessary as a front end to a read
or write process impose a performance penalty on the system, so each decision to do compression becomes something of a tradeoff
between performance and space savings.
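To make that tradeoff concrete, here is a minimal sketch using Python's standard zlib module. The sample data and the helper name compress_stats are illustrative assumptions, not anything taken from a storage product. Real ratios and timings depend heavily on the data and the hardware.

```python
import time
import zlib


def compress_stats(data: bytes, level: int):
    """Return (compressed size in bytes, seconds spent compressing)."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    return len(packed), elapsed


# Hypothetical, highly repetitive sample data (an assumption for illustration).
sample = b"quarterly sales report, region north, status ok\n" * 20000

for level in (1, 6, 9):  # 1 = fastest, 9 = smallest output
    size, seconds = compress_stats(sample, level)
    print(f"level {level}: {len(sample)} -> {size} bytes in {seconds:.4f} s")
```

Higher levels generally spend more CPU time to save a little more space, which is the performance-versus-space decision described above.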
The newest alternative for saving space is data de-duplication, which takes a wholly different approach from compression.
De-duplication reduces storage by ensuring that only a single instance of each file (or, with some implementations, a file
block) exists. Redundant files and blocks are never written to disk, but are replaced by reference pointers, allowing blocks
- or even whole files - to be represented by just a few bytes of information.
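The block-level idea can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any vendor named in this article implements it: the class name DedupStore, the fixed block size and the use of SHA-256 digests as reference pointers are all hypothetical choices.

```python
import hashlib


class DedupStore:
    """Toy in-memory store that keeps one copy of each distinct block."""

    def __init__(self, block_size=4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes, each stored once
        self.files = {}   # file name -> list of digests (the reference pointers)

    def write(self, name, data):
        refs = []
        for i in range(0, len(data), self.block_size):
            block = data[i:i + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            # A block that is already present is not stored again.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.files[name] = refs

    def read(self, name):
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        return sum(len(b) for b in self.blocks.values())


store = DedupStore(block_size=4)  # tiny blocks, only to keep the example readable
store.write("a", b"AAAABBBBAAAA")
store.write("b", b"AAAABBBBCCCC")
print(store.stored_bytes())  # 12: only AAAA, BBBB and CCCC are kept
```

The two files hold 24 bytes between them, but only three distinct 4-byte blocks are kept. Each file is reduced to a list of pointers, and reads reassemble the original bytes from those pointers.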
De-duplication technology is sold as a part of several companies’ product sets, often appearing as an important feature in
a virtual tape library (VTL) or continuous data protection (CDP) solution. I am unaware of anyone offering a stand-alone data
de-duplication product, although there is no reason why that couldn’t be done.
Smaller companies have taken the lead when it comes to de-duping. Asigra, Diligent and Permabit, for example, have de-duplication
as a part of their VTL or CDP offerings, and Mimosa provides it with its NearPoint e-mail archiving product. (Diligent CEO
Doron Kempel discusses de-duplication on Network World's Hot Seat.)
The value of de-duplication is easy to measure. Effectively reducing the total amount of data you have to manage means that,
for a while at least, you will be able to defer some of the knee-jerk storage purchasing that many of you seem to do.
If you plan on your site’s data growing at 25% per quarter and are presently buying accordingly, that probably means the rest
of your hardware storage purchases can be deferred for at least two quarters. If Diligent can, as it claims, reduce the
total data a company must manage by as much as 75%, you might extend this to three quarters. This can be a significant saving
indeed - perhaps with very short-term payback - and leaves you with more easily managed storage over the long run.
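The article does not spell out the arithmetic behind these figures. One reading that reproduces them is a simple linear model: the freed capacity is refilled at a fixed 25% of the original data volume each quarter. The sketch below works that out, along with a compounding variant for comparison. Both function names, and the 50% reduction used for the two-quarter case, are assumptions made for illustration.

```python
import math


def quarters_deferred_linear(reduction, growth_per_quarter):
    """Quarters until the freed capacity is refilled, assuming data grows each
    quarter by a fixed fraction of the ORIGINAL (pre-reduction) volume."""
    return reduction / growth_per_quarter


def quarters_deferred_compound(reduction, growth_per_quarter):
    """Quarters until the reduced data set grows back to its original size,
    assuming growth compounds on the reduced volume."""
    return math.log(1 / (1 - reduction)) / math.log(1 + growth_per_quarter)


print(quarters_deferred_linear(0.75, 0.25))              # 3.0
print(quarters_deferred_linear(0.50, 0.25))              # 2.0
print(round(quarters_deferred_compound(0.75, 0.25), 1))  # 6.2
```

Under the linear reading, a 75% reduction matches the three quarters quoted above, and a 50% reduction would give two. If growth instead compounds on the smaller data set, the deferral is longer, so the result depends on which growth model fits your site.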
Deni Connor is principal analyst for Storage Strategies NOW.