Recoup with data dedupe

Eight products that cut storage costs through data deduplication

Backing up servers and workstations to tape can be a cumbersome process, and restoring data from tape even more so. While backing up to disk-based storage is faster and easier, and probably more reliable, it can also be more expensive.

One way to get the best of both worlds is to back up to disk-based storage that uses deduplication, which increases efficiency by storing only one copy of each unique piece of data.

While the process was originally used at the file level, many products now work at the block or sub-block (chunk) level, which means that even files that are mostly the same can be deduplicated, saving the space consumed by the parts that are the same.

For instance, say someone opens a document and makes a few changes, then sends the new version to a dozen people. With file-level deduplication, the old and new versions count as different files, so both are stored in full, though only one copy of the new version is kept rather than a dozen. With block-level or sub-block-level deduplication, only the first document and the changes between it and the second are stored.
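The difference can be made concrete with a rough sketch. It is not taken from any of the products reviewed: it assumes fixed-size blocks fingerprinted with SHA-256, and the block size, function names and sample data are all hypothetical. The edit overwrites bytes in place, because an insertion that shifted the rest of the file would change every later fixed-size block.

```python
import hashlib

BLOCK_SIZE = 64  # hypothetical block size in bytes, kept tiny for illustration


def file_level_stored_bytes(files):
    """Bytes kept when only whole identical files are stored once."""
    unique = {}
    for data in files:
        unique.setdefault(hashlib.sha256(data).hexdigest(), len(data))
    return sum(unique.values())


def block_level_stored_bytes(files, block_size=BLOCK_SIZE):
    """Bytes kept when each identical fixed-size block is stored once."""
    unique = {}
    for data in files:
        for start in range(0, len(data), block_size):
            block = data[start:start + block_size]
            unique.setdefault(hashlib.sha256(block).hexdigest(), len(block))
    return sum(unique.values())


# Made-up "document": 256 four-byte counters, so every 64-byte block is distinct.
original = b"".join(i.to_bytes(4, "big") for i in range(256))  # 1024 bytes

# The edited version overwrites 10 bytes in place, touching a single block.
buffer = bytearray(original)
buffer[100:110] = b"X" * 10
edited = bytes(buffer)

# The backup sees the original plus a dozen copies of the edited version.
backup_set = [original] + [edited] * 12

raw_bytes = sum(len(f) for f in backup_set)
file_level = file_level_stored_bytes(backup_set)
block_level = block_level_stored_bytes(backup_set)

print(raw_bytes)    # 13312: no deduplication
print(file_level)   # 2048: both versions stored in full, once each
print(block_level)  # 1088: 16 original blocks plus the 1 changed block
```

In this toy case, file-level deduplication removes the eleven redundant copies but still stores both versions whole, while block-level deduplication stores only one extra block for the second version.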

The down and dirty of data deduplication

There is some debate about the optimum granularity: deduplicating whole files is not very efficient, blocks are more so, and chunks more so still. However, the smaller the chunks, the more processing it takes and the larger the indices that keep track of duplicates become. Some systems use variable-size chunks to tune this trade-off, depending on the type of data being stored.
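One common way to get variable-size chunks is to let the content itself decide where chunks end. The sketch below is a simplified illustration and not the algorithm of any particular product: it assumes a Gear-style rolling hash, and the lookup table, boundary mask, function names and test data are all hypothetical. Real systems typically also enforce minimum and maximum chunk sizes, which are left out here for brevity.

```python
import hashlib

# Hypothetical lookup table: one pseudo-random 64-bit value per byte value.
GEAR = [int.from_bytes(hashlib.sha256(bytes([i])).digest()[:8], "big")
        for i in range(256)]
MASK_64 = 0xFFFFFFFFFFFFFFFF
BOUNDARY_MASK = (1 << 6) - 1  # cut when the low 6 hash bits are zero (~64-byte average)


def content_defined_chunks(data):
    """Cut a chunk wherever a rolling hash of the most recent bytes matches a pattern."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        h = ((h << 1) + GEAR[byte]) & MASK_64
        if (h & BOUNDARY_MASK) == 0:
            chunks.append(data[start:i + 1])
            start = i + 1
    if start < len(data):
        chunks.append(data[start:])  # trailing bytes after the last boundary
    return chunks


def fixed_size_chunks(data, size=64):
    """Cut a chunk every `size` bytes regardless of content."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def shared_chunks(chunker, old, new):
    """Count chunks of `new` whose fingerprint is already in the index built from `old`."""
    index = {hashlib.sha256(c).digest() for c in chunker(old)}
    return sum(hashlib.sha256(c).digest() in index for c in chunker(new))


# Made-up, non-repeating test data (4096 bytes), then one byte inserted at the front.
original = b"".join(hashlib.sha256(i.to_bytes(4, "big")).digest() for i in range(128))
edited = b"!" + original

shared_fixed = shared_chunks(fixed_size_chunks, original, edited)
shared_variable = shared_chunks(content_defined_chunks, original, edited)
index_entries = len(content_defined_chunks(original))

print(shared_fixed, shared_variable, index_entries)
```

Because the inserted byte shifts every fixed-size block, none of them match the original, whereas the content-defined boundaries fall in the same places relative to the data, so most chunks are recognized as duplicates. Lowering the number of bits in `BOUNDARY_MASK` makes chunks smaller and the index longer, which is the trade-off described above.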
