The correct levels of backup save time, bandwidth, space

Efficient backup and recovery requires carefully planned backup levels. When incremental and differential backups are properly planned, full backups are needed far less often.

One of the most basic things to understand in backup and recovery is the concept of backup levels and what they mean.

Without a proper understanding of what they are and how they work, companies can adopt bad practices whose consequences range from wasted bandwidth and storage to backups that miss important data. Understanding these concepts is also crucial when selecting new data-protection products or services.

Full backup

A full backup contains all data in the entire system. A full backup of the C:\ drive in Windows contains every file on the C:\ drive. A full backup of a Windows system should contain a copy of every file on every drive on the machine or VM (e.g. C:\, D:\, F:\, etc.). The same goes for a full backup of a UNIX or Linux machine: it contains every file on every file system on the machine (e.g. /, /home, /opt, etc.).

The only files that should be missing from a full backup are those specifically excluded by the configuration. For example, many system administrators choose to exclude directories that will have no value during a restore (e.g. /boot or /dev) or that contain transient files (e.g. C:\Windows\TEMP in Windows, or /tmp in Linux).

There are two philosophies about which files should be included in or excluded from a backup: back up everything and exclude what you know you don't need, or select only what you want to back up. The former is the safer option; the latter will save some space on your backup system. Some people see it as a waste to back up application files, such as the directory into which you have loaded Oracle or SQL Server. They believe they would simply reload the application during a restore. The risk of this approach is that someone will place valuable data in a directory that is not selected for backup. For example, if you select only /home1 or D:\Data to be backed up, how will the backup system know if someone adds /home2 or E:\Data? This is why it is much safer to back up everything and exclude only the files that you know you don't need, even if it does take up some additional space. An exception might be a strongly controlled environment where all data is always loaded in the same place and you have a well-orchestrated solution for replacing the operating system and applications in a restore.
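The difference between the two philosophies can be sketched in a few lines of Python. This is only an illustration, not how any particular backup product works: the exclusion patterns, the sample paths, and the `select_for_backup` helper are all hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical exclusion list: transient or restore-irrelevant locations.
EXCLUDES = ["/tmp/*", "/dev/*", "/boot/*"]

def select_for_backup(paths, excludes=EXCLUDES):
    """Back up everything except paths matching an exclusion pattern."""
    return [
        p for p in paths
        if not any(fnmatchcase(p, pattern) for pattern in excludes)
    ]

# Sample paths (made up). /home2 was added after the policy was written.
paths = [
    "/home1/alice/report.txt",
    "/home2/bob/notes.txt",
    "/tmp/scratch.dat",
    "/dev/null",
]
selected = select_for_backup(paths)
print(selected)
```

Because the policy lists only what to leave out, the new /home2 data is picked up without anyone touching the configuration. An include-only policy that named just /home1 would have silently skipped it.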

Incremental backup

An incremental backup typically backs up all data that has changed since the last backup of any kind. Historically, such backups were file-based backups, meaning that they backed up all files that had changed since the last backup. The challenge with this from a modern data protection standpoint is that we are attempting in every way to minimize the I/O impact of backups on the server (especially when backing up VMs), and backing up a 10 GB file because 1 MB has changed isn't very efficient.
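A file-based incremental selection can be sketched as follows. The sketch assumes that a file's modification time is a good enough signal that it changed; real products may rely on other mechanisms, and the `files_changed_since` helper is a hypothetical name used only for illustration.

```python
import os

def files_changed_since(root, last_backup_time):
    """Return files under root modified after last_backup_time.

    last_backup_time is in seconds since the epoch, the same unit
    that os.path.getmtime() returns.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_time:
                changed.append(path)
    return sorted(changed)
```

Note that the unit of selection is the whole file: a 10 GB file with 1 MB of changes is returned in full, which is exactly the inefficiency described above.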

This is why many vendors have switched to block-based incremental backups, which back up only the blocks that have changed. This is most commonly done when backup software backs up VMware or Hyper-V through their APIs. The backup application notifies the appropriate API that it is doing a block-based incremental, after which it is given a list of blocks to back up.
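To make the block-level idea concrete, here is a minimal sketch that finds changed blocks by comparing per-block hashes of two versions of the same data. This is a stand-in for illustration only: it does not call any hypervisor API, and the tiny block size and function names are assumptions.

```python
import hashlib

BLOCK_SIZE = 4  # Unrealistically small, chosen so the example is readable.

def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash each fixed-size block of a bytes object."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]

def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indexes of blocks in new that differ from, or do not exist in, old."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        i for i, digest in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != digest
    ]

previous = b"AAAABBBBCCCCDDDD"
current = b"AAAABxBBCCCCDDDD"  # One byte changed inside the second block.
changed = changed_blocks(previous, current)
print(changed)
```

Only the second block (index 1) needs to be copied, rather than the whole object. A backup product working through a hypervisor API would receive the changed-block list from that API instead of computing it by rescanning the data.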

Differential backup

Although the term has meant a few different things over the years, it is now widely accepted that a differential backup backs up all data that has changed since the last full backup. This type of backup was much more in vogue in the days of tape, as it minimized the number of tapes required for a restore. A restore needed the latest full, followed by the latest differential, followed by any incrementals taken since that differential.
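The restore order can be worked out mechanically from a backup history. The sketch below assumes the definitions used in this article: a differential captures changes since the last full, and each incremental captures changes since the previous backup of any kind, so every incremental taken after the latest differential is needed. The `restore_chain` function and the sample labels are hypothetical.

```python
def restore_chain(backups):
    """Return the backups needed for a restore, in the order to apply them.

    backups is a chronological list of (label, level) tuples, where level
    is "full", "differential", or "incremental".
    """
    full_indexes = [i for i, (_, level) in enumerate(backups) if level == "full"]
    if not full_indexes:
        raise ValueError("no full backup available to restore from")

    last_full = full_indexes[-1]
    chain = [backups[last_full]]
    start = last_full + 1

    # The latest differential supersedes everything between it and the full.
    diff_indexes = [
        i for i in range(start, len(backups))
        if backups[i][1] == "differential"
    ]
    if diff_indexes:
        chain.append(backups[diff_indexes[-1]])
        start = diff_indexes[-1] + 1

    # Every incremental after that point is still required.
    chain.extend(b for b in backups[start:] if b[1] == "incremental")
    return chain

history = [
    ("week1-sun", "full"),
    ("week1-mon", "incremental"),
    ("week1-tue", "incremental"),
    ("week2-sun", "differential"),
    ("week2-mon", "incremental"),
    ("week2-tue", "incremental"),
]
chain = restore_chain(history)
print(chain)
```

In this history the two week-one incrementals are skipped, because the week-two differential already contains everything that changed since the full.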

If you are still doing tape-based backups, consider this: move from weekly fulls to a monthly full, weekly differential, and daily incremental. A restore will need to load one more backup than it would have needed to load under a weekly
