Best backup for 7 major databases

There are many backup options for databases, but here are recommendations for Cassandra, DB2, DynamoDB, MongoDB, MySQL, Oracle, and PostgreSQL.

networking abstract

There are many options for backing up databases, and what’s best varies from database to database and how it’s delivered. Here are recommendations for seven of them, with a glimpse at how the options were chosen to help inform your decision making.

Oracle

Oracle has many options for backup, but the official answer for backing Oracle would be Recovery Manager, or RMAN, which is also the name of the actual command that invokes it. Among many options, RMAN supports an image option that can merge older incremental backups into full backups, which would give you multiple recovery points without having to make multiple full backups. That’s an efficient dump and sweep option, but challenge is you need enough disk space to store a full backup and a series of incrementals. If you’re short on disk space, you can also use the SQL command alter database begin backup before you back up and alter database end backup when you’re done. This will allow you to use whatever backup method you choose. Oracle on Windows also integrates with Volume Shadow Copy Services (VSS), allowing you to perform hot backups without having to script at all. The RMAN image option with a dump and sweep approach offers the best recovery speeds.

SQL Server

The backup database command can automate full or incremental backups of the database or its transaction logs to disk (for a dump and sweep), to Azure (for cloud backup), or to stream them directly to a third-party backup tool.  Like Oracle, SQL Server can also be backed up hot using VSS on Windows. The VSS method is more easily integrated into VM backups and doesn’t require a disk staging area. The method that most DBAs seem to prefer is the dump and sweep approach.

DB2

The backup database command can be used to perform full or incremental backups of all or part of a DB2 database and its transaction logs to disk for a dump and sweep or stream them directly to a third-party backup tool. The command also supports a snapshot flag that integrates with storage devices, and can be used to create an application-consistent snapshot of a database. Like other snapshots, you should replicate that snapshot somewhere else once it is created. It provides great recovery speed with little upfront effort.

MySQL

The first thing to do with MySQL is to stop using MyISAM tables, and get all data that matters onto a more modern table type, such as InnoDB. Customers of MySQL Enterprise Edition can then use the Enterprise Backup feature that allows them to place the database in a hot backup mode before backing up its datafiles (like Oracle). The mysqldump command is more universal and is available to all customers, but it is quite slow for large databases. Regardless, it’s the best way to go if you don’t have access to the Enterprise Backup feature. 

PostgreSQL

The usual method with PostgreSQL is the pg_dump command that creates a full SQL dump to disk for a dump and sweep setup, but it can only recover to when you took the backup. If you’d like a tighter recovery point objective (RPO) than your pg_dump frequency, you should enable the continuous write-ahead log (WAL), then put the database into and out of hot backup mode using pg_start_backup and pg_stop_backup before and after a filesystem backup. During a recovery, you start by restoring the filesystem, then you use the WAL files to replay transactions that occurred after the backup. This option is both efficient and mirrors the tight RPO recovery options of more expensive databases.

MongoDB

If you are using MongoDB Atlas (the PaaS version of the product), the best backup method would be continuous cloud backup with point-in-time recovery. You specify how far back you want to be able to recover, and Atlas automatically performs the backups it needs to be able to restore your cluster to any point in time within the window you specified. If you’re hosting your own MongoDB installation, you can use MongoDB Cloud Manager (SaaS) or Ops Manager (on-prem software). Cloud Manager supports various SLAs at different price points and seems the obvious choice if you are not using Atlas. If you’re running MongoDB 4.2 or greater, you can also just back up the underlying filesystem while mongod is running, and MongoDB will detect “dirty” keys after a restore and address them. The mongodump option is also available to use in a dump and sweep fashion, but it is only recommended for small deployments, because mongodump cannot guarantee the atomicity of transactions across shards.

Cassandra

Your backup options with Cassandra vary widely based on whether you are using the open-source version, the Enterprise version, or Astra (PaaS).  Remember not to confuse the resilience of Cassandra with backup; resiliency will not protect you if someone accidentally drops or truncates a table. The typical backup tool in Cassandra is called nodetool snapshot, which is a snapshot-like system that creates an entirely separate copy of all the data of all or certain keyspaces or tables using hard links. It can create a full or incremental backup. This copy can then be used in a snap-and-sweep configuration to get this data elsewhere. Customers running any version of Cassandra on a cloud vendor can create a cloud snapshot of their entire cluster at once. This provides much better restore options than what can be done with the nodetool snapshot command. Datastax also has an offering for enterprise users called DSE Backup and Restore Service designed to back up and restore the entire cluster. Datastax Astra customers are automatically backed up every four hours.

Backing up DynamoDB

Since DynamoDB is only offered as an AWS service, the choices for backup are relatively straightforward. AWS provides automated backups that support point-in-time recovery of all or part of your table in DynamoDB. All you have to do is enable point-in-time recovery, and AWS will manage everything for you. AWS also offers on-demand backups that you can control, but they do not offer point-in-time restore.

Remember: No matter what method you choose, test recoveries using your backup method. No one cares if you can back up; they only care if you can restore.

Copyright © 2022 IDG Communications, Inc.

The 10 most powerful companies in enterprise networking 2022