Backing up a database depends on how it’s delivered

On premises servers, PaaS, and serverless delivery of databases each has its own backup best practices.

Credit: Getty Images

How you back up your database depends three factors: how the database is delivered to you, the backup logistics of the database, and the recovery time objective (RTO) and the recovery point objective (RPO) you are attempting to meet. This article covers how the database is delivered.

There are three ways databases can be delivered: as software on a server you own; as platform-as-a-service (PaaS); and as a serverless service. Let’s take a look at them.

Traditional database software

Until a few years ago the way all databases were delivered was by buying a license for a product and installing it on a server or VM of your choice. You were responsible for everything, including the security and administration of the server, the storage, the application itself, and (of course) the backup of the database.

This means you have a variety of choices to make, including how to back it up. Some possibilities are completely invalid because databases behave in a particular way that makes them not so easy to backup using methods designed for unstructured data. The following three concepts apply to almost all traditionally delivered databases.

Moving target

The data in a database is generally stored in datafiles that you can see in the filesystem of the server or VM that hosts it. These files are constantly changing as long as something is making updates to the database, which means you cannot just back them up like any other file. The backup would be worthless.

Point-in-time backups and restores

Most supported database-backup methods create a copy of the database at the point in time when the copy was made, for example each night at 10 p.m. That means it will only be able to restore the database to that point in time.

Rolling forward or backward from a point in time

To be able to meet a tighter RPO, most databases have a log of transactions that can be replayed after a point-in-time restore to move the point in time up to a more recent point in time that you specify. This log can also be used to roll back transactions if the database crashes and comes up in an inconsistent state.

Those three general concepts are behind almost all databases running on servers or VMs you administer, although there are exceptions to every rule. Datafiles are sometimes block devices and not files at all, and sometimes they do not change even if the database is changing. The key to getting the backups of any traditionally delivered database right lies in understanding how the database solves the above three challenges.

The most typical method of backing up a traditionally delivered database is a nightly copy that may be full or incremental, followed by continuous backup of the transaction logs. The dump will allow you to restore the entire database, then the logs will allow you to roll transactions forward to the point that things went wrong.

Platform-as-a-Service

A second way databases are delivered is the platform-as-a-service (PaaS) model where you see only the application and have limited to no access to the infrastructure behind it. Amazon Relational Database Service (RDS) is an example of a PaaS offering, and it can be configured to provide Oracle, MySQL, PostgreSQL, and MariaDB, and Aurora databases. Azure also offers SQL Server, MySQL, PostgreSQL, and others in a PaaS configuration.

Backup options for a PaaS database are usually pretty straightforward. Each PaaS offering provides a mechanism that supports backup and recovery. Some come with backups that run automatically every day and typically create a copy in that vendor’s object storage. Others require you to configure the backups for them to run. Therefore, do not assume your PaaS database is automatically backed up.

In fact, you shouldn’t assume any of your infrastructure is backed up. Investigate each PaaS database you are using and see what backup and recovery options it offers. Most default backup methods for PaaS databases copy your backups into the same account and region the database is running in, so another thing to look into is whether or not you can copy those backups to a different account and region. This is a good idea that will protect you against things like the OVH fire that destroyed two data centers.

Serverless databases

Serverless databases take PaaS one step farther, removing even more administrative requirements from the customer and creating an easier-to-use experience. AWS DynamoDB, Aurora Serverless, and Azure Cosmos DB are all examples of these databases.

With a serverless database, you don’t have to configure anything. You literally just start putting in data. The compute and storage resources, as well as database partitioning decisions, will automatically be decided and provisioned for you. It’s so “magic” that many people assume backups are automatically handled, but that’s not always the case.

Like PaaS databases, the backup methods are dictated by the vendor offering the database, so the best practice is the same: Investigate the best backup methods for the database you are using and deploy them. Make sure to figure out how to copy data to another region and account.

The cloud is not magic, but it certainly has made backing up and recovering databases significantly easier. The ease with which you can create a backup of an entire database—even a partitioned one residing on hundreds of nodes—is much easier than what’s required when you are managing everything in a traditional environment. Just don’t let things be so easy that you start assuming things. That’s always a recipe for backup disaster.

Data Center

Backing up a database depends on how it’s delivered

On premises servers, PaaS, and serverless delivery of databases each has its own backup best practices.

Traditional database software

Moving target

Point-in-time backups and restores

Rolling forward or backward from a point in time

Platform-as-a-Service

Serverless databases

More from this author

How cybersecurity red teams can boost backup protections

Backup lessons learned from 10 major cloud outages

Can your cloud backup provider fail?

Ransomware increases urgency of documented DR plans

Choosing a disaster recovery site

Is it time to change your backup system?

How to lock down backup infrastructure

How to determine RTOs and RPOs for backup and recovery

Show me more

CISA unveils a six-step blueprint for isolating critical infrastructure during cyberattacks

IBM: AI-driven attacks increased 56% last year, and data breach costs are up 12%

Why the future of network security is the convergence of SASE and firewalls

Has the hype around ‘Internet of Things’ paid off? | Ep. 145

Episode 1: Understanding Cisco’s Converged SDN Transport

Episode 2: Pluggable Optics and the Internet for the Future

Master Linux Math with the bc Command | Easy CLI Calculations Explained!

Master Linux Math in Seconds: How to Use the expr Command Like a Pro

How to Do Math in the Command Line Using Double Parentheses

Backing up a database depends on how it’s delivered

Traditional database software

Moving target

Point-in-time backups and restores

Rolling forward or backward from a point in time

Platform-as-a-Service

Serverless databases

From our editors straight to your inbox

More from this author

How cybersecurity red teams can boost backup protections

Backup lessons learned from 10 major cloud outages

Can your cloud backup provider fail?

Ransomware increases urgency of documented DR plans

Choosing a disaster recovery site

Is it time to change your backup system?

How to lock down backup infrastructure

How to determine RTOs and RPOs for backup and recovery

Show me more

CISA unveils a six-step blueprint for isolating critical infrastructure during cyberattacks

IBM: AI-driven attacks increased 56% last year, and data breach costs are up 12%

Why the future of network security is the convergence of SASE and firewalls

Has the hype around ‘Internet of Things’ paid off? | Ep. 145

Episode 1: Understanding Cisco’s Converged SDN Transport

Episode 2: Pluggable Optics and the Internet for the Future

Master Linux Math with the bc Command | Easy CLI Calculations Explained!

Master Linux Math in Seconds: How to Use the expr Command Like a Pro

How to Do Math in the Command Line Using Double Parentheses