VMware VI3 Snapshots and Their Misconceptions

Here I talk about best practices for snapshotting VMs so that you can minimize downtime in your VI3 infrastructure.

Looking at people's VMware implementations, the most common problem I see is VMs left running on snapshots for an extended period of time, and by that I mean months. This can cause downtime for you later on.

I suppose we can place some of the blame for these misconceptions on VMware, since the word "snapshot" means something quite different in other products, where it usually implies something I can always fall back on as a restore mechanism. I find so many admins who think that the snapshot is "a separate copy" of the running VM.

Nothing could be further from the truth. Each snapshot you make is nothing more than a set of delta changes against the original VM. This means that the original VM and its ensuing snapshots are all interrelated and should ultimately be treated as one entity.
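To make the "delta, not a copy" point concrete, here is a minimal Python sketch. It is a toy model of my own for illustration, not VMware's on-disk format: the class name `SnapshotDisk` and its methods are hypothetical, and a "disk" is just a dictionary of numbered blocks. It also shows the two ways out of a snapshot discussed below, reverting and merging.

```python
class SnapshotDisk:
    """Toy model of a base disk plus a chain of snapshot deltas.

    Illustration only: this is not how VMware lays out its files.
    """

    def __init__(self, base_blocks):
        self.base = dict(base_blocks)  # block number -> data on the original disk
        self.deltas = []               # one dict per snapshot, oldest first

    def take_snapshot(self):
        # Taking a snapshot copies nothing; it only opens a new, empty delta.
        self.deltas.append({})

    def write(self, block, data):
        # While a snapshot exists, writes land in the newest delta, not the base.
        target = self.deltas[-1] if self.deltas else self.base
        target[block] = data

    def read(self, block):
        # Reads check the newest delta first, then fall through to the base.
        for delta in reversed(self.deltas):
            if block in delta:
                return delta[block]
        return self.base.get(block)

    def revert(self):
        # Revert: discard everything written since the latest snapshot.
        if self.deltas:
            self.deltas[-1].clear()

    def merge_all(self):
        # Merge: fold every delta into the base, oldest first.
        # Returns the number of blocks copied, a rough stand-in for merge time.
        copied = 0
        for delta in self.deltas:
            self.base.update(delta)
            copied += len(delta)
        self.deltas = []
        return copied


disk = SnapshotDisk({0: "boot", 1: "mail-v1"})
disk.take_snapshot()
disk.write(1, "mail-v2")
print(disk.read(1))   # the change is served from the delta
print(disk.base[1])   # the base disk still holds the old data
print(disk.read(0))   # unchanged blocks still come from the base
```

Notice that the delta on its own holds only block 1. Lose the base and the snapshot is useless, which is why the two have to be treated as one entity. The count returned by `merge_all()` also grows with every block written while the snapshot is open.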

The official definition of a snapshot from VMware: "A snapshot captures the entire state of the virtual machine at the time you take the snapshot. This includes the memory and disk states as well as the virtual machine settings. When you revert to a snapshot, you return all these items to the state they were in at the time you took that snapshot." (Taken from http://www.vmware.com/pdf/vi_performance_tuning.pdf)

In VI3, you will eventually need to either merge all the changes into the original VM or revert to a previous snapshot, which returns the VM to that earlier state. Here are some facts about snapshots:

  1. Snapshots should be used short term, not long term. Why? Because in VI3, the VM goes offline while the snapshot data is merged. The more data there is to merge (which usually means the longer you have run on the snapshot), the longer the merge takes, and hence the longer the VM is offline. Snapshots are meant for testing application installations or patches: if something causes issues, you can easily revert to the original. They are not meant to be run long term. I can't tell you how many times I've seen people running their Exchange machine on a snapshot that is 6 months old. Merging that will take quite some time.

  2. Reverting to a snapshot, or to the original VM, is a true back-out mechanism. When you revert, you are not simply uninstalling software or deleting data added since the snapshot was taken; rather, it's as if those changes never happened. Think of it like going back in time. This can be a good thing in Windows, for example, because we all know what an uninstall leaves behind: registry remnants and the like. You don't get those when you revert to a previous snapshot. Once you revert, the changes are gone forever, so be sure a reversion is what you want.

  3. Snapshots should not be used as a backup/restore mechanism. If the VM gets corrupted, the entire VM, snapshots included, is affected in most cases. Plus, the snapshot files are stored in the same directory as the original VM, so if the physical disk goes bad, all of the files are affected. In the event of VM corruption or a hard drive failure, you will still need to restore the VM from separate media.

  4. Snapshots can cause issues with VMotion. It is suggested that you merge all snapshots (or get rid of them) before you move a VM.
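Since merge time grows with the amount of delta data, and the delta files live next to the original VM, one rough way to spot VMs that have been left on snapshots is to add up the snapshot files in each VM directory. The sketch below is an illustration under stated assumptions, not a VMware tool: it assumes the snapshot delta disks match the pattern `*-delta.vmdk`, that the datastore is visible as an ordinary file system to a machine with Python 3, and the function name `snapshot_delta_report` is my own.

```python
from pathlib import Path


def snapshot_delta_report(datastore_root, pattern="*-delta.vmdk"):
    """Return [(vm_directory_name, total_delta_bytes)], largest first.

    Assumption: snapshot delta disks match `pattern` and sit in the VM's own
    directory. Adjust the pattern if your files are named differently.
    """
    totals = {}
    for path in Path(datastore_root).rglob(pattern):
        if path.is_file():
            vm_dir = path.parent.name
            # Sum all delta files per VM directory; more bytes means a longer merge.
            totals[vm_dir] = totals.get(vm_dir, 0) + path.stat().st_size
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


def format_report(rows):
    """Render the report rows as human-readable lines, sizes in MiB."""
    return [f"{vm_dir}: {size / 2**20:.1f} MiB of snapshot deltas"
            for vm_dir, size in rows]
```

A call such as `print("\n".join(format_report(snapshot_delta_report("/vmfs/volumes/datastore1"))))` (the datastore path here is only an example) lists the VMs with the most unmerged snapshot data first. Those are the ones to merge, or back up properly, before they grow any further.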

If you are using vSphere, based on the tests I have run, merging does not have the same "offline" consequences, because vSphere is able to leverage the Volume Shadow Copy Service (VSS) when manipulating snapshots. However, the recommendation that you not use snapshots as a backup/recovery solution still stands.

Hope this helps.

Copyright © 2009 IDG Communications, Inc.