VM Snapshot vs Backup
Snapshots are a quick and effective way to roll a virtual machine back to a known point in time, which makes them especially useful in development environments. However, many people mistakenly treat snapshots as a type of “backup” because they allow a return to a known good state. That assumption is dangerous. To explain why, we will look at the technical aspects of snapshots and answer the following questions: What is a VM snapshot and when do we use it? How are snapshots created in VMware and Hyper-V? What are the differences between snapshots and backups in general? If snapshots are useful in certain cases, why do we still need to give proper attention to backups?
A VM snapshot preserves the state and data of a virtual machine at a specific point in time.
The state includes the virtual machine’s power state (for example, powered-on, powered-off, suspended). The data includes all of the files that make up the virtual machine. This includes disks, memory, and other devices, such as virtual network interface cards.
Snapshots are used in several common scenarios. Many administrators take a VM snapshot as a quick failsafe rollback point before performing upgrades, changing installed software, uninstalling components, and so on. Snapshots are also very useful for development purposes. A VM or set of VMs can be snapshotted for “rinse and repeat” testing to develop and validate code changes.
What happens when a VM snapshot is created?
A snapshot operation in VMware creates the following files:
- -flat.vmdk – This file contains the raw data in the base disk.
- -delta.vmdk – The delta disk appears in the datastore as a file named in the format -00000x.vmdk. It contains the difference between the current state of the virtual disk and the state that existed at the time that the previous snapshot was taken.
- .vmsd – This file is the database file for the snapshot itself which contains the snapshot information and is the primary source of information for the snapshot manager. The entries contained in this file are the snapshots and relationships between snapshots and child disks for each snapshot.
- .vmsn – This file holds the active state of the virtual machine, including its memory, at the point of the snapshot. This allows you to revert to a running virtual machine. If you create a snapshot without including the memory, reverting to it returns a virtual machine that is powered off.
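To make the file roles above easier to recognize in a datastore listing, here is a minimal Python sketch that labels files by their name suffix. The file names such as `web01-flat.vmdk` are hypothetical examples, and the patterns assume the naming described above (a six-digit sequence number for delta disks, with or without a `-delta` suffix). Other disk formats may use different names, so treat this as an illustration rather than a complete classifier.

```python
import re

# Filename patterns and the role each one plays in a snapshot.
# Order matters: the more specific .vmdk patterns are checked first.
PATTERNS = [
    (re.compile(r"-\d{6}(-delta)?\.vmdk$"), "delta disk"),
    (re.compile(r"-flat\.vmdk$"), "base disk data"),
    (re.compile(r"\.vmdk$"), "disk descriptor"),
    (re.compile(r"\.vmsd$"), "snapshot database"),
    (re.compile(r"\.vmsn$"), "snapshot state"),
]


def classify(filename: str) -> str:
    """Return the snapshot-related role of a file, or "other"."""
    for pattern, role in PATTERNS:
        if pattern.search(filename):
            return role
    return "other"


# Hypothetical directory listing for a VM named "web01".
for name in ["web01-flat.vmdk", "web01-000001-delta.vmdk",
             "web01.vmsd", "web01-Snapshot1.vmsn", "web01.vmx"]:
    print(f"{name}: {classify(name)}")
```

Note that every delta disk in such a listing sits next to the base disk it depends on, which is the point the following sections build on.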
In the Hyper-V world, VM snapshots, called checkpoints, are implemented in a similar way and the concepts are the same. In Hyper-V Manager, right-click a virtual machine and choose Checkpoint to create a checkpoint (snapshot).
A Snapshots folder is created containing files in the binary formats introduced with Windows Server 2016: VMCX and VMRS.
- VMCX – This file is the binary configuration file that replaces the XML file found in 2012 R2 and earlier.
- VMRS – This file contains various information about the state of the running virtual machine.
In addition, a differencing disk is created in the .avhdx format. It records the changes made after the checkpoint is created.
What are the differences between snapshots and backups in general?
First of all, it is worth mentioning that neither VMware nor Microsoft (with Hyper-V) supports the idea of snapshots/checkpoints being backups in themselves.
When we compare VM snapshots with VM backups, several key points show that snapshots are not backups.
Snapshots, as shown above, are a mechanism to record delta changes from a certain point in time. The files that make snapshots possible live on the same storage infrastructure as the parent disks. A true backup, by contrast, should be completely autonomous from the virtual machine it protects. The delta disks are not autonomous from either the physical infrastructure or the virtual infrastructure of the virtual machine.
Snapshots also depend on the parent VM disks, or on the whole chain of snapshots if more than one exists. If the base disks were deleted, the snapshots alone would not be enough to restore the virtual machine. VM backups, including those taken using changed block tracking, can be restored without any dependency on the parent VM files.
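The dependency described above can be illustrated with a small Python model. This is a simplified sketch, not how any hypervisor actually stores data: a disk is modeled as a dictionary of block numbers, a delta disk holds only the blocks written after the snapshot, and `read_disk` is a hypothetical helper that resolves each block by walking the chain from the newest delta back to the base.

```python
from typing import Dict, List, Optional

Blocks = Dict[int, bytes]  # block number -> block contents


def read_disk(chain: List[Optional[Blocks]], num_blocks: int) -> List[bytes]:
    """Resolve every block by walking from the newest delta back to the base.

    chain[0] is the base disk; later entries are delta disks that hold only
    the blocks written after each snapshot. None marks a disk that was lost.
    """
    result = []
    for block in range(num_blocks):
        for layer in reversed(chain):
            if layer is None:
                raise OSError(f"block {block}: a disk in the chain is missing")
            if block in layer:
                result.append(layer[block])
                break
        else:
            raise OSError(f"block {block}: not found in any disk")
    return result


base = {0: b"boot", 1: b"data-v1", 2: b"logs"}
delta = {1: b"data-v2"}  # only block 1 changed after the snapshot

print(read_disk([base, delta], 3))  # [b'boot', b'data-v2', b'logs']

# With the base disk gone, the delta alone cannot rebuild the VM.
try:
    read_disk([None, delta], 3)
except OSError as err:
    print("restore failed:", err)
```

A backup in this model would be the fully resolved list returned by `read_disk`, stored elsewhere: it contains every block and needs neither the base nor the delta to be restored.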
Snapshots are not meant to exist long term. In both VMware and Hyper-V, snapshots/checkpoints that are left in place keep growing and can lead to performance issues.
VMware Best Practices regarding snapshots
The following best practices come directly from VMware and make clear that snapshots are not intended to be backups:
- Do not use snapshots as backups.
- VMware supports a maximum of 32 snapshots in a chain. However, for better performance, use only 2 to 3 snapshots.
- Do not use a single snapshot for more than 24-72 hours.
- The snapshot file continues to grow in size when it is retained for a longer period. This can cause the snapshot storage location to run out of space and impact the system performance.
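As a rough illustration of how these guidelines could be checked, the Python sketch below audits a list of snapshots against the chain-length and age recommendations above. The `Snapshot` class and `audit_chain` function are hypothetical and work on plain data; in practice the names and creation times would come from your hypervisor's management tools. The thresholds use the upper ends of the ranges quoted above (3 snapshots, 72 hours).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List

MAX_CHAIN_LENGTH = 3           # "use only 2 to 3 snapshots"
MAX_AGE = timedelta(hours=72)  # upper end of the 24-72 hour guidance


@dataclass
class Snapshot:
    name: str
    created: datetime


def audit_chain(chain: List[Snapshot], now: datetime) -> List[str]:
    """Return a warning for each guideline the snapshot chain violates."""
    warnings = []
    if len(chain) > MAX_CHAIN_LENGTH:
        warnings.append(
            f"chain has {len(chain)} snapshots "
            f"(recommended: at most {MAX_CHAIN_LENGTH})"
        )
    for snap in chain:
        age = now - snap.created
        if age > MAX_AGE:
            hours = int(age.total_seconds() // 3600)
            warnings.append(f"{snap.name} is {hours} hours old (limit: 72)")
    return warnings


# Hypothetical chain: one snapshot five days old, one three hours old.
now = datetime(2024, 1, 10, 12, 0)
chain = [
    Snapshot("pre-upgrade", datetime(2024, 1, 5, 12, 0)),
    Snapshot("test-run", datetime(2024, 1, 10, 9, 0)),
]
for warning in audit_chain(chain, now):
    print(warning)  # pre-upgrade is 120 hours old (limit: 72)
```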
Why do we still need to give proper attention to backups?
Backups are an autonomous copy of your data or of the virtual machine as a whole. Unlike snapshots, they do not depend on the physical or virtual machine files already in place in order to restore data. They allow a VM or its data to be recreated without any reliance on the source virtual machine or its files.
Modern backup technologies go further than autonomous backups of the virtual infrastructure: they also let us replicate virtual machines and store backup copies offsite. With plain virtual machine snapshots, none of this is possible.
That being said, most modern backup technologies use VM snapshots to copy data, but they do not rely on the snapshots remaining in place. The snapshot lets the backup software read consistent data from the virtual machine, and the temporary snapshot is deleted once the backup cycle is completed.
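The snapshot-then-copy-then-delete cycle described above can be sketched in Python as follows. `FakeVM` is a hypothetical in-memory stand-in, not a real hypervisor SDK, and `back_up` is an illustrative name. The sketch only shows the ordering of the steps and that the temporary snapshot is removed even if the copy fails.

```python
from contextlib import contextmanager


class FakeVM:
    """Hypothetical stand-in for a hypervisor API; not a real SDK."""

    def __init__(self, blocks):
        self.blocks = dict(blocks)  # live disk contents
        self.snapshots = {}         # snapshot name -> frozen view of the disk

    def create_snapshot(self, name):
        self.snapshots[name] = dict(self.blocks)

    def delete_snapshot(self, name):
        del self.snapshots[name]


@contextmanager
def temporary_snapshot(vm, name="backup-temp"):
    """Create a snapshot for the duration of a backup, then remove it."""
    vm.create_snapshot(name)
    try:
        yield vm.snapshots[name]
    finally:
        vm.delete_snapshot(name)  # always clean up, even if the copy fails


def back_up(vm):
    """Copy the frozen snapshot view into an independent backup."""
    with temporary_snapshot(vm) as frozen:
        return dict(frozen)  # stored outside the VM in a real system


vm = FakeVM({0: b"boot", 1: b"data"})
backup = back_up(vm)
vm.blocks[1] = b"changed-later"  # the VM keeps running and changing

print(backup[1])     # b'data'
print(vm.snapshots)  # {}
```

The backup keeps the data as it was at snapshot time, and no snapshot is left behind on the VM to grow or affect performance.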
Backups are an essential part of business continuity, allowing Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO) to be met. Snapshots cannot guarantee either of these critical objectives. Any environment that relies on snapshots as a means of backup is asking for disaster and data loss.
Thoughts
Many have learned the hard way that snapshots in a virtual environment are not a reliable means to recover lost data or virtual infrastructure. While they have special use cases and can be used safely for their intended purpose, organizations should always have backups in place as a standalone mechanism for business continuity.