LVM based backup of virtual machines

LVM is one of the greatest things in Linux: a powerful, flexible and smooth approach to block devices.
You can use it almost everywhere.

For example, I use logical volumes as storage for virtual machines. The most important thing is that I can use snapshots to back up an entire VM. A snapshot is a copy of a logical volume that is consistent at a single point in time (the moment it was taken).
So you can use it to back up a live system (VM). I even use it for backing up VMs with running databases. The one thing you need to know is that when a snapshot overflows, it is discarded. So you have to plan how big the snapshot should be. You need to know how fast data grows (or changes) on the virtual machine, and then size the snapshot so that it won't overflow during the backup.
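
The sizing rule above is plain arithmetic, so here is a rough sketch of it. The function name, its units and the idea of a safety factor are my own assumptions for illustration; measure the change rate on your own VMs before trusting any number.

```shell
#!/bin/bash
# Hypothetical helper (not part of LVM): estimate a snapshot size in GB.
#   $1 = expected change rate on the origin LV, in MB per minute (assumed value)
#   $2 = expected backup duration, in minutes
#   $3 = safety factor as a whole number (2 = double the estimate)
snapshot_size_gb() {
    local rate_mb_min=$1 minutes=$2 safety=$3
    local total_mb=$(( rate_mb_min * minutes * safety ))
    # Round up to whole gigabytes (1 GB = 1024 MB here) so the estimate never undershoots.
    echo $(( (total_mb + 1023) / 1024 ))
}
```

For example, `snapshot_size_gb 20 30 2` assumes 20 MB of changes per minute, a 30 minute backup and a doubled margin, and the result could be passed to the `-L` option of lvcreate with a `G` suffix.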

Creating a snapshot is easy:

lvcreate -s -n name_of_snapshot -L size_of_snapshot /dev/vg_name/lv_name

This creates a snapshot of the given LVM device with the name and size you provide. It's good to name snapshots similarly to their source LVs, just to avoid confusion.

Once you have a snapshot, you can watch how fast it fills up by running the lvs command.
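
If you would rather have a script do the watching, something like the sketch below could work. The helper name and the threshold are hypothetical; the fill level comes from the `data_percent` report field of lvs, which I assume your LVM version has (older releases called it `snap_percent`).

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if a fill percentage is at or above a limit.
#   $1 = fill percentage as printed by lvs, e.g. "  42.17"
#   $2 = whole-number warning threshold, e.g. 80
snapshot_nearly_full() {
    local percent=${1%%[.,]*}          # drop the fractional part: "42.17" -> "42"
    percent=${percent//[[:space:]]/}   # lvs pads its output with spaces
    [ "${percent:-0}" -ge "$2" ]
}

# Example use against a real snapshot (needs root and LVM; names are placeholders):
#   used=$(lvs --noheadings -o data_percent /dev/vg_name/name_of_snapshot)
#   snapshot_nearly_full "$used" 80 && echo "snapshot is filling up"
```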

When you want to back it up, just run:

dd if=/dev/vg_name/snapshot_name of=/place/when/you/keep/backups/vm_backup.dd

This approach is fast but takes a lot of space, because the backup is as large as the source LVM device. If you want to save some space, just compress it on the fly:

dd if=/dev/vg_name/snapshot_name | gzip -c | dd of=/place/when/you/keep/backups/vm_backup.gz

How well these backups compress depends on the type of data kept on the LV you are backing up. Compressing a LUKS volume is probably just a waste of time, but my VMs compress by anywhere from 10% to 80%, which is a very nice result.

And what if your backup space is on a different machine from the one that runs the VMs? Just use ssh and store the destination file on the remote machine:

dd if=/dev/vg_name/snapshot_name | gzip -c | ssh user@remote.machine "dd of=/place/when/you/keep/backups/vm_backup.gz"
or
dd if=/dev/vg_name/snapshot_name | ssh user@remote.machine "gzip -c | dd of=/place/when/you/keep/backups/vm_backup.gz"

The first command compresses data on the source machine, the second on the destination. Sometimes it's better to push more data over the network and compress it on the destination machine, for example when the source machine is CPU-critical and you don't want to overload its CPU. Sometimes the destination machine simply has a much more powerful CPU (as in my case).

How do you restore these backups?

The same way they were created:

dd if=/place/when/you/keep/backups/vm_backup.dd of=/dev/vg_name/destination_lv

If compressed:

dd if=/place/when/you/keep/backups/vm_backup.gz | gzip -c -d | dd of=/dev/vg_name/destination_lv
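
Before overwriting a logical volume with a compressed backup, it may be worth checking that the archive is intact. This is only a sketch of one possible check, using the standard `gzip -t` integrity test; the function name is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if the file is a complete, readable gzip archive.
#   $1 = path to the compressed backup
backup_is_valid() {
    # -t tests the archive without writing anything; errors are silenced on purpose
    gzip -t "$1" 2>/dev/null
}

# Example use (path as in the article):
#   backup_is_valid /place/when/you/keep/backups/vm_backup.gz || echo "backup is damaged"
```

It only proves that the gzip stream is complete, not that the data inside is what you want, but it does catch backups that were cut short by a full disk.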

Of course you shouldn't restore a logical volume while a VM is running on top of it. It would end badly ;). But you can snapshot and back up running VMs; just allocate the right amount of space for the snapshot. This method works great: I just restored a Xen server with 5 VMs in less than 3 hours, including installing Dom0 from scratch. So it took about 30 minutes per VM. The biggest disadvantage of this solution is the amount of disk space it takes. It takes a lot compared to file-level backups, especially incremental ones, but it's much, much faster, so it's very useful in mission-critical environments. It works with both Xen- and KVM-based VMs.
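
To avoid restoring onto a volume that is still in use, a script could look at the LV attributes first. In the `lv_attr` field reported by lvs, the sixth character is `o` when the device is open. The helper below is a hypothetical sketch built on that; an LV that is not open can still belong to a VM that is merely shut down, so treat it as a safety net rather than a guarantee.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit 0) if an lv_attr string says the device is open.
#   $1 = attribute string as printed by lvs, e.g. "  -wi-ao----"
lv_is_open() {
    local attr=${1//[[:space:]]/}   # strip the padding lvs adds
    [ "${attr:5:1}" = "o" ]         # 6th character: (o)pen
}

# Example use (needs root and LVM; names are placeholders):
#   attr=$(lvs --noheadings -o lv_attr /dev/vg_name/destination_lv)
#   lv_is_open "$attr" && { echo "LV is in use, refusing to restore"; exit 1; }
```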

I've also written a small bash cron job for backing up VMs at night ;).

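#!/bin/bash
# Config format: one VM per line, "<LV name> <snapshot size in GB>".
config="./vm_backup.cfg"

# Read the config on fd 3 so the commands in the loop cannot swallow it via stdin.
while read -r lv size <&3
do
    [ -n "$lv" ] || continue    # skip blank lines
    snap="/dev/vgvirt1/${lv}_snap"

    # Back up only if the snapshot was created.
    if /usr/sbin/lvcreate -s --size="${size}G" -n "${lv}_snap" "/dev/vgvirt1/${lv}"; then
        /bin/dd if="$snap" bs=16MB | /bin/gzip -c | /bin/dd of="/backup/vms/${lv}.$(date +%Y-%m-%d-%H.%M.%S).gz"
    fi
    # Always remove the snapshot, even a stale one left by an earlier run.
    /usr/sbin/lvremove -f "$snap"
done 3< "$config"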

The config looks like this:

example_vm1 0.5
example_vm2 1
example_vm3 4

The first column is the name of the LV, the second is the snapshot size in gigabytes.

Enjoy!

Comments

I like that this is simple, to the point, and works, and doesn't try to do too much. You sir win the internet!

Thank you. That was simple and efficient. In this script you just back up the VM's disk. Now, how do I back up the VM's configuration, like CPU, RAM, etc.? And the other question is, how do I restore these backups?

The location of VM configuration files depends on the platform used. For example, Red Hat (CentOS) with KVM keeps them in /etc/libvirt/qemu as *.xml files, so it's enough to back up those files (as a gzipped tar). VM restoration is covered in the article; restoring VM configs is simple. Just extract them from the gzipped tar created above into the proper location. In some cases (KVM) you have to load the definition files using virsh (or another VM manager).
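
As a rough sketch of the above for the KVM case: the function name and the destination path are made up, and the helper archives the whole directory rather than only the *.xml files, which I assume is acceptable since everything in there belongs to libvirt.

```shell
#!/bin/bash
# Hypothetical helper: pack a VM config directory into a gzipped tar.
#   $1 = directory holding the definitions, e.g. /etc/libvirt/qemu
#   $2 = archive to create
backup_vm_configs() {
    local src_dir=$1 dest=$2
    # -C switches into the directory first, so the archive stores relative paths
    tar -czf "$dest" -C "$src_dir" .
}

# Example use (paths are placeholders):
#   backup_vm_configs /etc/libvirt/qemu /backup/vms/libvirt-configs.tar.gz
# Restore, then load one definition into libvirt:
#   tar -xzf /backup/vms/libvirt-configs.tar.gz -C /etc/libvirt/qemu
#   virsh define /etc/libvirt/qemu/example_vm1.xml
```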

Great! Thank you for this solution! Now I use it to back up KVM virtual machines.

Don't forget that VMs backed up this way that have MySQL databases in them can be inconsistent (i.e. the database will be "crashed" and might be non-recoverable). For a fully clean backup, the VM needs to be stopped.

Not really. Of course, if the database is really busy and gets a lot of writes, it could end up in a corrupt state (it has never happened to me). But everyone should also back up databases more often than the VM itself (using database tools, e.g. mysqldump), so you always have the newest backup of the database elsewhere. In most cases you restore the VM from the LVM backup and then the databases from dumps anyway. So there is no concern unless you rely only on the LVM VM backup.

Hi,

Awesome write-up, best I have found so far.
One question: what do you have inside the vm_backup.cfg that your script opens?

and config looks like this:

example_vm1 0.5
example_vm2 1
example_vm3 4

There is already a new version of the script which deletes old snapshots :), so there is another column with days_to_keep :).
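
That newer version is not shown here, so the following is only a guess at how a days_to_keep column could be used. I assume it means deleting backup files older than that many days; the function name is hypothetical and the file naming follows the script in the article.

```shell
#!/bin/bash
# Hypothetical helper: delete backups of one LV that are older than N days.
#   $1 = backup directory, e.g. /backup/vms
#   $2 = LV name from the first config column
#   $3 = days_to_keep (assumed third config column)
prune_old_backups() {
    local dir=$1 lv=$2 days=$3
    # The dot after the LV name keeps example_vm1 from matching example_vm10.
    find "$dir" -maxdepth 1 -type f -name "${lv}.*.gz" -mtime +"$days" -delete
}

# Example use (paths are placeholders):
#   prune_old_backups /backup/vms example_vm1 7
```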

Article | by Dr. Radut