libvirt sanlock integration in openSUSE Factory

A few weeks back I found some time to package sanlock for openSUSE Factory, which subsequently allowed enabling the libvirt sanlock driver.  And how might this be useful?  When running qemu/kvm virtual machines on a pool of hosts that are not cluster-aware, it may be possible to start a virtual machine on more than one host, potentially corrupting the guest filesystem.  To prevent such an unpleasant scenario, libvirt+sanlock can be used to protect the virtual machine’s disk images, ensuring we never have two qemu/kvm processes writing to an image concurrently.  libvirt+sanlock provides protection against starting the same virtual machine on different hosts, or adding the same disk to different virtual machines.

In this blog post I’ll describe how to install and configure sanlock and the libvirt sanlock plugin.  I’ll briefly cover lockspace and resource creation, and show some examples of specifying disk leases in libvirt, but users should become familiar with the wdmd (watchdog multiplexing daemon) and sanlock man pages, as well as the lease element specification in libvirt domainXML.  I’ve used SLES11 SP2 hosts and guests for this example, but have also tested a similar configuration on openSUSE 12.1.

The sanlock and sanlock-enabled libvirt packages can be retrieved from a Factory repository or a repository from the OBS Virtualization project.  (As a side note, for those that didn’t know, Virtualization is the development project for virtualization-related packages in Factory.  Packages are built, tested, and staged in this project before submitting to Factory.)

After configuring the appropriate repository for the target host, update libvirt and install sanlock and libvirt-lock-sanlock.
# zypper up libvirt libvirt-client libvirt-python
# zypper in sanlock libsanlock1 libvirt-lock-sanlock

Enable watchdog daemon and sanlock daemons.
# insserv wdmd
# insserv sanlock

Specify the sanlock lock manager in /etc/libvirt/qemu.conf.
lock_manager = “sanlock”

The suggested libvirt sanlock configuration uses NFS for shared lock space storage.  Mount a share at the default mount point.
# mount -t nfs nfs-server:/export/path /var/lib/libvirt/sanlock

These installation steps need to be performed on each host participating in the sanlock-protected environment.

libvirt provides two modes for configuring sanlock.  The default mode requires a user or management application to manually define the sanlock lockspace and resource leases, and then describe those leases with a lease element in the virtual machine XML configuration.  libvirt also supports an auto disk lease mode, where libvirt will automatically create a lockspace and lease for each fully qualified disk path in the virtual machine XML configuration.  The latter mode removes the administrator burden of configuring lockspaces and leases, but only works if the administrator can ensure stable and unique disk paths across all participating hosts.  I’ll describe both modes here, starting with the manual configuration.

Manual Configuration:
First we need to reserve and initialize host_id leases.  Each host that wants to participate in the sanlock-enabled environment must first acquire a lease on its host_id number within the lockspace.  The lockspace requirements for 2000 leases (2000 possible host_id’s) is 1MB (8MB for 4k sectors).  On one host, create a 1M lockspace file in the default lease directory (/var/lib/libvirt/sanlock/).
# truncate -s 1M /var/lib/libvirt/sanlock/TEST_LS

And then initialize the lockspace for storing host_id leases.
# sanlock direct init -s TEST_LS:0:/var/lib/libvirt/sanlock/TEST_LS:0

On each participating host, start the watchdog and sanlock daemons and restart libvirtd.
# rcwdmd start; rcsanlock start; rclibvirtd restart

On each participating host, we’ll need to tell the sanlock daemon to acquire its host_id in the lockspace, which will subsequently allow resources to be acquired in the lockspace.
# sanlock client add_lockspace -s TEST_LS:1:var/lib/libvirt/sanlock/TEST_LS:0
# sanlock client add_lockspace -s TEST_LS:2:var/lib/libvirt/sanlock/TEST_LS:0
# sanlock client add_lockspace -s TEST_LS:<hostidN>:var/lib/libvirt/sanlock/TEST_LS:0

To see the state of host_id leases read during the last renewal
# sanlock client host_status -s TEST_LS
1 timestamp 50766
2 timestamp 327323

Now that we have the hosts configured, time to move on to configuring a virtual machine resource lease and defining it in the virtual machine XML configuration.  First we need to reserve and initialize a resource lease for the virtual machine disk image.
# truncate -s 1M /var/lib/libvirt/sanlock/sles11sp2-disk-resource-lock
# sanlock direct init -r TEST_LS:sles11sp2-disk-resource-lock:/var/lib/libvirt/sanlock/sles11sp2-disk-resource-lock:0

Then add the lease information to the virtual machine XML configuration
# virsh edit sles11sp2

<target path=’/var/lib/libvirt/sanlock/sles11sp2-disk-resource-lock’/>

Finally, start the virtual machine!
# virsh start sles11sp2
Domain sles11sp2 started

Trying to start same virtual machine on different host will fail since the resource lock is already leased to another host
other-host:~ # virsh start sles11sp2
error: Failed to start domain sles11sp2
error: internal error Failed to acquire lock: error -243

Automatic disk lease configuration:
As can be seen even with the trivial example above, manual disk lease configuration puts quite a burden on the user, particularly in an adhoc environment with only a few hosts and no central management service to coordinate all of the lockspace and resource configuration.  To ease this burden, Daniel Berrange adding support in libvirt for automatically creating sanlock disk leases.  Once the environment is configured for automatic disk leases, libvirt will handle the details of creating lockspace and resource leases.

On each participating host, edit /etc/libvirt/qemu-sanlock.conf, setting auto_disk_leases to 1 and assigning a unique host_id.
auto_disk_leases = 1
host_id = 1

Then restart libvirtd
# rclibvirtd restart

Now libvirtd+sanlock is configured to automatically acquire a resource lease for each virtual machine disk.  No lease configuration is required in the virtual machine XML configuration.  We can simply start the virtual machine and libvirt will handle all the details for us.

host1 # virsh start sles11sp2
Domain sles11sp2 started

libvirt creates a host lease lockspace named __LIBVIRT__DISKS__.  Disk resource leases are named using the MD5 checksum of the fully qualified disk path.  After staring the above virtual machine, the lease directory contained
host1 # ls -l /var/lib/libvirt/sanlock/
total 2064
-rw——-  1 root root 1048576 Mar 13 01:35 3ab0d33a35403d03e3ad10b485c7b593
-rw——-  1 root root 1048576 Mar 13 01:35 __LIBVIRT__DISKS__

Finally, try to start the virtual machine on another participating host
host2 # virsh start sles11sp2
error: Failed to start domain sles11sp2
error: internal error Failed to acquire lock: error -243

Feel free to try the sanlock and sanlock-enabled libvirt packages from openSUSE Factory or our OBS Virtualization project. One thing to keep in mind is that the sanlock daemon protects resources for some process, in this case qemu/kvm.  If the sanlock daemon is terminated, it can no longer protect those resources and kills the processes for which it holds leases.  In other words, restarting the sanlock daemon will terminate your virtual machines!  If the sanlock daemon is SIGKILL’ed, then the watchdog daemon intervenes by resetting the entire host.  With this in mind, it would be wise to consider an appropriate disk cache mode such as ‘none’ or ‘writethrough’ to improve the integrity of your disk images in the event of a mass virtual machine kill off.