On Tue, 12/22 17:46, Kevin Wolf wrote:
Enough innocent images have died because users called 'qemu-img snapshot' while
the VM was still running. Educating the users doesn't seem to be a working
strategy, so this series adds locking to qcow2 that refuses to access the image
read-write from two processes.
Eric, this will require a libvirt update to deal with qemu crashes which leave
locked images behind. The simplest thinkable way would be to unconditionally
override the lock in libvirt whenever the option is present. In that case,
libvirt VMs would be protected against concurrent non-libvirt accesses, but not
the other way round. If you want more than that, libvirt would have to check
somehow if it was its own VM that used the image and left the lock behind. I
imagine that can't be too hard either.
The motivation is great, but I'm not sure I like the side-effect that an
unclean shutdown will require a "forced" open, because it makes using qcow2 in
development cumbersome, and like you said, management/user also needs to handle
this explicitly. This is a bit of a personal preference, but it's strong enough
that I want to speak up.
As an alternative, can we introduce .bdrv_flock() in protocol drivers, with
similar semantics to flock(2) or lockf(3)? That way all formats can benefit,
and a program crash will automatically drop the lock.
Fam