Am 25.03.2022 um 16:29 schrieb Ilya Dryomov <idryomov@gmail.com>:
On Fri, Mar 25, 2022 at 12:35 PM Peter Lieven <pl@kamp.de> wrote:
Am 24.03.22 um 15:53 schrieb Peter Lieven:
Am 24.03.22 um 12:51 schrieb Peter Lieven:
Hi,
following the thread on the ceph ml about rbd and exclusive locks I was
wondering if we should rewrite the rbd driver to
try to acquire an exclusive lock on open and not magically on the first write.
This way we could directly terminate qemu
if the rbd image is in use like it is done with ordinary image files on a
filesystem for some time now.
Digging a bit into the code and testing it seems that the exclusive-lock on rbd
image is not that exclusive. We can easily start several
qemu instances accessing the same image. Watching the locks on the image, the
lock owner of the one exclusive lock
toggles between the instances. This has potentitial for severe corruption.
I am thinking not of the case where a qemu instance gets killed or host dies. I
am thinking of network issues where the disconnected
old instance of a VM suddenly kicks back in.
However, if I manually acquire an exclusive lock on qemu_rbd_open_image all
other callers get EROFS.
So whats the difference between the exclusive lock that a write request obtains
and an exclusive lock created
by rbd_lock_acquire? From the rbd lock ls perspective they are identical.
Best
Peter
To whom it may concern this is my proof of concept code to add safe locking.
Tested with live migration incl. migration from an old version without the
locking support.
rbd_lock_acquire can take the lock that has been installed on the first write.
Considerations:
1) add switch to enable/disable locking if someone mounts the image multiple
times on purpose (default on?)
2) we might need to check if we already have the lock in bdrv_invalidate_cache
3) I am aware of bdrv_{check,set,abort}_perm. If this are the better hooks how
to use them with only a single lock?
4) rbd_lock_release is automatically invoked on image close.
Happy weekend
Peter
---8<---
diff --git a/block/rbd.c b/block/rbd.c
index 240b267c27..13a597e526 100644
--- a/block/rbd.c
+++ b/block/rbd.c
@@ -1047,6 +1047,24 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict
*options, int flags,
goto failed_open;
}
+ if (!(bdrv_get_flags(bs) & BDRV_O_INACTIVE)) {
+ error_report("qemu_rbd_open: rbd_lock_acquire");
+ r = rbd_lock_acquire(s->image, RBD_LOCK_MODE_EXCLUSIVE);
This introduces a dependency on exclusive-lock image feature. For an image
without it, this call would always fail with EINVAL.
Going beyond that, I'm not sure this patch is needed. --exclusive mapping
option of krbd and rbd-nbd (which is what is essentially being replicated here)
is off by default and hasn't seen much interest among users, at least as far
as I'm aware.
Thanks,
Ilya