[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC] migration/block-dirty-bitmap: make loading bitmap for device w
From: |
Fiona Ebner |
Subject: |
Re: [RFC] migration/block-dirty-bitmap: make loading bitmap for device with iothread future-proof |
Date: |
Thu, 5 Oct 2023 09:19:55 +0200 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.15.1 |
Am 04.10.23 um 18:13 schrieb Vladimir Sementsov-Ogievskiy:
>> @@ -833,7 +836,10 @@ static int dirty_bitmap_load_start(QEMUFile *f,
>> DBMLoadState *s)
>> bdrv_disable_dirty_bitmap(s->bitmap);
>> if (flags & DIRTY_BITMAP_MIG_START_FLAG_ENABLED) {
>> + AioContext *ctx = bdrv_get_aio_context(s->bs);
>> + aio_context_acquire(ctx);
>> bdrv_dirty_bitmap_create_successor(s->bitmap, &local_err);
>> + aio_context_release(ctx);
>
> Would not this deadlock in current code? When we have the only one aio
> context and therefore we are already in it?
>
Yes, I noticed that myself a bit later ;)
Am 01.08.23 um 09:57 schrieb Fiona Ebner:
> And the patch itself also got an issue. AFAICT, when
> qcow2_co_load_vmstate() is called, we already have acquired the context
> for the drive we load the snapshot from, and since
> polling/AIO_WAIT_WHILE requires that the caller has acquired the context
> exactly once, we'd need to distinguish if the dirty bitmap is for that
> drive (can't acquire the context a second time) or a different drive
> (need to acquire the context for the first time).
Quoted from a reply in this thread
https://lists.nongnu.org/archive/html/qemu-devel/2023-08/msg00007.html
Am 04.10.23 um 18:13 schrieb Vladimir Sementsov-Ogievskiy:
> If as Juan said, we rework incoming migration coroutine to be a separate
> thread, this patch becomes more correct, I think..
Yes, it would become an issue when the function is called from a
non-coroutine context.
> If keep coroutine, I think, we should check are we already in that aio
> context, and if so we should not acquire it.
In coroutine context, we don't need to acquire it, but it shouldn't hurt
either and this approach should work for non-coroutine context too. The
question is if such conditional lock-taking is acceptable (do we already
have something similar somewhere?) or if it can be avoided somehow like
it was preferred in another one of my patches:
Am 05.05.23 um 16:59 schrieb Peter Xu:
> On Fri, May 05, 2023 at 03:46:52PM +0200, Fiona Ebner wrote:
>> To fix it, ensure that the BQL is held during setup. To avoid changing
>> the behavior for migration too, introduce conditionals for the setup
>> callbacks that need the BQL and only take the lock if it's not already
>> held.
>
> The major complexity of this patch is the "conditionally taking" part.
Quoted from
https://lists.nongnu.org/archive/html/qemu-devel/2023-05/msg01514.html