[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] block: Fix hangs in synchronous APIs with iothr
From: |
Stefan Hajnoczi |
Subject: |
Re: [Qemu-devel] [PATCH] block: Fix hangs in synchronous APIs with iothreads |
Date: |
Tue, 8 Jan 2019 16:27:06 +0000 |
User-agent: |
Mutt/1.10.1 (2018-07-13) |
On Mon, Jan 07, 2019 at 01:02:48PM +0100, Kevin Wolf wrote:
> In the block layer, synchronous APIs are often implemented by creating a
> coroutine that calls the asynchronous coroutine-based implementation and
> then waiting for completion with BDRV_POLL_WHILE().
>
> For this to work with iothreads (more specifically, when the synchronous
> API is called in a thread that is not the home thread of the block
> device, so that the coroutine will run in a different thread), we must
> make sure to call aio_wait_kick() at the end of the operation. Many
> places are missing this, so that BDRV_POLL_WHILE() keeps hanging even if
> the condition has long become false.
>
> Note that bdrv_dec_in_flight() involves an aio_wait_kick() call. This
> corresponds to the BDRV_POLL_WHILE() in the drain functions, but it is
> generally not enough for most other operations because they haven't set
> the return value in the coroutine entry stub yet. To avoid race
> conditions there, we need to kick after setting the return value.
>
> The race window is small enough that the problem doesn't usually surface
> in the common path. However, it does surface and causes easily
> reproducible hangs if the operation can return early before even calling
> bdrv_inc/dec_in_flight, which many of them do (trivial error or no-op
> success paths).
>
> The bug in bdrv_truncate(), bdrv_check() and bdrv_invalidate_cache() is
> slightly different: These functions even neglected to schedule the
> coroutine in the home thread of the node. This avoids the hang, but is
> obviously wrong, too. Fix those to schedule the coroutine in the right
> AioContext in addition to adding aio_wait_kick() calls.
>
> Cc: address@hidden
> Signed-off-by: Kevin Wolf <address@hidden>
> ---
> block.c | 6 +-
> block/block-backend.c | 5 +
> block/io.c | 8 +-
> block/nbd-client.c | 1 +
> block/nvme.c | 1 +
> block/qcow2.c | 1 +
> block/qed.c | 1 +
> tests/test-block-iothread.c | 372 ++++++++++++++++++++++++++++++++++++
> tests/Makefile.include | 2 +
> 9 files changed, 394 insertions(+), 3 deletions(-)
> create mode 100644 tests/test-block-iothread.c
Reviewed-by: Stefan Hajnoczi <address@hidden>
signature.asc
Description: PGP signature