[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 o
From: |
Paolo Bonzini |
Subject: |
Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite |
Date: |
Tue, 8 Mar 2016 18:51:00 +0100 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0 |
On 16/02/2016 18:56, Paolo Bonzini wrote:
> So the fine-grained locking series has grown from 2 parts to 3...
>
> This first part stops where we remove RFifoLock.
>
> In the previous submission, the bulk of the series took care of
> associating an AioContext to a thread, so that aio_poll could run event
> handlers only on that thread. This was done to fix a race in bdrv_drain.
> There were two disadvantages:
>
> 1) reporting progress from aio_poll to the main thread was Really Hard.
> Despite being relatively sure of the correctness---also thanks to formal
> models---it's not the kind of code that I'd lightly donate to posterity.
>
> 2) the strict dependency between bdrv_drain and a single AioContext
> would have been bad for multiqueue.
>
> Instead, this series does the same trick (do not run dataplane event handlers
> from the main thread) but reports progress directly at the BlockDriverState
> level.
>
> To do so, the mechanism to track in-flight requests is changed to a
> simple counter. This is more flexible than BdrvTrackedRequest, and lets
> the block/io.c code track precisely the initiation and completion of the
> requests. In particular, the counter remains nonzero when a bottom half
> is required to "really" complete the request. bdrv_drain doesn't rely
> anymore on a non-blocking aio_poll to execute those bottom halves.
>
> It is also more efficient; while BdrvTrackedRequest has to remain
> for request serialisation, with this series we can in principle make
> BdrvTrackedRequest optional and enable it only when something (automatic
> RMW or copy-on-read) requires request serialisation.
>
> In general this flows nicely, the only snag being patch 5. The problem
> here is that mirror is calling bdrv_drain from an AIO callback, which
> used to be okay (because bdrv_drain more or less tried to guess when
> all AIO callbacks were done) but now causes a deadlock (because bdrv_drain
> really checks for AIO callbacks to be complete). To add to the complication,
> the bdrv_drain happens far away from the AIO callback, in the coroutine that
> the AIO callback enters. The situation here is admittedly underdefined,
> and Stefan pointed out that the same solution is found in many other places
> in the QEMU block layer. Therefore I think it's acceptable. I'm pointing
> it out explicitly though, because (together with patch 8) it is an example
> of the issues caused by the new, stricter definition of bdrv_drain.
>
> Patches 1-7 organize bdrv_drain into separate phases, with a well-defined
> order between the BDS and it children. The phases are: 1) disable
> throttling; 2) disable io_plug; 3) call bdrv_drain callbacks; 4) wait
> for I/O to finish; 5) re-enable io_plug and throttling. Previously,
> throttling of throttling and io_plug were handled by bdrv_flush_io_queue,
> which was repeatedly called as part of the I/O wait loop. This part of
> the series removes bdrv_flush_io_queue.
>
> Patch 8-11 replace aio_poll with bdrv_drain so that the work done
> so far applies to all former callers of aio_poll.
>
> Patches 12-13 introduce the logic that lets the main thread wait remotely
> for an iothread to drain the BDS. Patches 14-16 then achieve the purpose
> of this series, removing the long-running AioContext critical section
> from iothread_run and removing RFifoLock.
>
> The series passes check-block.sh with a few fixes applied for pre-existing
> bugs:
>
> vl: fix migration from prelaunch state
> migration: fix incorrect memory_global_dirty_log_start outside BQL
> qed: fix bdrv_qed_drain
>
> Paolo
>
> Paolo Bonzini (16):
> block: make bdrv_start_throttled_reqs return void
> block: move restarting of throttled reqs to block/throttle-groups.c
> block: introduce bdrv_no_throttling_begin/end
> block: plug whole tree at once, introduce bdrv_io_unplugged_begin/end
> mirror: use bottom half to re-enter coroutine
> block: add BDS field to count in-flight requests
> block: change drain to look only at one child at a time
> blockjob: introduce .drain callback for jobs
> block: wait for all pending I/O when doing synchronous requests
> nfs: replace aio_poll with bdrv_drain
> sheepdog: disable dataplane
> aio: introduce aio_context_in_iothread
> block: only call aio_poll from iothread
> iothread: release AioContext around aio_poll
> qemu-thread: introduce QemuRecMutex
> aio: convert from RFifoLock to QemuRecMutex
>
> async.c | 28 +---
> block.c | 4 +-
> block/backup.c | 7 +
> block/io.c | 281
> +++++++++++++++++++++++++---------------
> block/linux-aio.c | 13 +-
> block/mirror.c | 37 +++++-
> block/nfs.c | 50 ++++---
> block/qed-table.c | 16 +--
> block/qed.c | 4 +-
> block/raw-aio.h | 2 +-
> block/raw-posix.c | 16 +--
> block/sheepdog.c | 19 +++
> block/throttle-groups.c | 19 +++
> blockjob.c | 16 ++-
> docs/multiple-iothreads.txt | 40 +++---
> include/block/aio.h | 13 +-
> include/block/block.h | 3 +-
> include/block/block_int.h | 22 +++-
> include/block/blockjob.h | 7 +
> include/block/throttle-groups.h | 1 +
> include/qemu/rfifolock.h | 54 --------
> include/qemu/thread-posix.h | 6 +
> include/qemu/thread-win32.h | 10 ++
> include/qemu/thread.h | 3 +
> iothread.c | 20 +--
> stubs/Makefile.objs | 1 +
> stubs/iothread.c | 8 ++
> tests/.gitignore | 1 -
> tests/Makefile | 2 -
> tests/qemu-iotests/060 | 8 +-
> tests/qemu-iotests/060.out | 4 +-
> tests/test-aio.c | 22 ++--
> tests/test-rfifolock.c | 91 -------------
> util/Makefile.objs | 1 -
> util/qemu-thread-posix.c | 13 ++
> util/qemu-thread-win32.c | 25 ++++
> util/rfifolock.c | 78 -----------
> 37 files changed, 471 insertions(+), 474 deletions(-)
> delete mode 100644 include/qemu/rfifolock.h
> create mode 100644 stubs/iothread.c
> delete mode 100644 tests/test-rfifolock.c
> delete mode 100644 util/rfifolock.c
>
Ping? It's been almost a month and I still have one series to send
before hard freeze...
Paolo
- Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite,
Paolo Bonzini <=
- Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Fam Zheng, 2016/03/09
- Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Stefan Hajnoczi, 2016/03/16
- Re: [Qemu-devel] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Paolo Bonzini, 2016/03/16
- Re: [Qemu-devel] [Qemu-block] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Stefan Hajnoczi, 2016/03/17
- Re: [Qemu-devel] [Qemu-block] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Paolo Bonzini, 2016/03/17
- Re: [Qemu-devel] [Qemu-block] [PATCH 00/16] AioContext fine-grained locking, part 1 of 3, including bdrv_drain rewrite, Stefan Hajnoczi, 2016/03/18