Re: [Qemu-devel] [RFC 0/5] nbd: Adapt for dataplane
From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [RFC 0/5] nbd: Adapt for dataplane
Date: Tue, 3 Jun 2014 16:38:00 +0200
User-agent: Mutt/1.5.23 (2014-03-12)
On Sat, May 31, 2014 at 08:43:07PM +0200, Max Reitz wrote:
> For the NBD server to work with dataplane, it needs to correctly access
> the exported BDS. It makes the most sense to run both in the same
> AioContext, therefore this series implements methods for tracking a
> BDS's AioContext and makes NBD make use of this for keeping the clients
> connected to that BDS in the same AioContext.
>
> The reason this is an RFC and not a PATCH is my inexperience with AIO,
> coroutines and the like. Also, I'm not sure about what to do about the
> coroutines. The NBD server has up to two coroutines per client: One for
> receiving and one for sending. Theoretically, both have to be
> "transferred" to the new AioContext if it is changed; however, as far as
> I see it, coroutines are not really bound to an AioContext, they are
> simply run in the AioContext entering them. Therefore, I think a
> transfer is unnecessary. All coroutines are entered from nbd_read() and
> nbd_restart_write(), both of which are AIO routines registered via
> aio_set_fd_handler2().
>
> As bs_aio_detach() unregisters all of these routines, the coroutines can
> no longer be entered, but only after bs_aio_attach() is called again.
> Then, when they are called, they will enter the coroutines in the new
> AioContext. Therefore, I think an explicit transfer unnecessary.
This reasoning sounds correct.
> However, if bs_aio_detach() is called from a different thread than the
> old AioContext is running in, we may still have coroutines running for
> which we should wait before returning from bs_aio_detach().
The bdrv_attach/detach_aio_context() APIs have rules regarding where
these functions are called from:
/**
 * bdrv_set_aio_context:
 *
 * Changes the #AioContext used for fd handlers, timers, and BHs by this
 * BlockDriverState and all its children.
 *
 * This function must be called from the old #AioContext or with a lock held so
 * the old #AioContext is not executing.
 */
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context);
and:
/* Remove fd handlers, timers, and other event loop callbacks so the event
 * loop is no longer in use.  Called with no in-flight requests and in
 * depth-first traversal order with parents before child nodes.
 */
void (*bdrv_detach_aio_context)(BlockDriverState *bs);

/* Add fd handlers, timers, and other event loop callbacks so I/O requests
 * can be processed again.  Called with no in-flight requests and in
 * depth-first traversal order with child nodes before parent nodes.
 */
void (*bdrv_attach_aio_context)(BlockDriverState *bs,
                                AioContext *new_context);
These rules ensure that it's safe to perform these operations. You
don't have to support arbitrary callers in NBD either.
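To make this concrete, the attach/detach hooks in nbd.c could look roughly like the sketch below. This is not compilable on its own: it uses QEMU-internal APIs, and the hook names, the exp->ctx field, and the per-client list are assumptions inferred from the series description, not the actual patches.

```c
/* Sketch only.  Detach unregisters each client socket from the old
 * AioContext; attach re-registers it in the new one, so nbd_read()
 * and nbd_restart_write() fire there and enter the receive/send
 * coroutines in the new context.  Per the rules quoted above, both
 * hooks run with no in-flight requests. */

static void bs_aio_detach(void *opaque)
{
    NBDExport *exp = opaque;
    NBDClient *client;

    QTAILQ_FOREACH(client, &exp->clients, next) {
        /* Stop monitoring the socket in the old AioContext */
        aio_set_fd_handler2(exp->ctx, client->sock,
                            NULL, NULL, NULL, NULL);
    }
    exp->ctx = NULL;
}

static void bs_aio_attach(AioContext *ctx, void *opaque)
{
    NBDExport *exp = opaque;
    NBDClient *client;

    exp->ctx = ctx;
    QTAILQ_FOREACH(client, &exp->clients, next) {
        /* Resume monitoring in the new AioContext */
        aio_set_fd_handler2(ctx, client->sock,
                            nbd_can_read, nbd_read, nbd_restart_write,
                            client);
    }
}
```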
> But because of my inexperience with coroutines, I'm not sure. I now have
> these patches nearly unchanged here for about a week and I'm looking for
> ways of testing them, but so far I could only test whether the old use
> cases work, but not whether they will work for what they are intended to
> do: With BDS changing their AioContext.
>
> So, because I'm not sure what else to do and because I don't know how to
> test multiple AIO threads (how do I move a BDS into another iothread?)
> I'm just sending this out as an RFC.
Use a Linux guest with virtio-blk:
  qemu -drive if=none,file=test.img,id=drive0 \
       -object iothread,id=iothread0 \
       -device virtio-blk-pci,drive=drive0,x-iothread=iothread0 \
       ...
Once the guest has booted the virtio-blk device will be in dataplane
mode. That means drive0's BlockDriverState ->aio_context will be the
IOThread AioContext and not the global qemu_aio_context.
Now you can exercise the run-time NBD server over QMP and check that
things still work.  For example, try running a few instances of
"dd if=/dev/vdb of=/dev/null iflag=direct" inside the guest to stress
guest I/O.
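For reference, starting the run-time NBD server over QMP looks roughly like this (the socket address and port here are arbitrary examples):

```
{ "execute": "nbd-server-start",
  "arguments": { "addr": { "type": "inet",
                           "data": { "host": "127.0.0.1",
                                     "port": "10809" } } } }
{ "execute": "nbd-server-add",
  "arguments": { "device": "drive0" } }
```

With the export active, changing the drive's AioContext (e.g. when dataplane starts) exercises exactly the attach/detach path this series adds.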
Typically what happens if code is not dataplane-aware is that a deadlock
or crash occurs due to race conditions between the QEMU main loop and
the IOThread for this virtio-blk device.
For an overview of dataplane programming concepts, see:
https://lists.gnu.org/archive/html/qemu-devel/2014-05/msg01436.html
Stefan