Re: [RFC 0/3] qmp: make qmp_device

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine

From:	Paolo Bonzini
Subject:	Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine
Date:	Thu, 7 Sep 2023 13:28:55 +0200
User-agent:	Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.13.0

On 9/6/23 21:01, Stefan Hajnoczi wrote:

It is not safe to call drain_call_rcu() from qmp_device_add() because
some call stacks are not prepared for drain_call_rcu() to drop the Big
QEMU Lock (BQL).

For example, device emulation code is protected by the BQL but when it
calls aio_poll() -> ... -> qmp_device_add() -> drain_call_rcu() then the
BQL is dropped. See https://bugzilla.redhat.com/show_bug.cgi?id=2215192 for a
concrete bug of this type.

Another limitation of drain_call_rcu() is that it cannot be invoked within an
RCU read-side critical section since the reclamation phase cannot complete
until the end of the critical section. Unfortunately, call stacks have been
seen where this happens (see
https://bugzilla.redhat.com/show_bug.cgi?id=2214985).

I think the root cause here is that do_qmp_dispatch_bh is called on thewrong context, namely qemu_get_aio_context() instead ofiohandler_get_aio_context(). This is what causes it to move to the vCPUthread.

Auditing all subsystems that use iohandler_get_aio_context(), forexample via qemu_set_fd_handler(), together with bottom halves, would bea bit daunting.

I don't have any objection to this patch series actually, but I wouldlike to see if using the right AioContext also fixes the bug---and thentreat these changes as more of a cleanup. Coroutines are prettypervasive in QEMU and are not going away which, as you say in theupdated docs, makes drain_call_rcu_co() preferrable to drain_call_rcu().


Paolo

This patch series introduces drain_call_rcu_co(), which does the same thing as
drain_call_rcu() but asynchronously. By yielding back to the event loop we can
wait until the caller drops the BQL and leaves its RCU read-side critical
section.

Patch 1 changes HMP so that coroutine monitor commands yield back to the event
loop instead of running inside a nested event loop.

Patch 2 introduces the new drain_call_rcu_co() API.

Patch 3 converts qmp_device_add() into a coroutine monitor command and uses
drain_call_rcu_co().

I'm sending this as an RFC because I don't have confirmation yet that the bugs
mentioned above are fixed by this patch series.

Stefan Hajnoczi (3):
   hmp: avoid the nested event loop in handle_hmp_command()
   rcu: add drain_call_rcu_co() API
   qmp: make qmp_device_add() a coroutine

  MAINTAINERS            |  2 ++
  docs/devel/rcu.txt     | 21 ++++++++++++++++
  qapi/qdev.json         |  1 +
  include/monitor/qdev.h |  3 ++-
  include/qemu/rcu.h     |  1 +
  util/rcu-internal.h    |  8 ++++++
  monitor/hmp.c          | 28 +++++++++++----------
  monitor/qmp-cmds.c     |  2 +-
  softmmu/qdev-monitor.c | 34 +++++++++++++++++++++++---
  util/rcu-co.c          | 55 ++++++++++++++++++++++++++++++++++++++++++
  util/rcu.c             |  3 ++-
  hmp-commands.hx        |  1 +
  util/meson.build       |  2 +-
  13 files changed, 140 insertions(+), 21 deletions(-)
  create mode 100644 util/rcu-internal.h
  create mode 100644 util/rcu-co.c

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [RFC 1/3] hmp: avoid the nested event loop in handle_hmp_command(), (continued)
- [RFC 3/3] qmp: make qmp_device_add() a coroutine, Stefan Hajnoczi, 2023/09/06
  - Re: [RFC 3/3] qmp: make qmp_device_add() a coroutine, Kevin Wolf, 2023/09/12
    - Re: [RFC 3/3] qmp: make qmp_device_add() a coroutine, Stefan Hajnoczi, 2023/09/12
- [RFC 2/3] rcu: add drain_call_rcu_co() API, Stefan Hajnoczi, 2023/09/06
  - Re: [RFC 2/3] rcu: add drain_call_rcu_co() API, Kevin Wolf, 2023/09/12
    - Re: [RFC 2/3] rcu: add drain_call_rcu_co() API, Stefan Hajnoczi, 2023/09/12
- Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Paolo Bonzini <=
  - Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Stefan Hajnoczi, 2023/09/07
    - Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Paolo Bonzini, 2023/09/07
    - Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Stefan Hajnoczi, 2023/09/07
    - Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Kevin Wolf, 2023/09/12
    - Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine, Paolo Bonzini, 2023/09/13

Prev by Date: Re: [PATCH 1/2] hw/cxl: Add utility functions decoder interleave ways and target count.
Next by Date: Re: [PATCH v2 3/3] gdbstub: replace exit(0) with proper shutdown
Previous by thread: Re: [RFC 2/3] rcu: add drain_call_rcu_co() API
Next by thread: Re: [RFC 0/3] qmp: make qmp_device_add() a coroutine
Index(es):
- Date
- Thread