
Re: [RFC v1] util/aio: Keep notification disabled as much as possible


From: Stefan Hajnoczi
Subject: Re: [RFC v1] util/aio: Keep notification disabled as much as possible
Date: Thu, 7 Jul 2022 10:04:23 +0100

On Wed, Jul 06, 2022 at 10:12:14PM +0800, Chao Gao wrote:
> On Wed, Jul 06, 2022 at 12:59:29PM +0100, Stefan Hajnoczi wrote:
> >On Fri, Jul 01, 2022 at 05:13:48PM +0800, Chao Gao wrote:
> >> When measuring FIO read performance (cache=writethrough, bs=4k,
> >> iodepth=64) in VMs, we observe ~80K/s notifications (e.g.,
> >> EPT_MISCONFIG) from guest to qemu.
> >
> >It's not clear to me what caused the frequent poll_set_started(ctx,
> >false) calls and whether this patch is correct. I have posted some
> 
> Me neither. That's why the patch was marked RFC.

Your explanation about worker threads calling qemu_bh_schedule() ->
aio_notify() makes sense to me. Polling mode is stopped prematurely when
the event loop goes to process the BH. There is no need to stop polling
mode; the next event loop iteration could resume it.
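
To make that chain concrete, here is a standalone sketch (plain C using an
eventfd, not QEMU code; the names worker/notifier/bh_scheduled are purely
illustrative) of a worker thread waking the event loop thread the way
qemu_bh_schedule() -> aio_notify() does:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/eventfd.h>

    static int notifier;              /* stand-in for ctx->notifier */
    static atomic_bool bh_scheduled;  /* stand-in for a pending bottom half */

    static void *worker(void *arg)
    {
        uint64_t one = 1;

        usleep(10 * 1000);                   /* pretend to finish an I/O request */
        atomic_store(&bh_scheduled, true);   /* qemu_bh_schedule() ...            */
        write(notifier, &one, sizeof(one));  /* ... which ends up in aio_notify() */
        return NULL;
    }

    int main(void)
    {
        pthread_t tid;
        uint64_t val;

        notifier = eventfd(0, 0);
        pthread_create(&tid, NULL, worker, NULL);

        /* Busy-poll phase: the loop spins instead of blocking, so guest
         * notification stays disabled.  A pending BH ends the spin early. */
        while (!atomic_load(&bh_scheduled)) {
            /* run_poll_handlers() would check the virtqueues here */
        }

        /* Drain the notification and run the BH.  Nothing in this step
         * requires re-enabling guest notification; polling could resume
         * on the very next loop iteration. */
        read(notifier, &val, sizeof(val));
        printf("bottom half runs here; polling can resume\n");

        pthread_join(tid, NULL);
        close(notifier);
        return 0;
    }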

> >questions below about the nature of this issue.
> >
> >If ctx->fdmon_ops->wait() is called while polling is still enabled then
> >hangs or unnecessary latency can occur. For example, consider an fd
> >handler that temporarily suppresses fd activity between poll start/end.
> >The thread would be blocked in ->wait() and the fd will never become
> >readable. Even if a hang doesn't occur because there is a timeout, there
> >would be extra latency because the fd doesn't become readable and we
> >have to wait for the timeout to expire so we can poll again. So we must
> >be sure it's safe to leave polling enabled across the event loop and I'm
> >not sure if this patch is okay.
> 
> Thanks for the explanation.
> 
> in aio_poll(),
> 
>     if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
>         ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
>     }
> 
> if @timeout is zero, then ctx->fdmon_ops->wait() won't be called.
> 
> In cases #2 and #3 below, it is guaranteed that @timeout is zero after
> try_poll_mode() returns. So, I think it is safe to keep polling enabled
> for these two cases.

I think you're right; it's safe since the timeout is zeroed.
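
Written out, the argument is roughly this (comments added around the same
aio_poll() lines quoted above; this is a reading of the RFC's behaviour, not
new code):

    /*
     * After try_poll_mode() with the RFC applied:
     *   cases #1 and #4: poll_set_started(ctx, false) ran, notification is on
     *   cases #2 and #3: polling stays enabled, but *timeout was set to 0
     * So whenever polling is left enabled, timeout is 0 and the wait below
     * cannot block on an fd whose activity is suppressed while polling.
     */
    if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
        ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
    }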

> 
> >
> >> 
> >> Currently, poll_set_started(ctx,false) is called in try_poll_mode() to
> >> enable virtqueue notification in the 4 cases below:
> >> 
> >> 1. ctx->poll_ns is 0
> >> 2. a zero timeout is passed to try_poll_mode()
> >> 3. polling succeeded but reported as no progress
> >> 4. polling failed and reported as no progress
> >> 
> >> To minimize unnecessary guest notifications, keep notification disabled
> >> when possible, i.e., when polling is enabled and the last polling attempt
> >> didn't fail.
> >
> >What is the exact definition of polling success/failure?
> 
> Polling success: found some events pending.
> Polling failure: timeout.
> 
> success/failure are used because I saw the comment below in
> run_poll_handlers_once(), so I thought they were well-known terms.
> 
>             /*
>              * Polling was successful, exit try_poll_mode immediately
>              * to adjust the next polling time.
>              */
> 
> >
> >> 
> >> Keep notification disabled for cases #2 and #3; handle case #2 simply
> >> by a call
> >
> >Did you see case #2 happening often? What was the cause?
> 
> I think so. I can add some tracepoints and collect statistics.
> 
> IMO (of course, I can be totally wrong), the cause is:
> when a worker thread in the thread pool completes an I/O request, a bottom
> half is queued by worker_thread()->qemu_bh_schedule(). Pending bottom
> halves lead to aio_compute_timeout() setting the timeout to zero, and then
> a zero timeout is passed to try_poll_mode().
> 
> >
> >> of run_poll_handlers_once(), and for case #3, differentiate successful
> >> from failed polling and skip the call of poll_set_started(ctx,false) for
> >> successful ones.
> >
> >This is probably the most interesting case. When polling detects an
> >event, that's considered "progress", except for aio_notify() events.
> >aio_notify() is an internal event for restarting the event loop when
> >something has changed (e.g. fd handlers have been added/removed). I
> >wouldn't expect it to interfere with polling frequently since that requires
> >another QEMU thread doing something to the AioContext, which should be
> >rare.
> >
> >Was aio_notify() interfering with polling in your case? Do you know why?
> 
> Yes, it was. The reason is the same: after finishing I/O requests, worker
> threads queue bottom halves, and queuing a bottom half calls aio_notify().

Does this patch solve the issue? The idea is to defer
poll_set_started(false) for as long as possible.

diff --git a/util/aio-posix.c b/util/aio-posix.c
index 731f3826c0..536f8b2534 100644
--- a/util/aio-posix.c
+++ b/util/aio-posix.c
@@ -591,12 +591,6 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
             return true;
         }
     }
-
-    if (poll_set_started(ctx, ready_list, false)) {
-        *timeout = 0;
-        return true;
-    }
-
     return false;
 }

@@ -657,6 +651,11 @@ bool aio_poll(AioContext *ctx, bool blocking)
      * system call---a single round of run_poll_handlers_once suffices.
      */
     if (timeout || ctx->fdmon_ops->need_wait(ctx)) {
+        if (poll_set_started(ctx, &ready_list, false)) {
+            timeout = 0;
+            progress = true;
+        }
+
         ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
     }
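
The idea behind the placement is that notification is re-enabled only on the
path that is actually about to enter ctx->fdmon_ops->wait(); zero-timeout
iterations caused by pending bottom halves (case #2) or by successful polls
(case #3) skip that block entirely, so polling stays enabled across them.
Zeroing timeout and setting progress when poll_set_started() reports a change
mirrors what try_poll_mode() did before, so the wait stays non-blocking in
that case.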

