Re: [RFC v1] util/aio: Keep notification disabled as much as possible


From: Stefan Hajnoczi
Subject: Re: [RFC v1] util/aio: Keep notification disabled as much as possible
Date: Wed, 6 Jul 2022 12:59:29 +0100

On Fri, Jul 01, 2022 at 05:13:48PM +0800, Chao Gao wrote:
> When measuring FIO read performance (cache=writethrough, bs=4k, iodepth=64) in
> VMs, we observe ~80K/s notifications (e.g., EPT_MISCONFIG) from guest to qemu.

It's not clear to me what caused the frequent poll_set_started(ctx,
false) calls and whether this patch is correct. I have posted some
questions below about the nature of this issue.

If ctx->fdmon_ops->wait() is called while polling is still enabled then
hangs or unnecessary latency can occur. For example, consider an fd
handler that temporarily suppresses fd activity between poll start/end.
The thread would be blocked in ->wait() and the fd will never become
readable. Even if a hang doesn't occur because there is a timeout, there
would be extra latency because the fd doesn't become readable and we
have to wait for the timeout to expire before we can poll again. So we must
be sure it's safe to leave polling enabled across the event loop, and I'm
not sure this patch is safe in that respect.
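
For reference, the ordering I'm worried about is roughly this (a simplified
sketch of aio_poll() in util/aio-posix.c, not the literal code; need_wait(),
dispatch, and other details are omitted):

    static bool aio_poll_sketch(AioContext *ctx, bool blocking)
    {
        AioHandlerList ready_list = QLIST_HEAD_INITIALIZER(ready_list);
        int64_t timeout = blocking ? aio_compute_timeout(ctx) : 0;
        bool progress;

        /* 1. Busy-poll. Today an unsuccessful poll ends with
         *    poll_set_started(ctx, &ready_list, false), i.e. virtqueue
         *    notifications are re-enabled before we go to sleep.
         */
        progress = try_poll_mode(ctx, &ready_list, &timeout);

        /* 2. Block in ppoll()/epoll/io_uring. If notifications were left
         *    disabled and nothing else makes the fd readable, we sit here
         *    until the timeout expires (or indefinitely without one).
         */
        if (!progress && timeout) {
            ctx->fdmon_ops->wait(ctx, &ready_list, timeout);
        }

        /* 3. Dispatch ready handlers, etc. (omitted) */
        return progress;
    }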

> 
> Currently, poll_set_started(ctx,false) is called in try_poll_mode() to enable
> virtqueue notification in the following 4 cases:
> 
> 1. ctx->poll_ns is 0
> 2. a zero timeout is passed to try_poll_mode()
> 3. polling succeeded but reported as no progress
> 4. polling failed and reported as no progress
> 
> To minimize unnecessary guest notifications, keep notification disabled whenever
> possible, i.e., when polling is enabled and the last poll didn't fail.

What is the exact definition of polling success/failure?

> 
> Keep notification disabled for cases #2 and #3; handle case #2 simply by a call

Did you see case #2 happening often? What was the cause?

> to run_poll_handlers_once(); for case #3, differentiate successful and failed
> polling and skip the call to poll_set_started(ctx,false) when polling succeeded.

This is probably the most interesting case. When polling detects an
event, that's considered "progress", except for aio_notify() events.
aio_notify() is an internal event for restarting the event loop when
something has changed (e.g. fd handlers have been added/removed). I
wouldn't expect it to interfere with polling frequently since that requires
another QEMU thread doing something to the AioContext, which should be
rare.
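
(For context, the sort of thing that generates those aio_notify() wakeups is
another thread poking the AioContext, e.g. something like the hypothetical
snippet below; poke_iothread() and my_bh() are made-up names, only
aio_bh_schedule_oneshot() and aio_notify() are real APIs.)

    #include "qemu/osdep.h"
    #include "block/aio.h"

    static void my_bh(void *opaque)
    {
        /* runs later in the IOThread's event loop */
    }

    /* Hypothetical helper running in some other thread. */
    static void poke_iothread(AioContext *ctx)
    {
        /* Scheduling a bottom half from another thread kicks the event
         * loop via aio_notify()...
         */
        aio_bh_schedule_oneshot(ctx, my_bh, NULL);

        /* ...and so does adding/removing fd handlers, or a bare notify.
         * run_poll_handlers() treats that wakeup as "no progress".
         */
        aio_notify(ctx);
    }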

Was aio_notify() interfering with polling in your case? Do you know why?

> 
> With this patch applied, FIO seq-read performance (bs=4k, iodepth=64,
> cache=writethrough) in VMs increases from 330K/s to 413K/s IOPS.
> 
> Below are changes in VM-exit statistics:
> w/o this patch (duration:10s):
>            VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time    Avg time
> 
>      EPT_MISCONFIG     797440    99.34%    98.58%      0.39us     57.92us     0.60us ( +-   0.05% )
>          MSR_WRITE       3672     0.46%     1.15%      0.88us      4.97us     1.52us ( +-   0.53% )
> EXTERNAL_INTERRUPT       1638     0.20%     0.27%      0.59us     11.04us     0.81us ( +-   1.71% )
> 
> w/ this patch (duration:10s):
>            VM-EXIT    Samples  Samples%     Time%    Min Time    Max Time    Avg time
> 
>          MSR_WRITE       3524    84.11%    87.17%      0.86us      4.43us     1.74us ( +-   0.60% )
> EXTERNAL_INTERRUPT        515    12.29%    10.05%      0.64us      8.95us     1.37us ( +-   1.54% )
>      EPT_MISCONFIG        151     3.60%     2.79%      0.40us     52.07us     1.30us ( +-  31.44% )
> 
> Signed-off-by: Chao Gao <chao.gao@intel.com>
> ---
>  util/aio-posix.c | 16 ++++++++++++----
>  1 file changed, 12 insertions(+), 4 deletions(-)
> 
> diff --git a/util/aio-posix.c b/util/aio-posix.c
> index 731f3826c0..bd2076145b 100644
> --- a/util/aio-posix.c
> +++ b/util/aio-posix.c
> @@ -519,7 +519,7 @@ static bool remove_idle_poll_handlers(AioContext *ctx,
>   * Returns: true if progress was made, false otherwise
>   */
>  static bool run_poll_handlers(AioContext *ctx, AioHandlerList *ready_list,
> -                              int64_t max_ns, int64_t *timeout)
> +                              int64_t max_ns, int64_t *timeout, bool *success)
>  {
>      bool progress;
>      int64_t start_time, elapsed_time;
> @@ -553,6 +553,8 @@ static bool run_poll_handlers(AioContext *ctx, AioHandlerList *ready_list,
>          progress = true;
>      }
>  
> +    *success = !*timeout;
> +
>      /* If time has passed with no successful polling, adjust *timeout to
>       * keep the same ending time.
>       */
> @@ -577,22 +579,28 @@ static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
>  static bool try_poll_mode(AioContext *ctx, AioHandlerList *ready_list,
>                            int64_t *timeout)
>  {
> -    int64_t max_ns;
> +    int64_t max_ns, start_time;
> +    bool success = false;
>  
>      if (QLIST_EMPTY_RCU(&ctx->poll_aio_handlers)) {
>          return false;
>      }
>  
> +    if (!*timeout) {
> +        start_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
> +        return run_poll_handlers_once(ctx, ready_list, start_time, timeout);
> +    }
> +
>      max_ns = qemu_soonest_timeout(*timeout, ctx->poll_ns);
>      if (max_ns && !ctx->fdmon_ops->need_wait(ctx)) {
>          poll_set_started(ctx, ready_list, true);
>  
> -        if (run_poll_handlers(ctx, ready_list, max_ns, timeout)) {
> +        if (run_poll_handlers(ctx, ready_list, max_ns, timeout, &success)) {
>              return true;
>          }
>      }
>  
> -    if (poll_set_started(ctx, ready_list, false)) {
> +    if (!success && poll_set_started(ctx, ready_list, false)) {
>          *timeout = 0;
>          return true;
>      }
> -- 
> 2.25.1
> 
