From: Stefan Hajnoczi
Subject: Re: [Qemu-devel] [regression] dataplane: throughput -40% by commit 580b6b2aa2
Date: Mon, 30 Jun 2014 10:08:50 +0200
User-agent: Mutt/1.5.23 (2014-03-12)
On Sat, Jun 28, 2014 at 05:58:58PM +0800, Ming Lei wrote:
> On Sat, Jun 28, 2014 at 5:51 AM, Paolo Bonzini <address@hidden> wrote:
> > Il 27/06/2014 20:01, Ming Lei ha scritto:
> >
> >> I just implemented plug&unplug based batching, and it is working now,
> >> but throughput still has no obvious improvement.
> >>
> >> The load in the IOThread looks a bit low, so I am wondering if there
> >> is a blocking point caused by the QEMU block layer.
> >
> >
> > What does perf say? Also, you can try using the QEMU trace subsystem and
> > see where the latency goes.
>
> Here are some test results against 8589744aaf07b62 of
> upstream QEMU; the tests were done on my 2-core (4-thread)
> laptop:
>
> 1. With my draft batch patches[1] (only Linux AIO supported now):
> - throughput: +16% compared to upstream QEMU
> - average time spent in handle_notify(): 310us
> - average time between two handle_notify() calls: 1591us
>   (this time reflects the latency of handling host_notifier)
16% is still a worthwhile improvement. I guess batching only benefits
aio=native since the threadpool ought to do better when it receives
requests as soon as possible.
A patch or an RFC would be welcome.
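For reference, averages like the ones quoted above can be recomputed from a trace log. A minimal sketch, assuming the log has been reduced to "<timestamp_us> <event>" lines (for example by post-processing the output of scripts/simpletrace.py; the exact format here is an assumption, and the sample values are made up):

```shell
# Sample trace reduced to "<timestamp_us> <event>" lines (format assumed):
cat > trace.log <<'EOF'
100 handle_notify
1700 handle_notify
3280 handle_notify
EOF
# Mean gap between consecutive handle_notify calls:
awk '$2 == "handle_notify" { if (prev) { sum += $1 - prev; n++ } prev = $1 }
     END { if (n) printf "avg interval: %.0f us\n", sum / n }' trace.log
```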
> 2. The same tests on the 2.0.0 release (which uses the custom Linux AIO):
> - average time spent in handle_notify(): 68us
> - average time between two handle_notify() calls: 269us
>   (this time reflects the latency of handling host_notifier)
>
> From the above tests, it looks like the root cause is late notify
> handling: the QEMU block layer is 4 times slower than the custom
> Linux AIO code previously used by dataplane.
Try:
$ perf record -e 'syscalls:*' --tid <iothread-tid>
^C
$ perf script   # shows the trace log
The difference between syscalls in QEMU 2.0 and qemu.git/master could
reveal the problem.
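To make that comparison quicker, the "perf script" output can be summarized by tracepoint name. A sketch, assuming the usual perf script line layout where the tracepoint name appears as a "syscalls:..." token (the sample lines below are made up):

```shell
# Sample "perf script" output lines (layout assumed):
cat > syscalls.txt <<'EOF'
qemu-system-x86 1234 [000] 100.000001: syscalls:sys_enter_io_submit: ...
qemu-system-x86 1234 [000] 100.000420: syscalls:sys_exit_io_submit: ...
qemu-system-x86 1234 [000] 100.000900: syscalls:sys_enter_io_submit: ...
EOF
# Count tracepoints by name, most frequent first:
grep -oE 'syscalls:[a-z0-9_]+' syscalls.txt | sort | uniq -c | sort -rn
```

Running this once against 2.0 and once against qemu.git/master makes any extra or missing syscalls stand out immediately.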
Using perf you can also trace ioeventfd signalling in the host kernel
and compare it against QEMU's handle_notify entry/return. It may be
easiest to use the ftrace trace backend in QEMU, which writes through
the ftrace marker so the QEMU trace is unified with the host kernel
trace (./configure --enable-trace-backend=ftrace, and see the ftrace
section in QEMU's docs/tracing.txt).
This way you can see whether the ioeventfd signal -> handle_notify()
entry latency has increased or whether something else is going on.
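Once the unified trace is reduced to "<timestamp_us> <event>" lines, the signal-to-handler latency can be computed per pair. A sketch with made-up event names and values (neither is QEMU's actual trace output; adjust to whatever the real log contains):

```shell
# Sample unified trace reduced to "<timestamp_us> <event>" lines
# (event names and values are assumptions for illustration):
cat > unified.log <<'EOF'
100 ioeventfd_signal
350 handle_notify
1000 ioeventfd_signal
1100 handle_notify
EOF
# For each ioeventfd signal, print the delay until the next handle_notify:
awk '$2 == "ioeventfd_signal" { t = $1 }
     $2 == "handle_notify" && t { printf "latency: %d us\n", $1 - t; t = 0 }' unified.log
```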
Stefan