qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [patch v0] qapi/qmp: Add timestamps to qmp command responses.


From: Denis Plotnikov
Subject: Re: [patch v0] qapi/qmp: Add timestamps to qmp command responses.
Date: Tue, 27 Sep 2022 14:59:20 +0300
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.11.0


On 27.09.2022 09:04, Markus Armbruster wrote:
Daniel P. Berrangé <berrange@redhat.com> writes:

On Mon, Sep 26, 2022 at 12:59:40PM +0300, Denis Plotnikov wrote:
Add "start" & "end" timestamps to qmp command responses.
It's disabled by default, but can be enabled with 'timestamp=on'
monitor's parameter, e.g.:
     -chardev  socket,id=mon1,path=/tmp/qmp.socket,server=on,wait=off
     -mon chardev=mon1,mode=control,timestamp=on
I'm not convinced a cmdline flag is the right approach here.

I think it ought be something defined by the QMP spec.
The QMP spec is docs/interop/qmp-spec.txt.  The feature needs to be
defined there regardless of how we control it.
ok, thanks for pointing out

The "QMP" greeting should report "timestamp" capabilities.

The 'qmp_capabilities' command can be used to turn on this
capability for all commands henceforth.
Yes, this is how optional QMP protocol features should be controlled.

Bonus: control is per connection, not just globally.

As an option extra, the 'execute' command could gain a
parameter to allow this to be requested for only an
individual command.
Needs a use case.

Alternatively we could say the overhead of adding the timestmaps
is small enough that we just add this unconditionally for
everything hence, with no opt-in/opt-out.
Yes, because the extension is backwards compatible.

May be it worth to send the timestamps always in the response if doesn't contradicts with anything and doesn't bring any unnecessary data overhead.

From the other hand turning it on via qmp capabilities seems to be more flexible solution.


Aside: qmp-spec.txt could be clearer on what that means.

Example of result:

     ./qemu/scripts/qmp/qmp-shell /tmp/qmp.socket

     (QEMU) query-status
     {"end": {"seconds": 1650367305, "microseconds": 831032},
      "start": {"seconds": 1650367305, "microseconds": 831012},
      "return": {"status": "running", "singlestep": false, "running": true}}

The responce of the qmp command contains the start & end time of
the qmp command processing.
Seconds and microseconds since when?  The update to qmp-spec.txt should
tell.

Why split the time into seconds and microseconds?  If you use
microseconds since the Unix epoch (1970-01-01 UTC), 64 bit unsigned will
result in a year 586524 problem:

     $ date --date "@`echo '2^64/1000000' | bc`"
     Wed Jan 19 09:01:49 CET 586524

Even a mere 53 bits will last until 2255.
This is Just for convenience, may be it's too much and timestamp in msec if enough

These times may be helpful for the management layer in understanding of
the actual timeline of a qmp command processing.
Can you explain the problem scenario in more detail.
Yes, please, because:

The mgmt app already knows when it send the QMP command and knows
when it gets the QMP reply.  This covers the time the QMP was
queued before processing (might be large if QMP is blocked on
another slow command) , the processing time, and the time any
reply was queued before sending (ought to be small).

So IIUC, the value these fields add is that they let the mgmt
app extract only the command processing time, eliminating
any variance do to queue before/after.
So the scenario is the following: we need a means to understand from the management layer prospecitive of what is the timeline of the command execution. This is needed for a problem resolving if a qmp command executes for too long from the management layer point of view. Specifically, management layer sees the execution time as "management_layer_internal_routine_time" + "qemu_dispatching_time" + "qemu_qmp_command_execution_time". Suggested qmp command timestaps gives "qemu_command_execution_time". Management layer calculates "management_layer_internal_routine_time" internally. Using those two things we can calculate "qemu_dispatching_time" and decide where the potential delays comes from. This will gives us a direction of further problem investigation.

Suggested-by: Andrey Ryabinin <arbn@yandex-team.ru>
Signed-off-by: Denis Plotnikov <den-plotnikov@yandex-team.ru>



reply via email to

[Prev in Thread] Current Thread [Next in Thread]