qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] monitor: flush messages on abort


From: Markus Armbruster
Subject: Re: [PATCH] monitor: flush messages on abort
Date: Thu, 16 Nov 2023 16:14:07 +0100
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/28.2 (gnu/linux)

Steven Sistare <steven.sistare@oracle.com> writes:

> On 11/15/2023 11:15 AM, Markus Armbruster wrote:
>> Steven Sistare <steven.sistare@oracle.com> writes:
>>> On 11/15/2023 3:41 AM, Markus Armbruster wrote:
>>>> Daniel P. Berrangé <berrange@redhat.com> writes:
>>>>
>>>>> On Fri, Nov 03, 2023 at 03:51:00PM -0400, Steven Sistare wrote:
>>>>>> On 11/3/2023 1:33 PM, Daniel P. Berrangé wrote:
>>>>>>> On Fri, Nov 03, 2023 at 09:01:29AM -0700, Steve Sistare wrote:
>>>>>>>> Buffered monitor output is lost when abort() is called.  The pattern
>>>>>>>> error_report() followed by abort() occurs about 60 times, so valuable
>>>>>>>> information is being lost when the abort is called in the context of a
>>>>>>>> monitor command.
>>>>>>>
>>>>>>> I'm curious, was there a particular abort() scenario that you hit ?
>>>>>>
>>>>>> Yes, while tweaking the suspended state, and forgetting to add 
>>>>>> transitions:
>>>>>>
>>>>>>         error_report("invalid runstate transition: '%s' -> '%s'",
>>>>>>         abort();
>>>>>>
>>>>>> But I have previously hit this for other errors.
>>>>
>>>> Can you provide a reproducer?
>>>
>>> I sometimes hit this when developing new code.  I do not have a reproducer 
>>> for upstream
>>> branches. The patch is aimed at helping developers, not users.
>> 
>> I'm asking because I can't see how the error message could be lost.  A
>> reproducer would let me find out.  "Apply this set of broken patches,
>> then do that" would serve.
>
> $ patch -p1 << EOF
> diff --git a/monitor/qmp-cmds.c b/monitor/qmp-cmds.c
> index b0f948d..c9a3aee 100644
> --- a/monitor/qmp-cmds.c
> +++ b/monitor/qmp-cmds.c
> @@ -47,8 +47,12 @@ void qmp_quit(Error **errp)
>      qemu_system_shutdown_request(SHUTDOWN_CAUSE_HOST_QMP_QUIT);
>  }
>
> +#include "qemu/error-report.h"
> +
>  void qmp_stop(Error **errp)
>  {
> +    error_report("injected failure");
> +    abort();
>      /* if there is a dump in background, we should wait until the dump
>       * finished */
>      if (qemu_system_dump_in_progress()) {
> EOF
>
> # This example loses the error message:
>
> $ args='-display none -chardev 
> socket,id=mon1,server=on,path=mon1.sock,wait=off -mon mon1,mode=control'
> $ qemu-system-x86_64 $args < /dev/null &
> [1] 18048
> $ echo '{"execute":"qmp_capabilities"} 
> {"execute":"human-monitor-command","arguments":{"command-line":"stop"}}' | 
> ncat -U mon1.sock
> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, 
> "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}}
> {"return": {}}
> Ncat: Connection reset by peer.
> $
> [1]+  Aborted                 qemu-system-x86_64 $args < /dev/null
>
>
> # This example preserves the error message. I include it to show the 
> ncat-based test is valid.
>
> $ qemu-system-x86_64 $args < /dev/null &
> [1] 18060
> $ echo '{"execute":"qmp_capabilities"} {"execute":"stop"}' | ncat -U mon1.sock
> {"QMP": {"version": {"qemu": {"micro": 50, "minor": 1, "major": 8}, 
> "package": "v8.1.0-2976-g4025fde-dirty"}, "capabilities": ["oob"]}}
> {"return": {}}
> injected failure                <============= qemu stderr
> Ncat: Connection reset by peer.
> $
> [1]+  Aborted                 qemu-system-x86_64 $args < /dev/null
>
> - Steve

Reproduced, thanks!

Confirms my reading of the code.  Two cases:

1. Normal monitor output

   Flushed on newline.  A partial last line can be lost on crash or
   exit().

   Works as intended.

2. Output being captured for QMP command human-monitor-command

   Never flushed.  Instead, the entire buffer is included in the
   command's success response.  Naturally there is no response on crash.

   Works as intended.

Here's how to fish unflushed output out of a core dump:

    (gdb) bt
    #0  0x00007efeba52fecc in __pthread_kill_implementation () at 
/lib64/libc.so.6
    #1  0x00007efeba4dfab6 in raise () at /lib64/libc.so.6
    #2  0x00007efeba4c97fc in abort () at /lib64/libc.so.6
    #3  0x000055588dcc848c in qmp_stop (errp=0x0) at ../monitor/qmp-cmds.c:53
    #4  0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, 
qdict=0x555891d19200)
        at ../monitor/hmp-cmds.c:119
    #5  0x000055588dcc714e in handle_hmp_command_exec
        (mon=0x7ffc01d47cc0, cmd=0x55588f0689d0 <hmp_cmds+7760>, 
qdict=0x555891d19200) at ../monitor/hmp.c:1106
    #6  0x000055588dcc737b in handle_hmp_command
        (mon=0x7ffc01d47cc0, cmdline=0x7efea80036f4 "") at ../monitor/hmp.c:1158
    #7  0x000055588dcc8922 in qmp_human_monitor_command
        (command_line=0x7efea80036f0 "stop", has_cpu_index=false, cpu_index=0, 
errp=0x7ffc01d47dd0) at ../monitor/qmp-cmds.c:184

    [...]

    (gdb) up 4
    #4  0x000055588dcc36ab in hmp_stop (mon=0x7ffc01d47cc0, 
qdict=0x555891d19200)
        at ../monitor/hmp-cmds.c:119
    119     qmp_stop(NULL);
    (gdb) p mon->outbuf
    $1 = (GString *) 0x555890a65260
    (gdb) p *$
    $2 = {str = 0x7efea8002c00 "injected failure\r\n", len = 18, 
      allocated_len = 128}




reply via email to

[Prev in Thread] Current Thread [Next in Thread]