[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] gdbstub: Switch to the thread receiving a signal
From: |
Alex Bennée |
Subject: |
Re: [PATCH v2] gdbstub: Switch to the thread receiving a signal |
Date: |
Wed, 20 Oct 2021 18:57:22 +0100 |
User-agent: |
mu4e 1.7.0; emacs 28.0.60 |
Pavel Labath <pavel@labath.sk> writes:
> On 20/10/2021 10:35, Alex Bennée wrote:
>> Pavel Labath <pavel@labath.sk> writes:
>>
>>> Respond with Txxthread:yyyy; instead of a plain Sxx to indicate which
>>> thread received the signal. Otherwise, the debugger will associate it
>>> with the main one. Also automatically select this thread, as that is
>>> what gdb expects.
>> Just for reference it's best to post vN's in a new thread as the
>> Replied-to field can confuse some of the automatic tools (b4, patchew
>> etc).
>
> Got it.
>>
>>> Signed-off-by: Pavel Labath <pavel@labath.sk>
>>> ---
>>> gdbstub.c | 8 ++-
>>> tests/tcg/multiarch/Makefile.target | 10 +++-
>>> .../gdbstub/test-thread-breakpoint.py | 60 +++++++++++++++++++
>>> 3 files changed, 75 insertions(+), 3 deletions(-)
>>> create mode 100644 tests/tcg/multiarch/gdbstub/test-thread-breakpoint.py
>>>
>>> diff --git a/gdbstub.c b/gdbstub.c
>>> index 36b85aa..23baaef 100644
>>> --- a/gdbstub.c
>>> +++ b/gdbstub.c
>>> @@ -3138,8 +3138,12 @@ gdb_handlesig(CPUState *cpu, int sig)
>>> tb_flush(cpu);
>>> if (sig != 0) {
>>> - snprintf(buf, sizeof(buf), "S%02x", target_signal_to_gdb(sig));
>>> - put_packet(buf);
>>> + gdb_set_stop_cpu(cpu);
>>> + g_string_printf(gdbserver_state.str_buf,
>>> + "T%02xthread:", target_signal_to_gdb(sig));
>>> + gdb_append_thread_id(cpu, gdbserver_state.str_buf);
>>> + g_string_append_c(gdbserver_state.str_buf, ';');
>>> + put_strbuf();
>>> }
>>> /* put_packet() might have detected that the peer terminated the
>>> connection. */
>>> diff --git a/tests/tcg/multiarch/Makefile.target
>>> b/tests/tcg/multiarch/Makefile.target
>>> index 6ccb592..c84683f 100644
>>> --- a/tests/tcg/multiarch/Makefile.target
>>> +++ b/tests/tcg/multiarch/Makefile.target
>>> @@ -70,11 +70,19 @@ run-gdbstub-qxfer-auxv-read: sha1
>>> --bin $< --test
>>> $(MULTIARCH_SRC)/gdbstub/test-qxfer-auxv-read.py, \
>>> "basic gdbstub qXfer:auxv:read support")
>>> +run-gdbstub-thread-breakpoint: testthread
>>> + $(call run-test, $@, $(GDB_SCRIPT) \
>>> + --gdb $(HAVE_GDB_BIN) \
>>> + --qemu $(QEMU) --qargs "$(QEMU_OPTS)" \
>>> + --bin $< --test
>>> $(MULTIARCH_SRC)/gdbstub/test-thread-breakpoint.py, \
>>> + "hitting a breakpoint on non-main thread")
>>> +
>>> else
>>> run-gdbstub-%:
>>> $(call skip-test, "gdbstub test $*", "need working gdb")
>>> endif
>>> -EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read
>>> +EXTRA_RUNS += run-gdbstub-sha1 run-gdbstub-qxfer-auxv-read \
>>> + run-gdbstub-thread-breakpoint
>>> # ARM Compatible Semi Hosting Tests
>>> #
>>> diff --git a/tests/tcg/multiarch/gdbstub/test-thread-breakpoint.py
>>> b/tests/tcg/multiarch/gdbstub/test-thread-breakpoint.py
>>> new file mode 100644
>>> index 0000000..798d508
>>> --- /dev/null
>>> +++ b/tests/tcg/multiarch/gdbstub/test-thread-breakpoint.py
>>> @@ -0,0 +1,60 @@
>>> +from __future__ import print_function
>>> +#
>>> +# Test auxiliary vector is loaded via gdbstub
>> I'm fairly sure this isn't what the test is doing...
> Oops. I'll fix this in the next version.
>
>>
>>> +#
>>> +# This is launched via tests/guest-debug/run-test.py
>>> +#
>>> +
>>> +import gdb
>>> +import sys
>>> +
>>> +failcount = 0
>>> +
>>> +def report(cond, msg):
>>> + "Report success/fail of test"
>>> + if cond:
>>> + print ("PASS: %s" % (msg))
>>> + else:
>>> + print ("FAIL: %s" % (msg))
>>> + global failcount
>>> + failcount += 1
>>> +
>>> +def run_test():
>>> + "Run through the tests one by one"
>>> +
>>> + sym, ok = gdb.lookup_symbol("thread1_func")
>>> + gdb.execute("b thread1_func")
>>> + gdb.execute("c")
>>> +
>>> + frame = gdb.selected_frame()
>>> + report(str(frame.function()) == "thread1_func", "break @
>>> %s"%frame)
>> I think we can do better here by checking gdb.selected_thread() and
>> ensuring the num (or global_num) is not 1. Also maybe check the
>> is_stopped() status:
>
> Checking `num` is a good idea. Checking is_stopped() doesn't hurt
> either, though I believe that (in all-stop mode) gdb just hardwires
> this to True for all threads, even those that are not actually stopped
> (more on that below).
>
> However, if that's ok with you, I think it'd still be nice to keep the
> frame check as well.
That's fine.
>> https://sourceware.org/gdb/current/onlinedocs/gdb/Threads-In-Python.html
>> I noticed while running the test that output still continued for
>> some
>> time from the other thread but it was still doing that pre this change
>> so I'm not quite sure what was going on there.
>>
>>> +
>>> +#
>>> +# This runs as the script it sourced (via -x, via run-test.py)
>>> +#
>>> +try:
>>> + inferior = gdb.selected_inferior()
>>> + arch = inferior.architecture()
>>> + print("ATTACHED: %s" % arch.name())
>>> +except (gdb.error, AttributeError):
>>> + print("SKIPPING (not connected)", file=sys.stderr)
>>> + exit(0)
>>> +
>>> +if gdb.parse_and_eval('$pc') == 0:
>>> + print("SKIP: PC not set")
>>> + exit(0)
>>> +
>>> +try:
>>> + # These are not very useful in scripts
>>> + gdb.execute("set pagination off")
>>> + gdb.execute("set confirm off")
>>> +
>>> + # Run the actual tests
>>> + run_test()
>>> +except (gdb.error):
>>> + print ("GDB Exception: %s" % (sys.exc_info()[0]))
>>> + failcount += 1
>>> + pass
>>> +
>>> +print("All tests complete: %d failures" % failcount)
>>> +exit(failcount)
>> I also tried some manual testing:
>> ➜ ./qemu-aarch64 -g 1234 tests/tcg/aarch64-linux-user/testthread
>> fish: “./qemu-aarch64 -g 1234 tests/tc…” terminated by signal SIGSEGV
>> (Address boundary error)
>> 🕙12:33:49 alex@zen:qemu.git/builds/arm.all on gdbstub/next [$!?⇡] took
>> 12s [⚡ SEGV]
>> ✗
>> where in the other window I did:
>> 0x00000000004005d0 in _start ()
>> (gdb) hbreak thread2_func
>> Hardware assisted breakpoint 1 at 0x400824: file
>> /home/alex/lsrc/qemu.git/tests/tcg/multiarch/testthread.c, line 34.
>> (gdb) hbreak thread1_func
>> Hardware assisted breakpoint 2 at 0x400798: file
>> /home/alex/lsrc/qemu.git/tests/tcg/multiarch/testthread.c, line 22.
>> (gdb) c
>> Continuing.
>> [New Thread 1.2748248]
>> Remote connection closed
>> which seems to indicate some problems with breaking on multiple
>> threads.
>> Maybe this is related to the weird output I was seeing above?
>>
>
> Yes, that's definitely related. What's happening is that the qemu does
> not stop other thread when one of them hits a breakpoint (or stops for
> any other reason) -- as far as I can tell it does not have any code
> which would even attempt to do that. This is why you're seeing the
> output even after the process is purportedly stopped.
>
> Things get even more interesting when you have two threads hitting a
> breakpoint simultaneously. At that point both of them will enter their
> gdb stubs and attempt to talk to gdb at the same time. As you can
> imagine, this cannot end well, and eventually the connection will
> become so messed up that one side just gives up and terminates the
> link.
>
> I am aware of this issue, and I (well, Stan (cc'ed) is, for the most
> part) looking for a way to fix it. If you have any ideas, we'd very
> much like to hear them. The way I see it, we need to implement some
> kind of a "stop the world" mechanism, to stop/interrupt all threads
> whenever the gdb stub becomes active (and make sure it can handle
> simultaneous debug events).
vm_stop(RUN_STATE_PAUSED) should do the trick. We do it elsewhere in
the gdbstub.
> However, I am don't know enough about qemu
> internals to tell how to actually go about doing that.
>
> My plan was to "get my feet wet" with a simple patch that improves the
> situation for the case when there are no simultaneous debug events,
> and eventually hopefully figure out a way how to address the bigger
> problem.
Great. Once you've made the changes I think the patch is ready to go in
and the larger questions can be dealt with later.
>
> regards,
> Pavel
--
Alex Bennée