Re: [Libunwind-devel] [RFC] _ULx86_64_tdep_trace returns off-by-one addr

From: Lassi Tuura
Subject: Re: [Libunwind-devel] [RFC] _ULx86_64_tdep_trace returns off-by-one addresses
Date: Sat, 3 Dec 2011 16:47:22 +0100


> As evidenced by this code in tests/Gtest-trace.c:
>  /* Allow one in difference in comparison, trace returns adjusted addresses. */
>  if (labs((unw_word_t) addresses[1][i] - (unw_word_t) addresses[2][i]) > 1)
>    {
>      printf ("FAILURE: backtrace() and unw_backtrace() addresses differ at %d: %p vs. %p\n",
>             i, addresses[1][n], addresses[2][n]);
>      ++num_errors;
>    }
> the fast trace adjusts return addresses by one for all frames, except the
> frame that actually triggered a signal (if any), and that is inconsistent
> with what the "slow trace" would produce.

As far as I understand, it's the slow trace that is actually reporting
the "wrong" address (off by one), and we actually want the behaviour of
the fast trace.

Without the "- d->use_prev_instr", the stack trace will have the next
instruction, which in some cases is in the next/wrong function.

I don't know how to fix the slow trace interface, though I wouldn't say
I spent much time scratching my head over that. My understanding was that
the past consensus was the clients should use unw_is_signal_frame() and
deduct one from the address reported (by slow trace) if that returns 0.
But almost none of the tests do. I don't know how many clients do...
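
A minimal sketch of that client-side rule, for illustration only.
lookup_address() is a hypothetical helper, not part of libunwind; the
is_signal_frame argument is assumed to carry the result of
unw_is_signal_frame() for the frame whose address is being adjusted.

```c
#include <stdint.h>

/* Hypothetical helper (not a libunwind function): turn an address
   reported by the slow trace into the address to use for symbol lookup.  */
uint64_t
lookup_address (uint64_t ip, int is_signal_frame)
{
  /* Signal frames are used as reported; a null address (end of trace)
     is left alone as well.  */
  if (is_signal_frame || ip == 0)
    return ip;

  /* Otherwise ip is a return address, i.e. the instruction after the
     call, so step back by one to land inside the calling instruction.  */
  return ip - 1;
}
```

With the addresses from the quoted session, lookup_address (0x401d8a, 0)
gives 0x401d89, which is what the fast trace reports for sighandler.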

> Aside from the inconsistency, this gives wrong address for the
> __restore_rt frame:
> (gdb) bt
> #0  do_backtrace () at ../../tests/Gtest-trace.c:97
> #1  0x0000000000401d8a in sighandler (signal=15, siginfo=0x7fffffffd270, context=0x7fffffffd140) at ../../tests/Gtest-trace.c:219
> #2  <signal handler called>
> #3  0x00007ffff7656d57 in kill () at ../sysdeps/unix/syscall-template.S:82
> #4  0x0000000000401e7a in main (argc=1, argv=0x7fffffffd768) at ../../tests/Gtest-trace.c:243
> (gdb) set $a =  (char **)&addresses[1]
> (gdb) p/a *($a++)
> $10 = 0x400e0f <do_backtrace+363>
> (gdb) 
> $11 = 0x401d89 <sighandler+113>
> (gdb) 
> $12 = 0x7ffff7656aef      <<< should be:  0x7ffff7656af0 <__restore_rt>
> (gdb) 
> $13 = 0x7ffff7656d57 <kill+7>
> (gdb) 
> $14 = 0x401e79 <main+237>
> (gdb) 
> $15 = 0x7ffff7641c4c <__libc_start_main+252>
> (gdb) 
> $16 = 0x400be8 <_start+40>
> (gdb) 
> $17 = 0x0

Hum. I think the real problem here is perhaps use_prev_instr has the wrong
value for the __restore_rt frame? I'd have to dig around in the code a bit,
but maybe the fast trace is not treating it the same as the slow trace?
(Signal frame, that is.)

> I wonder what the justification for "off by one" behavior is, and what
> are the ramifications of not doing that (in the trace returned, not during
> the actual unwind):

There was a fair amount of discussion on this in the past. As far as I
understand, fast trace is actually reporting the 'correct' addresses.
There's a bit more background on this issue in:

See especially the follow-up comments on the last two. Clients should also
deduct one unless unw_is_signal_frame() returns true. The (slow unwind)
API itself doesn't do it. With fast trace you can't, since you can no
longer call those functions, so the value is pre-computed into the result.
I am not sure it's actually possible to get a fully correct result with the
normal (slow trace) API.

IIRC in my tests the decrementing is definitely needed, or you'll get some
percentage of nonsensical results where function A calling B gets reported
as unrelated function C calling B, because C happens to follow the last
instruction of A, which happens to get profiled (or the address falls on
some other FDE boundary that you care about for some reason, which could
happen with hot/cold splits).
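
To make the A/B/C case concrete, here is a small sketch with a made-up
symbol table. The names, addresses and sizes are all invented for
illustration, and it assumes A's final instruction is the call to B, so
the return address is the first byte of C.

```c
#include <stddef.h>
#include <stdint.h>

struct symbol
{
  const char *name;
  uint64_t start;
  uint64_t size;
};

/* Invented layout: A ends exactly where C begins.  */
const struct symbol symtab[] = {
  { "A", 0x1000, 0x20 },
  { "C", 0x1020, 0x30 },
};

/* Return the name of the symbol covering addr, or "?" if none does.  */
const char *
symbol_for (uint64_t addr)
{
  size_t i;

  for (i = 0; i < sizeof symtab / sizeof symtab[0]; ++i)
    if (addr >= symtab[i].start && addr - symtab[i].start < symtab[i].size)
      return symtab[i].name;
  return "?";
}
```

Here the return address of the call is 0x1020. symbol_for (0x1020)
resolves to "C", while symbol_for (0x1020 - 1) resolves to "A", the
actual caller.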

