Re: [Libunwind-devel] Port fasttrace to x86

From: Lassi Tuura
Subject: Re: [Libunwind-devel] Port fasttrace to x86
Date: Fri, 11 Nov 2011 19:22:39 +0100

Hi Paul,

> Also, using e.g. Ubuntu 10.04 GCC (4.4.3-4ubuntu5), one needs to build
> with -fasynchronous-unwind-tables to get the unwind info generated at all.
> But the worst that would happen is that the fast trace will fail and fall
> back to the slow trace, right?

Uh oh. I forgot about that - on x86_64 async unwind tables are the default.
I am afraid the fast trace blindly trusts the tables in the name of speed. At
least you didn't override ACCESS_MEM_FAST() for x86, so it's less likely
to crash :-)

Does the x86 implementation work if you profile something math-heavy for
example - something system-built, without async unwind tables? If it works
without crashing, does it generate truncated stacks with last one or two
levels, just below signal frame, being junk?

BTW that shouldn't be dependent on fast vs. slow trace. If you don't have
async unwind tables, standard trace should have just as much trouble -
except heuristic / frame-based unwinding may help it. Ah, hmm, you'll want
to change x86/Gstep.c to record the frame walk results for heuristic steps.

You'll probably also need the PLT matching bits I added for x86_64 to get
the fast trace to work without falling back to slow trace too often.

Also a reminder that at least for x86_64, async unwind tables will have
inaccuracies with GCC prior to 4.5.0.

> Hmm, that's not what appears to happen though: I get _ULx86_tdep_trace
> return 0, which does not trigger fall back to slow_trace.

Hm, no. If the unwind info is there but incorrect, it will just blindly
trust it - there's no way to know the difference. Without ACCESS_MEM_FAST()
you may get an error from dwarf_get() which should stop the unwind with an

Also the fast trace relies on unw_step() informing it about frames it won't
understand. If it doesn't find UNW_X86_FRAME_OTHER, it assumes it's ok to
look at it, and just stops unwinding if it ends up in bad addresses. This
is why you need to copy some bits to x86/Gstep.c.

These are needed because there are a few common but harmless places which
totally lack unwind info, e.g. at crt0 or thread start level, plus a small
percentage of frames with inaccurate unwind info - and the fast trace
unfortunately gives up too often without those assumptions.

> Also, if I build with above GCC and CFLAGS = '-g -O0', then I get a crash
> during fast trace ;-(
> Given above problems, perhaps we should enable 32-bit fast trace under
> configure --enable-fast-trace-for-x86 or some such?

Possibly. Let's see what we conclude of the items above.

Thanks for the other fixes. For this one:

+  unw_word_t cache_size = 1u << cache->log_size;
+  unw_word_t slot = ((eip * 0x9e3779b97f4a7c16ULL) >> 43) & (cache_size-1);

I meant that there's no need to do this in 64-bit math for x86. Just use a
32-bit multiplier and shift by 11 -- %eip will have top 32 bits zero if
multiplied as a 64-bit value, so maybe the compiler optimised it anyway.
I am not entirely sure how dense that makes the hashes so you may need to
tweak the logic a bit.
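
To make the suggestion concrete, here is a rough sketch of both variants
side by side. The function names are made up for illustration, and the
32-bit multiplier 0x9e3779b9 is my assumption (it is simply the upper half
of the 64-bit constant in the patch); only the multiply/shift/mask shape
comes from the lines quoted above. Both keep the top 21 bits of the
product before masking, so log_size is assumed to be at most 21.

```c
#include <stdint.h>

/* Hypothetical helper names; not part of libunwind. */

/* 64-bit math, as in the quoted patch: take the top 21 bits of the
   64-bit product (shift by 43), then mask down to the cache size. */
static uint32_t slot_64bit_math(uint32_t eip, unsigned log_size)
{
  uint64_t cache_size = (uint64_t)1 << log_size;
  uint64_t product = (uint64_t)eip * 0x9e3779b97f4a7c16ULL;
  return (uint32_t)((product >> 43) & (cache_size - 1));
}

/* 32-bit math: take the top 21 bits of the 32-bit product (shift by
   11). The multiplier is an assumed choice, not taken from the patch. */
static uint32_t slot_32bit_math(uint32_t eip, unsigned log_size)
{
  uint32_t cache_size = 1u << log_size;
  uint32_t product = eip * 0x9e3779b9u; /* wraps modulo 2^32 */
  return (product >> 11) & (cache_size - 1);
}
```

The two functions will not generally produce the same slot for a given
%eip, which is fine for a cache index, but the distribution of the 32-bit
version would need checking against real return addresses.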

