Re: [Libunwind-devel] libunwind x86-64 optimisations?

From: Daniel Jacobowitz
Subject: Re: [Libunwind-devel] libunwind x86-64 optimisations?
Date: Mon, 6 Jul 2009 08:06:27 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Mon, Jul 06, 2009 at 01:24:14PM +0200, Lassi Tuura wrote:
> 1) libunwind seems to be reliable but is not 100% async-signal-safe. In
> particular, if called from a signal handler (SIGPROF) at an inopportune
> time it may dead-lock. Specifically, if we get a profiling signal
> exactly when the dynamic linker is inside pthread_mutex_* or is already
> holding a lock, and libunwind calls into dl_iterate_phdr() (NB: from
> the same thread that already holds the lock or is trying to change it),
> bad things will happen, usually a dead-lock.
> I'm currently entertaining the theory that either walking the ELF
> headers in memory (without dl_iterate_phdr() and its locks) is less
> likely to crash than dl_iterate_phdr() is to dead-lock inside the
> dynamic linker, or I should try to discard profile signals while
> inside the dynamic linker.
> Thoughts?

It's a shame that GLIBC calls its internal _dl_debug_state directly,
and not through _r_debug.r_brk; if it did, you could intercept library
events and maintain a private copy of the list.  You can still read
the list on your own via _r_debug.r_map, although it may be in an
inconsistent state.  All that gets you, though, is the public fields
of the link map.  I don't remember for sure, but I believe libunwind
does need the program headers; it would have to cache a copy and
validate it using l_addr/l_name/l_ld.
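
For illustration, here is a minimal sketch of reading the list via
_r_debug.r_map without going through dl_iterate_phdr().  It assumes
glibc's <link.h> and a dynamically linked program; map_snapshot and
snapshot_link_maps are hypothetical names, not libunwind API.  It takes
no locks, so the list can be mid-update when a signal arrives, and it
copies only the public fields, not the program headers.

```c
#include <link.h>    /* glibc: struct link_map, struct r_debug, _r_debug */
#include <stddef.h>

/* Hypothetical private copy of the public link_map fields. */
struct map_snapshot {
    ElfW(Addr)       l_addr;  /* load bias of the object */
    const char      *l_name;  /* path as recorded by the dynamic linker */
    const ElfW(Dyn) *l_ld;    /* address of the object's dynamic section */
};

/* Walk the dynamic linker's list and copy up to max entries into out.
 * No locks are taken, so the result may reflect an inconsistent list
 * if the dynamic linker was interrupted while modifying it. */
static size_t snapshot_link_maps(struct map_snapshot *out, size_t max)
{
    size_t n = 0;
    const struct link_map *m;

    for (m = _r_debug.r_map; m != NULL && n < max; m = m->l_next) {
        out[n].l_addr = m->l_addr;
        out[n].l_name = m->l_name;
        out[n].l_ld   = m->l_ld;
        n++;
    }
    return n;
}
```

A later lookup could compare l_addr/l_name/l_ld against such a snapshot
to decide whether cached program headers are still valid.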

> 2) Try to determine which frames are "varying", i.e. use VLAs or
> alloca(). If there are none in the call stack, just cache the incoming
> CFA vs. outgoing CFA difference for every call site, and unwind that
> way. Otherwise revert to slow unwind at least until you get past the
> varying frames. Specifically, walk from the top, probing a cache for
> CFA delta + varying marker. If you make it all the way to the top
> with the cache, return the call stack. If not, switch back to normal
> slow unwind, update the cache, and go all the way to the top.  Alas, I
> currently have no idea how to identify alloca/VLA-using frames. Any
> ideas?

This information is not available, so you need a different approach.
Since in practice most x86-64 frames use either rsp or rbp for their
CFA, you could cache which it was and where, if anywhere, rbp was
saved.  That should be pretty quick.  If there are any data-dependent
instructions in the CFI to be concerned about you could also
maintain a "safe" flag in the unwinder, but I think the only one
is DW_CFA_val_expression (maybe DW_CFA_GNU_window_save but you won't
see that on x86-64).
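
A rough sketch of what such a cache could look like, under stated
assumptions: one rule per code address, recording whether the CFA is
based on rsp or rbp, the offset, and where rbp was saved.  All names
(frame_rule, cache_insert, fast_step) are hypothetical and not
libunwind API; the rules would be filled in by the normal DWARF unwind,
which also clears the "safe" flag when it meets CFI it cannot summarise
this way.

```c
#include <stdint.h>
#include <stddef.h>

enum cfa_base { CFA_FROM_RSP, CFA_FROM_RBP };

/* Hypothetical cached summary of the CFI at one code address. */
struct frame_rule {
    uintptr_t     ip;        /* address the rule applies to; 0 = empty slot */
    enum cfa_base base;      /* register the CFA is computed from */
    int32_t       cfa_off;   /* CFA = base register + cfa_off */
    int32_t       rbp_off;   /* saved rbp lives at CFA + rbp_off */
    int           rbp_saved; /* non-zero if rbp was saved in this frame */
    int           safe;      /* zero if the CFI had data-dependent rules */
};

struct cursor { uintptr_t ip, rsp, rbp; };

#define CACHE_SLOTS 1024
static struct frame_rule cache[CACHE_SLOTS];   /* direct-mapped by address */

static void cache_insert(const struct frame_rule *r)
{
    cache[r->ip % CACHE_SLOTS] = *r;
}

/* Step one frame using only the cache.  Returns 1 on success, 0 when the
 * caller must fall back to the slow unwind (miss or unsafe rule). */
static int fast_step(struct cursor *c)
{
    const struct frame_rule *r = &cache[c->ip % CACHE_SLOTS];
    uintptr_t base, cfa;

    if (r->ip != c->ip || !r->safe)
        return 0;

    base = (r->base == CFA_FROM_RSP) ? c->rsp : c->rbp;
    cfa  = base + (uintptr_t)(intptr_t)r->cfa_off;

    if (r->rbp_saved)
        c->rbp = *(const uintptr_t *)(cfa + (uintptr_t)(intptr_t)r->rbp_off);

    /* The return address sits one word below the CFA (CFA - 8 on x86-64). */
    c->ip  = *(const uintptr_t *)(cfa - sizeof(uintptr_t));
    c->rsp = cfa;
    return 1;
}
```

A sampler would call fast_step() in a loop and switch to the regular
unwinder at the first 0 return, inserting rules as it goes.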

Daniel Jacobowitz
