libunwind-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Libunwind-devel] Re: [patch 2/2] Allow caller to block signals.


From: Arun Sharma
Subject: Re: [Libunwind-devel] Re: [patch 2/2] Allow caller to block signals.
Date: Fri, 25 Sep 2009 19:09:25 -0700

On Fri, Sep 25, 2009 at 5:28 PM, Paul Pluzhnikov <address@hidden> wrote:
> On Fri, Sep 25, 2009 at 2:04 PM, Arun Sharma <address@hidden> wrote:
>
>> This leads to a deadlock.
>
> This is one of several instances of this deadlock.
> The more common (for us) instance has noting to do with signals at all:
>
>  T1                              T2
>   calls dlopen                   calls malloc; takes malloc lock
>    takes loader lock              calls unwind to record allocation trace
>      calls calloc                  calls dl_iterate_phdr
>       waits for malloc lock         tries to get loader lock
>

True. This explains why David was trying to implement a cache of
dl_iterate_phdr() and get an API from glibc that could be used to
check if the cached data has changed.

The cache could then be updated via dl_iterate_phdr() only when it is
safe to do so (eg: in a helper thread).

>> Which brings up the question of why not drop this sigprocmask
>> completely since it's not helping?
>
> I believe the signal *is* necessary when (default) --enable-block-signals=yes,
> for the same reason it is necessary elsewhere -- without it you can get
> interrupted in pthread_mutex_lock while the lock is in inconsistent state,
> and then crash trying to re-acquire it.

Calling anything that's not explicitly declared async signal safe from
a signal handler is illegal. So code that calls pthread_mutex_lock()
from the signal handler needs to be fixed.

In all other cases, libunwind wants to block signals mainly because it
wants to avoid the "libunwind gets interrupted by libunwind while in a
critical section" case. And you're right - it applies to the
dl_iterate_phdr() case as well.

libunwind
  dl_iterate_phdr
     signal
       libunwind
         dl_iterate_phdr

is undesirable.

I still see the call to dl_iterate_phdr() to be a problem and
surrounding it with sigprocmask() doesn't fix the problem fully. But
because it eliminates the deadlock above, I'll apply this patch as
well.

>
>> How about a different approach to the problem? Abort the unwind in the
>> unlikely event that we were holding the dl lock? Not sure if glibc
>> exposes dl_load_lock.
>
> You wish :-(

Thanks for the explanation.

>From the 2004 email threads, it looked like there was agreement on the
general outline of the cache invalidation API, but difficulties with
the actual implementation. Perhaps that's the best bet for users who
can't modify their glibc.

 -Arun




reply via email to

[Prev in Thread] Current Thread [Next in Thread]