libunwind-devel

Re: [Libunwind-devel] UNW_EINVAL stepping past _L_lock_686 in pthread_mutex_lock


From: Jared Cantwell
Subject: Re: [Libunwind-devel] UNW_EINVAL stepping past _L_lock_686 in pthread_mutex_lock
Date: Tue, 16 Sep 2014 21:43:32 -0600

Some additional information: gdb is able to walk the stack fine (see
below).  I built libunwind with --enable-debug, ran with
UNW_DEBUG_LEVEL=99, and got the trace below for the last unw_step
call, the one that fails.  I'm working on parsing the output and
comparing it against both the unw_step calls that succeed and the
readelf -wF output, but stack unwinding is new to me, so any pointers
or things to look for would be much appreciated.

(gdb) bt
#0  0xb7fdd424 in __kernel_vsyscall ()
#1  0xb7fba5a2 in __lll_lock_wait () from /lib/i386-linux-gnu/libpthread.so.0
#2  0xb7fb5ead in _L_lock_686 () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xb7fb5cf3 in pthread_mutex_lock () from /lib/i386-linux-gnu/libpthread.so.0
#4  0x08048d36 in thread_start () at unwind_repro.cpp:78
#5  0xb7fb3d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#6  0xb7dd4dde in clone () from /lib/i386-linux-gnu/libc.so.6

UNW_DEBUG_LEVEL=99 output:

_L_lock_686(_L_lock_686)
 >_ULx86_step: (cursor=0xb7400930, ip=0xb771fead)
                >get_rs_cache: acquiring lock
              >_ULx86_dwarf_find_proc_info: looking for IP=0xb771feac
               >_ULx86_dwarf_callback: checking , base=0x0)
               >_ULx86_dwarf_callback: checking , base=0xb7749000)
               >_ULx86_dwarf_callback: checking /lib/i386-linux-gnu/libpthread.so.0, base=0xb7717000)
               >_ULx86_dwarf_callback: found table `/lib/i386-linux-gnu/libpthread.so.0': segbase=0xb7728a74, len=684, gp=0xb772eff4, table_data=0xb7728a80
               >lookup: e->start_ip_offset = ffff995c
               >lookup: e->start_ip_offset = ffff69b4
               >lookup: e->start_ip_offset = ffff82cc
               >lookup: e->start_ip_offset = ffff6d7c
               >lookup: e->start_ip_offset = ffff796c
               >lookup: e->start_ip_offset = ffff7430
               >lookup: e->start_ip_offset = ffff745c
               >lookup: e->start_ip_offset = ffff744c
               >lookup: e->start_ip_offset = ffff743e
               >_ULx86_dwarf_search_unwind_table: ip=0xb771feac, start_ip=0xffff7430
 >_ULx86_dwarf_search_unwind_table: e->fde_offset = 1b6c, segbase = b7728a74, debug_frame_base = 0, fde_addr = b772a5e0
            >_ULx86_dwarf_extract_proc_info_from_fde: FDE @ 0xb772a5e0
               >_ULx86_dwarf_extract_proc_info_from_fde: looking for CIE at address b7729694
               >parse_cie: CIE parsed OK, augmentation = "zR", handler=0x0
               >_ULx86_dwarf_extract_proc_info_from_fde: FDE covers IP 0xb771fea4-0xb771feb2, LSDA=0x0
               >run_cfi_program: CFA_def_cfa r4+0x0
               >run_cfi_program: CFA_same_value r2
               >run_cfi_program: CFA_advance_loc to 0xb771fedc
                >put_rs_cache: unmasking signals/interrupts and releasing lock
apply_reg_state: loc (b74012cc)
                >access_mem: mem[b74012cc] -> b771fead
apply_reg_state: ip and cfa unchanged; stopping here (ip=0xb771fead)
               >_ULx86_dwarf_step: returning -7
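
As a sanity check on the trace above, the table-lookup numbers can be
reproduced by hand.  Both start_ip and e->fde_offset appear to be
offsets relative to segbase, added with 32-bit wraparound.  The sketch
below redoes that arithmetic; the helper names are made up for
illustration and are not libunwind functions, and every constant
mentioned afterwards is copied from the trace.

```cpp
#include <stdint.h>

// Hypothetical helpers for checking the trace by hand; these are not
// libunwind functions.

// Binary-search-table offsets are relative to segbase.  A value such as
// 0xffff7430 is a negative 32-bit offset, so the sum is taken modulo
// 2^32, as it would be on i386.
inline uint32_t table_entry_to_addr(uint32_t segbase, uint32_t offset)
{
    return segbase + offset;  // unsigned wraparound is well defined
}

// True if lookup_ip falls inside the half-open range [start, end).
inline bool fde_covers(uint32_t start, uint32_t end, uint32_t lookup_ip)
{
    return lookup_ip >= start && lookup_ip < end;
}
```

With segbase 0xb7728a74, the start_ip offset 0xffff7430 gives 0xb771fea4
and the fde_offset 0x1b6c gives 0xb772a5e0, which match the "FDE covers
IP 0xb771fea4-0xb771feb2" and "fde_addr = b772a5e0" lines.  The lookup
IP 0xb771feac lies inside that range, so the table search seems to have
found the intended FDE, and the failure looks like it arises later, in
the CFI program or in apply_reg_state.  The numbers also show that the
CFA_advance_loc target 0xb771fedc lies past the end of the FDE's range;
whether that matters for the failure is not established here.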

On Mon, Sep 15, 2014 at 3:11 PM, Jared Cantwell
<address@hidden> wrote:
> I apologize if this is a duplicate message.  I sent this earlier, but
> I wasn't a member of the list, so I strongly suspect it got silently
> dropped on me since I haven't seen it appear in the archive.
>
> I'm working on a tool that sends a signal to all running threads and
> records their backtraces using libunwind.  This is useful for getting
> a summary of all threads without having to break in with gdb and halt
> the entire program.  However, libunwind seems to have trouble walking
> the stack of threads that are waiting in pthread_mutex_lock: unw_step
> returns UNW_EINVAL.  I've found similar posts on this list (including
> patches), and my version of libunwind is running with those patches.
> Is this a known issue?  Is there a way to work around it?
>
> I've included and attached a program that will reproduce the issue
> using signals.  I am running with libunwind-1.1.
>
> I am compiling with g++ (4.6.3):
> g++ -g unwind_repro.cpp -lpthread -lunwind
>
> Any help or direction is much appreciated.  Let me know if I can
> provide more information to help.
>
> ~Jared
>
> --------------------------------------------------------------------------------------
> #define UNW_LOCAL_ONLY
> #include <libunwind.h>
>
> #include <pthread.h>
> #include <signal.h>   // signal, SIGUSR2
> #include <unistd.h>   // sleep
> #include <iostream>
> #include <cstdlib>
> #include <dlfcn.h>
> #include <cxxabi.h>
> #include <cassert>
>
> /**
>  * This is a test program that reproduces an issue with libunwind:
>  * unw_step returns UNW_EINVAL if the thread is currently waiting on a
>  * pthread_mutex_t acquisition.  The walk gets to __lll_lock_wait and
>  * then _L_lock_686, but cannot go back any farther than that.
>  *
>  * Signals are used to trigger the stack walk while the thread
>  * is hung on lock acquisition.
>  */
>
> pthread_t new_thread;  // background thread we will signal to get its backtrace
> pthread_mutex_t lock;  // lock that will be held when backtrace is attempted
>
> // walk the stack with calls to unw_step
> void get_current_stack()
> {
>     unw_cursor_t cursor;
>     unw_context_t uc;
>     unw_word_t offp;
>     char procname[100];
>
>     unw_getcontext(&uc);
>     unw_init_local(&cursor, &uc);
>
>     while(true)
>     {
>         int res = unw_step(&cursor);
>
>         if (res == 0)
>             break;
>
>         // we don't expect this to happen in this test
>         if (res < 0)
>         {
>             std::cout << "FAIL: unw_step returned error; res=" << res
>                       << std::endl;
>             assert(res >= 0);
>             break;
>         }
>
>         unw_get_proc_name(&cursor, procname, sizeof(procname), &offp);
>
>         int status;
>         char *fixedName = abi::__cxa_demangle(procname, NULL, NULL, &status);
>
>         if (fixedName == NULL)
>             fixedName = procname;
>
>         std::cout << fixedName << std::endl;
>
>         // Free fixedName if it was allocated
>         if (fixedName != procname)
>             free(fixedName);
>     }
> }
>
> // background thread that first gets the current
> // stack to make sure the code works, and then
> // takes the lock so a signal can be sent to get
> // the backtrace while the lock is held.
> void* thread_start(void*)
> {
>     new_thread = pthread_self();
>
>     std::cout << "--- not in pthread_mutex_lock ---" << std::endl;
>     get_current_stack();
>     std::cout << "---------------------------------" << std::endl;
>
>     pthread_mutex_lock(&lock);
>     pthread_mutex_unlock(&lock);
>     return NULL;  // a void* thread function must return a value
> }
>
> // signal handler that is executed while the thread
> // above is blocked waiting for the lock, in order to
> // demonstrate the issue.
> void sig_handler(int signum)
> {
>     std::cout << "--- IN pthread_mutex_lock ---" << std::endl;
>     get_current_stack();
>     std::cout << "---------------------------------" << std::endl;
> }
>
> int main(int argc, char *argv[])
> {
>     // initialize and take the global lock so that the background
>     // thread hangs on it indefinitely so we can send a signal to it.
>     pthread_mutex_init(&lock, NULL);
>     pthread_mutex_lock(&lock);
>
>     // register the signal handler that will be invoked by
>     // the locked-up thread
>     signal(SIGUSR2, sig_handler);
>
>     // launch the background thread and give it a moment to start
>     // waiting on the lock acquisition (where it will hang).
>     pthread_t thread_id;
>     pthread_create(&thread_id, NULL, &thread_start, NULL);
>     sleep(2);
>
>     // now send the signal to walk the stack while the lock is held
>     pthread_kill(new_thread, SIGUSR2);
>
>     sleep(2);
>     pthread_mutex_unlock(&lock);
>     // join first: the background thread still locks and unlocks the mutex
>     pthread_join(thread_id, NULL);
>     pthread_mutex_destroy(&lock);
>
>     std::cout << "PASS: Full stack was walked." << std::endl;
> }


