From: Siddharth Bhojnagarwala
Subject: [Libunwind-devel] Crash while running a cpu profiling tool with libunwind 1.0.1
Date: Fri, 27 Jan 2012 01:03:00 +0000

Hello,

 

I am trying to use a CPU profiling tool (google-perftools), which uses libunwind to get backtraces. The code being profiled takes mutex locks all over the place. When the profiler is run, the program crashes almost immediately (generally with some kind of illegal instruction). See the example crash below.

 

myhost# gdb /root/asp/bin/myexec core_7144_1327624664_myprogram
GNU gdb (GDB) 7.3.1
Copyright (C) 2011 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "mips64-nlm-linux".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /root/asp/bin/myexec...done.

warning: core file may not match specified executable file.
[Thread debugging using libthread_db enabled]
Core was generated by `/root/asp/bin/myexec'.
Program terminated with signal 4, Illegal instruction.
#0  0x0000005556c9d3bc in __sigprocmask (how=3, set=0x5589f15bf8, oset=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:66
66           ../sysdeps/unix/sysv/linux/sigprocmask.c: No such file or directory.
                in ../sysdeps/unix/sysv/linux/sigprocmask.c
(gdb) bt
#0  0x0000005556c9d3bc in __sigprocmask (how=3, set=0x5589f15bf8, oset=0x0) at ../sysdeps/unix/sysv/linux/sigprocmask.c:66
#1  0x000000555783cf10 in put_rs_cache () from /anroot/projects/tos_3party/.target/mips64-nlm-linux/lib/libunwind.so.8
#2  0x000000555783dfb4 in _ULmips_dwarf_find_save_locs () from /anroot/projects/tos_3party/.target/mips64-nlm-linux/lib/libunwind.so.8
#3  0x000000555783ecc8 in _ULmips_dwarf_step () from /anroot/projects/tos_3party/.target/mips64-nlm-linux/lib/libunwind.so.8
#4  0x0000005557837084 in _ULmips_step () from /anroot/projects/tos_3party/.target/mips64-nlm-linux/lib/libunwind.so.8
#5  0x00000055556ae04c in GetStackTraceWithContext(void**, int, int, void const*) () from /opt/thoroughbred/lib/libtcmalloc.so.0
#6  0x00000055557a6ce4 in ?? () from /opt/thoroughbred/lib/libprofiler.so.0
#7  0x00000055557a90e8 in ProfileHandler::SignalHandler(int, siginfo*, void*) () from /opt/thoroughbred/lib/libprofiler.so.0
#8  <signal handler called>
#9  0x000000555711cc44 in __lll_trylock (futex=<optimized out>) at ../ports/sysdeps/unix/sysv/linux/mips/nptl/lowlevellock.h:137
#10 __pthread_mutex_trylock (mutex=0x555ecda230) at pthread_mutex_trylock.c:65
#16 0x00000055571198c8 in start_thread (arg=<optimized out>) at pthread_create.c:299
#17 0x0000005556d50bbc in __thread_start () from /opt/thoroughbred/lib/libc.so.6

 

 

The google-perftools README acknowledges this problem. Here is what it says:

… while tcmalloc itself works fine, the
cpu-profiler tool is unreliable: it will sometimes work, but sometimes
cause a segfault.  I'll explain the problem first, and then some
workarounds.

Note that this only affects the cpu-profiler, which is a
google-perftools feature you must turn on manually by setting the
CPUPROFILE environment variable.  If you do not turn on cpu-profiling,
you shouldn't see any crashes due to perftools.

The gory details: The underlying problem is in the backtrace()
function, which is a built-in function in libc.
Backtracing is fairly straightforward in the normal case, but can run
into problems when having to backtrace across a signal frame.
Unfortunately, the cpu-profiler uses signals in order to register a
profiling event, so every backtrace that the profiler does crosses a
signal frame.

In our experience, the only time there is trouble is when the signal
fires in the middle of pthread_mutex_lock.  pthread_mutex_lock is
called quite a bit from system libraries, particularly at program
startup and when creating a new thread.

The solution: The dwarf debugging format has support for 'cfi
annotations', which make it easy to recognize a signal frame.  Some OS
distributions, such as Fedora and gentoo 2007.0, already have added
cfi annotations to their libc.  A future version of libunwind should
recognize these annotations; these systems should not see any
crashes.

 

Why does libunwind choke if a profiling signal fires in the middle of pthread_mutex_lock? I am also not clear on the solution that the google-perftools README offers. Can someone please advise me further on that?

 

Regards,

Sid

