libunwind-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Libunwind-devel] [RFC][patch] Fix crash in _ULx86_64_tdep_trace when si


From: Paul Pluzhnikov
Subject: [Libunwind-devel] [RFC][patch] Fix crash in _ULx86_64_tdep_trace when sigaltstack is too far away
Date: Tue, 8 Nov 2011 10:11:37 -0800

Greetings,

One of my applications crashed in _ULx86_64_tdep_trace on line 504 here:

// src/x86_64/Gtrace.c

503  cfa = cfa + f->cfa_reg_offset;
504  ACCESS_MEM_FAST(ret, c->validate, d, cfa + f->rbp_cfa_offset + dRIP, rip);

I've finally tracked it down to f->rpb_cfa_offset being incorrect.

The problem is that {rsp,rbp}_cfa_offset only have 15 bits, but for
SIGRETURN frame they are filled with:

// src/x86_64/Gstash_frame.c

    f->cfa_reg_offset = d->cfa - c->sigcontext_addr;
    f->rbp_cfa_offset = DWARF_GET_LOC(d->loc[RBP]) - d->cfa;
    f->rsp_cfa_offset = DWARF_GET_LOC(d->loc[RSP]) - d->cfa;

The problem is that the delta here can be arbitrarily large when
sigaltstack is used, and can easily overflow the 15 and 30-bit fields.

Unfortunately I did not succeed it creating a stand-alone test case showing
this crash.

Several fixes are possible:

1. we can simply turn validation on for SIGRETURN. Something like:

--- src/x86_64/Gtrace.c 2011-11-05 12:08:30.000000000 -0700
+++ src/x86_64/Gtrace.c.new     2011-11-08 09:45:23.000000000 -0800
@@ -501,11 +501,11 @@ tdep_trace (unw_cursor_t *cursor, void *
         is stored at some unknown constant offset off inner frame's
         CFA.  We determine the actual offset from DWARF unwind info. */
       cfa = cfa + f->cfa_reg_offset;
-      ACCESS_MEM_FAST(ret, c->validate, d, cfa + f->rbp_cfa_offset +
dRIP, rip);
+      ret = dwarf_get (d, DWARF_MEM_LOC (d, cfa + f->rbp_cfa_offset +
dRIP), &rip);
       if (likely(ret >= 0))
-        ACCESS_MEM_FAST(ret, c->validate, d, cfa + f->rbp_cfa_offset, rbp);
+       ret = dwarf_get (d, DWARF_MEM_LOC (d, cfa + f->rbp_cfa_offset), &rbp);
       if (likely(ret >= 0))
-        ACCESS_MEM_FAST(ret, c->validate, d, cfa + f->rsp_cfa_offset, rsp);
+       ret = dwarf_get (d, DWARF_MEM_LOC (d, cfa + f->rsp_cfa_offset), &rbp);

       /* Resume stack at signal restoration point. The stack is not
          necessarily continuous here, especially with sigaltstack(). */


2. we can detect overflow in Gstash_frame.c, and mark the frame as impossible
   to handle in fast trace:

--- src/x86_64/Gstash_frame.c   2011-10-05 15:52:33.000000000 -0700
+++ src/x86_64/Gstash_frame.c.new       2011-11-08 09:49:59.000000000 -0800
@@ -85,6 +85,15 @@ tdep_stash_frame (struct dwarf_cursor *d
     f->cfa_reg_offset = d->cfa - c->sigcontext_addr;
     f->rbp_cfa_offset = DWARF_GET_LOC(d->loc[RBP]) - d->cfa;
     f->rsp_cfa_offset = DWARF_GET_LOC(d->loc[RSP]) - d->cfa;
+    if (f->cfa_reg_offset != d->cfa - c->sigcontext_addr
+       || f->rbp_cfa_offset != DWARF_GET_LOC(d->loc[RBP]) - d->cfa
+       || f->rsp_cfa_offset = DWARF_GET_LOC(d->loc[RSP]) - d->cfa)
+      {
+       Debug (4, " sigreturn frame overflow\n");
+       f->frame_type = UNW_X86_64_FRAME_OTHER;
+       return;
+      }
+
     Debug (4, " sigreturn frame rbpoff %d rspoff %d\n",
           f->rbp_cfa_offset, f->rsp_cfa_offset);
   }

3. we can abandon DWARF, record the address of sigcontext in the 60
bits remaining,
   and extract registers directly from there (patch attached).


Fix 1 pessimizes fast trace for every signal -- probably very undesirable
for a profiler, so Lassi would not like it.

Fix 2 pessimizes fast trace only when sigaltstack is far enough away.
Undesirable for profiler with sigaltstack (which covers many Google apps).

I don't see any major disadvantages for fix 3.

Tested on Linux/x86_64 (Ubuntu 10.04). No new failures.

Comments?

Thanks,
-- 
Paul Pluzhnikov

Attachment: libunwind-crash-in-fasttrace-20111108.txt
Description: Text document


reply via email to

[Prev in Thread] Current Thread [Next in Thread]