
[Help-smalltalk] Re: Playing with the VM Limits, crash on many processes


From: Paolo Bonzini
Subject: [Help-smalltalk] Re: Playing with the VM Limits, crash on many processes
Date: Sun, 21 Nov 2010 20:14:11 +0100

On 11/21/2010 04:10 PM, Paolo Bonzini wrote:
> where the Scheduler class is a heavily butchered version of Delay. :)
>
> Interestingly, inlining the two methods in the Eval makes the testcase
> work, so it's probably something related to contexts.

It's memory corruption caused by running out of memory and not detecting it.

FWIW, here are my debugging steps:

1) After some fruitless attempts to reach the point of corruption with gdb, I added this patch:

diff --git a/libgst/oop.c b/libgst/oop.c
index f5b885b..4c15f57 100644
--- a/libgst/oop.c
+++ b/libgst/oop.c
@@ -1076,6 +1076,7 @@ _gst_global_gc (int next_allocation)
   int old_limit;

   _gst_mem.numGlobalGCs++;
+  _gst_mem.numScavenges = 0;

   old_limit = _gst_mem.old->heap_limit;
   _gst_mem.old->heap_limit = 0;
@@ -2032,10 +2033,10 @@ _gst_copy_an_oop (OOP oop)
       obj = OOP_TO_OBJ (oop);
       pData = (OOP *) obj;

-#if defined(GC_DEBUG_OUTPUT)
-      printf (">Copy ");
+      if  (_gst_mem.numGlobalGCs == 20 && _gst_mem.numScavenges == 249) {
+      printf (">Copy %p ", ((gst_object)0x7ffff6dc87a0)->objClass);
       _gst_display_oop (oop);
-#endif
+      }

 #if defined (GC_DEBUGGING)
       if UNCOMMON (!IS_INT (obj->objSize))

The three numbers (global GC 20, scavenge 249, address 0x7ffff6dc87a0) came straight from the breakpoints I was already using in gdb. The debugging output was short and contained:

>Copy 0x7fc75b361920 0x7fc75f495300   0x7ffff7268010  ...
>Copy (nil)   ...

which showed that OOP 0x7fc75f495300 was being copied at the time of the corruption.


2) I put a breakpoint on the call to _gst_display_oop, conditional on printing the OOP that I got from the debugging output.


3) At the breakpoint, I put a watchpoint on *(void **)0x7ffff6dc87a0. I remembered that hardware watchpoints hadn't worked, so I used a software one. Indeed, hardware watchpoints could not have caught it, because the write happened in kernel mode (one mmap overwriting another):

Watchpoint 3: *(void **)0x7ffff6dc87a0

Old value = (void *) 0x23
New value = (void *) 0x0
0x0000003bda0dfffa in mmap64 () at ../sysdeps/unix/syscall-template.S:82
82      T_PSEUDO (SYSCALL_SYMBOL, SYSCALL_NAME, SYSCALL_NARGS)
(gdb) bt
#0  0x0000003bda0dfffa in mmap64 ()
#1  0x00007ffff7d684ac in anon_mmap_commit (base=<value optimized out>,
    size=<value optimized out>) at ../../libgst/sysdep/posix/mem.c:227
#2  0x00007ffff7d6684b in heap_sbrk_internal (hdp=0x7fffd6d82000,
    size=262144) at ../../libgst/heap.c:235
#3  0x00007ffff7d66692 in _gst_heap_sbrk (hd=0x7fffd6d83000 "@",
    size=262144) at ../../libgst/heap.c:187
(gdb) up 3
#3  0x00007ffff7d66692 in _gst_heap_sbrk (hd=0x7fffd6d83000 "@",
    size=262144) at ../../libgst/heap.c:187
187       return heap_sbrk_internal (hdp, size);
(gdb) p hdp
$5 = (struct heap *) 0x7fffd6d82000
(gdb) p *$
$6 = {areasize = 536870912, base = 0x7fffd6d82000 "",
  breakval = 0x7ffff6dc3000 "",  top = 0x7ffff6dc3000 ""}
(gdb) p hdp->breakval - hdp->base
$7 = 537137152

So the heap had overflowed: breakval - base is 537137152, which is 266240 bytes more than areasize (536870912).

Trivial patch follows:

diff --git a/libgst/heap.c b/libgst/heap.c
index 25d7f50..1f64fb2 100644
--- a/libgst/heap.c
+++ b/libgst/heap.c
@@ -218,6 +218,18 @@ heap_sbrk_internal (struct heap * hdp,
     }
   else if (hdp->breakval + size > hdp->top)
     {
+      if (hdp->breakval - hdp->base + size > hdp->areasize)
+        {
+          if (hdp->breakval - hdp->base == hdp->areasize)
+            {
+              /* FIXME: a library should never exit!  */
+              fprintf (stderr, "gst: out of memory allocating %d bytes\n",
+                       size);
+              exit (1);
+            }
+          size = hdp->areasize - (hdp->breakval - hdp->base);
+        }
+
       moveto = PAGE_ALIGN (hdp->breakval + size);
       mapbytes = moveto - hdp->top;
       mapto = _gst_osmem_commit (hdp->top, mapbytes);

Paolo


