[Top][All Lists]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Chicken-hackers] [PATCH] Fix #1058 and (hopefully) #1045

From: Peter Bex
Subject: [Chicken-hackers] [PATCH] Fix #1058 and (hopefully) #1045
Date: Sun, 13 Oct 2013 13:51:01 +0200
User-agent: Mutt/

Hi all,

After digging into a problem observed with henrietta-cache on Alaric's
server, I filed #1058.  After some more digging, it turned out that
the OpenSSL egg creates mutexes which it waits for, which are put into
slot 11 of the waiting thread ("object on which thread is blocking").

According to the comments, slot 11 can hold either a pair of the FD and
the flags (:input/#t, :output/#f or :all) or a thread object.
This is wrong, because as you can see in srfi-18.scm, mutexes will be put
in there as well.  Furthermore, the create-fdset and fdset-set code
simply adds anything that's in slot 11 to the FD set for
poll()/select() regardless of whether it's a pair or something else.
This is fixed by checking whether it's a pair.

This bug was not caught by the paranoid checks for two reasons:
First, the scheduler has a (declare unsafe) statement (I didn't
see this until I added more checks and tried to cause tests to fail).
Secondly, the C_u_i_car and C_u_i_cdr accessors don't check whether
their argument is a pair, even in paranoid mode (they check whether
it's a block object, which a mutex is, so it'll just immediately
access the slots of the mutex, explaining the error message we get).

I've now added paranoid checks to these macros.  In doing so, I
noticed (because the compiler tests crashed) that the pair allocator
and accessors are abused for symbol table buckets (even though they
are their own distinct tagged type), so I've reworked that as well,
to get their own allocator and use block_item accessors instead of
pretending they're pairs.  C_u_i_car and C_u_i_cdr have been used
as a shorthand for C_block_item(x,0)/C_block_item(x,1) in quite a
few other places as well.  I think maybe these types should probably
eventually get their own (paranoia-checked) accessors as well, but
for now I haven't done that yet.

After adding these checks, I noticed that in a DEBUGBUILD, the
hash-table tests started failing consistently with the dreaded
"out of memory - heap full while resizing" error.  After spending the
weekend groveling through the garbage collector, I finally found a
very strange statement in runtime.c: after carefully calculating
the new heap size and ensuring it will fit both the current heap
and the stack, the heap size is HALVED, for no apparent reason!
There isn't even a comment stating why this is done.  Of course,
removing this line helped me get rid of this error message.
Hopefully this gets rid of it once and for all, and hopefully
this fixes the remaining cases in #1045 (except the "unknown
protocol, which is probably entirely unrelated).

I think that patches 0002 and 0003 should really go into stability.


Attachment: 0001-Add-paranoid-checks-to-C_u_i_car-and-C_u_i_cdr.patch
Description: Text document

Attachment: 0002-Fix-1058-never-add-mutex-objects-to-FD-lists-in-the-.patch
Description: Text document

Attachment: 0003-Remove-weird-unexplained-halving-of-new-heap-size-wh.patch
Description: Text document

reply via email to

[Prev in Thread] Current Thread [Next in Thread]