Jörg F. Wittenberger
Re: [Chicken-hackers] Regarding #1564: srfi-18: (mutex-unlock) Internal scheduler error
03 Dec 2018 10:46:38 +0100
Thank you so much, Kon,
reviewing these logs helped to confirm my feelings.
Feelings, not findings. Yet.
Tinkering with these scheduler/srfi-18 issues again really made me feel bad
and sorry. In fact the anger has cost me the sleep of the better half of
the night. Still enrages me.
What's going on here IMHO is that a lot of lifetime, yours and mine, is
wasted. At the same time the code quality of the result is likely worse
than what I'm using as the source to cut out those patches.
As I can't outright prove this statement to you, let me recap the
background for a moment: Around a decade ago I ported a rather thread-heavy
thing (Askemos, which technically is something partially inspired by
Erlang, bearing similarities to Termite - except that those processes are
all made persistent and the state is replicated and synchronized in
byzantine agreement over a part of the network; you might be able to
imagine that this is really stressing the threading capabilities of the
language in use) from rscheme to chicken. At that time the code had been
growing for ~7 years; that's almost 100 modules, which took some months to
port. ...
...Only to learn that the threading in chicken was not at all up for the
job. Hence I spent a few more weeks fixing that one. Including adding a
prio queue for the timeout and fd lists.
What I could NOT produce were test cases for each of the bugs (1231, 1232,
1255, 1564 - and these are not all) I fixed in the process.
Nor was it feasible to fix them one-by-one. (Yesterday evening I failed to
properly backport the fix for 1564 into the ugly code implementing the
timeout queue -- while asking myself why the hell it is useful; this queue
should be replaced with a better version anyway.)
The result I posted on chicken-users at that time. It was a complex fix.
Sure. But those were sort of interrelated bugs.
Then for about seven years I sadly maintained a chicken fork (which I'm
still using in production) for these differences in order to be able to use
chicken at all. Since 4.12 it is at least _possible_ to run this code on
stock chicken. Partly because I changed my code to avoid triggering bugs
So for me the question remains: wouldn't it be much, much more efficient to
work sort-of hand-in-hand with one of the core developers, or maybe on the
list to get the remaining things (bugs and improvements) fixed and
It would be so much more satisfying to me to actually produce code I could
approve myself than backport yet another hotfix -- creating a result in the
process I take issues with.
(((Going into details, I'd probably do the prio queue differently today, as I
have learned about chicken's performance details. And I'm ready to do so. But
at least I'd like to be allowed to use a prio queue with a proper interface
rather than kludging inline handling of a linear list into well tested code --
likely creating fresh bugs in the process.)))
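As an illustration only of what "a prio queue with a proper interface" could look like: a minimal pairing-heap sketch in plain Scheme. All names here (heap-merge, heap-insert, heap-min, heap-delete-min) are hypothetical; this is neither the code from the fork nor anything from CHICKEN's scheduler.scm, and the entries are simply assumed to be (timeout . item) pairs.

```scheme
;; Hypothetical sketch, NOT CHICKEN's scheduler code.
;; A heap is either '() or (entry . list-of-subheaps),
;; where entry is an assumed (timeout . item) pair.

;; Merge two heaps; the heap with the smaller timeout becomes the root.
(define (heap-merge a b)
  (cond ((null? a) b)
        ((null? b) a)
        ((<= (caar a) (caar b)) (cons (car a) (cons b (cdr a))))
        (else (cons (car b) (cons a (cdr b))))))

;; Insert an item with its timeout; returns the new heap.
(define (heap-insert heap timeout item)
  (heap-merge (list (cons timeout item)) heap))

;; The entry with the earliest timeout (heap must be non-empty).
(define (heap-min heap) (car heap))

;; Standard pairing step: merge subheaps two at a time.
(define (merge-pairs heaps)
  (cond ((null? heaps) '())
        ((null? (cdr heaps)) (car heaps))
        (else (heap-merge (heap-merge (car heaps) (cadr heaps))
                          (merge-pairs (cddr heaps))))))

;; Remove the earliest entry; returns the remaining heap.
(define (heap-delete-min heap) (merge-pairs (cdr heap)))

;; Usage: three pending timeouts, inserted out of order.
(define q (heap-insert (heap-insert (heap-insert '() 300 'c) 100 'a) 200 'b))
```

With this, callers only touch the four procedures above, so the representation could later be swapped without editing the code that uses the queue.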
On Dec 2 2018, Kon Lovett wrote:
see attached git (C4) & svn (C5) logs
#(in C4 core local repo)
git log --follow -p -- srfi-18.scm >srfi-18.log
#(in C5 svn local repo)
svn log --diff trunk >srfi-18_trunk-diff.log
On Dec 2, 2018, at 1:19 PM, Jörg F. Wittenberger
Thanks for the replies,
chicken-install -r srfi-18 ; did the trick already
I should have stated that that's what I have; what I've been looking
for was the git history. I wonder for some statements why the hell they
are there at all. Two possible reasons: a) I cleaned them up for being
obsolete (due to former changes I made) b) removed since I touched the
file, which begs the question "why were those added".
Never mind. I can proceed at least.
On Dec 2 2018, Kon Lovett wrote:
well, that shows me. ;-)
trying to track down why
#497 $ chicken-install -r srfi-18
mapped (srfi-18) to ()
On Dec 2, 2018, at 10:42 AM, Kon Lovett <address@hidden> wrote:
C5 evicted srfi-18, along w/ srfi-1, 13, 14, & 69, to the egg store.
On Dec 2, 2018, at 10:39 AM, Jörg F. Wittenberger
<address@hidden> wrote:
Hi all,
when I tried to reply in a timely manner I apparently sent out a link to a
broken file. Sorry for that.
Just wanted to see if I could create a patch for the current master. For
this I need the srfi-18 egg source too. I just can't find it.
Jörg
On Nov 30 2018, Jörg F. Wittenberger wrote:
On Nov 30 2018, megane wrote:
Here's another version that crashes quickly with "very high
Error: (mutex-unlock) Internal scheduler error: unknown thread state
#<thread: thread1>
ready
This bears an uncanny resemblance to scheduler issues I was
fighting long ago. Too long ago.
But it feels wrong. We'd rather make sure there is no
ready thread in the queue waiting for a mutex in the first place.
Diffing the changes I maintained quite a while back
you will find that I added a ##sys#thread-clear-blocking-state!
towards the end of scheduler.scm and used it for consistency
wherever I ran into not-so-clean unlocks. Now this is still an
invasive change. But looking at the source of scheduler and srfi-18
in chicken 5 right now, I can't fight the feeling that it is working
around the missing changes at several places. Best /Jörg
--- A fix
Just allow the 'ready state for threads in mutex-unlock!
Is this a correct fix?