Re: [Chicken-hackers] Scheduler fdset assert patch

Dear Jonathan, what do you think about creating a wiki page to explain the scheduler? This way the code is kept clean and your high-level explanation is stored for people interested in it.

Em 14/07/2015 17:19, "Jonathan Chan" <address@hidden> escreveu:

Hi Peter,

On Tue, Jul 14, 2015, at 12:09 PM, Peter Bex wrote:
> Hello Jonathan,
>
> Wow, that's a long comment! Thanks for putting so much time into this.
> This is not exactly what I had in mind though. I would prefer it if
> the comment wouldn't take up that much space. For example, the
> explanation of "reason" and "state" doesn't need to be three lines
> of prose. Instead, I'd simply add a short inline comment like so:
>
> (set! ##sys#interrupt-hook
> (let ((oldhook ##sys#interrupt-hook))
> (lambda (reason state) ; reason = signal #, state = proc or context
> ...
>
> This is much easier on the eyes and provides the information exactly
> when and where you need it while reading over the code. Renaming the
> variables to clarify their meaning might be even better, but that's
> debatable in this case. The main problem with such a big chunk of
> comment up front is that it requires the reader to first read this,
> memorize it, scroll down, and figure out how the code maps to the
> comment. For a high-level overview that's okay, but this is the wrong
> level of description to warrant such a long comment, I think. Cutting
> it down might help.
>
> Also, signing the comment with your name and date might not be the wisest
> idea. I've seen this kind of thing in code before, and you often see
> that people may want to edit the comments, which would distort your
> original message (meaning it looks like you signed something you didn't
> write). So what usually happens in these cases is that others append
> to it, creating a sort of alternate history log, which gets messy
> quickly.

I agree about the comment for reason/state and regarding the signature,
and I've removed a few lines. I'm not sure the rest should be removed,
though, especially if the plan is not to put it into a separate file.
Without it the future editor of the code will have to do a bit of
research on their own.

> Small nitpick: a single-semicolon comment will be auto-indented by Emacs
> as a "margin" comment, which pushes it to the right. Since most Schemers
> will be using Emacs to edit this code, it's better to use two (or three,
> if the comment is important enough) leading semicolons.

I've added an extra semicolon to the beginning of the lines. But you're
talking to a vim user :)

> Invoking external programs like "pgrep" or "wmic" is brittle and if you
> can, you should avoid it. According to the documentation, PROCESS should
> already return the process id as the third value, so I think you can just
> use that.

Using pgrep (or wmic on Windows) is necessary to get the processes'
child processes (why the procedure is named process-children :)
Sending signals to both the parent process and the children is the only
way I've found so far to pretty consistently reproduce the assert bug on
my computer for those test cases.

> Is the split of ##sys#schedule and ##sys#schedule* needed? I don't see
> why these have to be separate procedures.

It's not necessary, good catch. It's left over from when I was still
debugging.

> Finally, you didn't reply to my question regarding the change in
> create-fdset where the original code contained the following:
>
> (let ((p (##sys#slot t 11)))
> (when (pair? p) ; (FD . RWFLAGS)? (can also be mutex or thread)
> (fdset-set fd (cdr p)))))
>
> Your code is just taking the cdr without first checking it's a pair.

I replied on IRC but you might not have seen. Checking for pair and
ignoring the thread if its not blocked on a blocked object was the
immediate cause of the scheduler interrupt bug and a symptom of buggy
signal interrupt handling in general. Not calling fdset-set on the
thread's fd without removing the thread from the fd-list will cause the
scheduler assert error if the thread was the only one blocking on the fd
because the fd-list will be out-of-sync for the fdset for poll. There
doesn't seem to be a non-buggy situation in which the block object of a
thread in fd-list should be changed from a file. create-fdset and
##sys#schedule itself should never be called while in the interrupt hook
with the patch. Calling ##sys#schedule instead adds the interrupt
handler thread to the ready queue then yields back to us to return to
the code that called the interrupt hook, which preserves the semantics
of ##sys#schedule as used by the library code I've gone over and the
behavior of signal handler code that uses srfi-18 functionality while
allowing the signal handler to run 'immediately,' as documented in the
long comment.

Hope this helps! I've attached a patched patch. :)

Best regards,
--
Jonathan Chan
address@hidden

_______________________________________________
Chicken-hackers mailing list
address@hidden
https://lists.nongnu.org/mailman/listinfo/chicken-hackers

From:	Arthur Maciel
Subject:	Re: [Chicken-hackers] Scheduler fdset assert patch
Date:	Tue, 14 Jul 2015 23:05:53 -0300