guix-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: The Shepherd on Fibers


From: Philip McGrath
Subject: Re: The Shepherd on Fibers
Date: Mon, 28 Mar 2022 20:14:46 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

Hi,

On 3/23/22 18:36, Ludovic Courtès wrote:
Hello Guix!

I have pushed a ‘wip-fibers’ branch of the Shepherd:

   https://git.savannah.gnu.org/cgit/shepherd.git/log/?h=wip-fibers

The goal is to make shepherd (the daemon) use Fibers¹ for concurrency.


Exciting!

As I wrote in <https://lists.gnu.org/archive/html/guix-devel/2021-05/msg00283.html>, "I haven't programmed with it in Guile at all, but, from my experience using Racket's Concurrent ML system, I think it is the right foundation for a supervisor daemon."

I still haven't programmed with Guile Fibers, and looking briefly at the diff of the "wip-fibers" branch was the first time I'd looked at the Shepherd's implementation, but I had a couple thoughts.

First,

Fibers is used in a single-threaded fashion, which is the main
constraint for shepherd since it forks.  That also means that fibers
cannot be preempted, so it’s fully cooperative scheduling.


Would it be feasible for shepherd *not* to fork? Or only to fork in a way that cooperates with fibers?

Obviously forking is pretty ubiquitous, but I the 2019 paper "A fork() in the road"[1] fairly convincing in its argument that

fork has lost its classic simplicity; it today impacts all the
other operating system abstractions with which it was once
orthogonal. Moreover, a fundamental challenge with fork is
that, since it conflates the process and the address space in
which it runs, fork is hostile to user-mode implementation
of OS functionality, breaking everything from buffered IO
to kernel-bypass networking. Perhaps most problematically,
fork doesn’t compose—every layer of a system from the kernel
to the smallest user-mode library must support it.

I consider a Concurrent ML system a "user-mode implementation of OS functionality": indeed, an early version of Racket's thread system (where thread = fiber) is one of the examples in Flatt, Findler, Krishnamurthi, & Felleisen, "Programming Languages as Operating Systems (or Revenge of the Son of the Lisp Machine)" (ICFP 1999)[2].

More concretely, preemption is a big benefit of fibers.

Racket's approach also addresses this part:

There’s one catch: Fibers is currently Linux-only.  The good news is
that work has been done to port it to other kernels via libevent².
Until it is merged, we could keep using the Shepherd 0.8 on GNU/Hurd.


Racket has a cross-platform C library, "rktio"[3], for accessing os-specific functionality. It was refactored into its current form in 2017 as an early step toward Racket-on-Chez: while it happens to provide exactly the functionality Racket needs, it no longer is specific to Racket or any particular implementation thereof. That includes everything needed to implement the Concurrent ML system and nonblocking IO on a variety of OSes and kernels.

In particular, it implements—IIUC primarily in "rktio_process.c"—abstractions (over `fork` or alternatives) to start a new process running something, with environment, stdin, stdout, and stderror wired up ports in the sense of `current-{input,output,error}-port`, and use the Concurrent ML system to monitor its exit status, send it signals, etc. The Racket-level API is documented at [4].

I spend as little time as possible with these C-level sorts of issues, and I particularly don't know about special considerations for PID 1, but I hope there would be some way to meet Shepherd's needs without interfering with Fibers.

Second, I'm a little uneasy about `unwind-protect`:

```
+(define-syntax-rule (unwind-protect body ... conclude)
+  "Evaluate BODY... and return its result(s), but always evaluate CONCLUDE
+before leaving, even if an exception is raised.
+
+This is *not* implemented with 'dynamic-wind' in order to play well with
+delimited continuations and fibers."
+  (let ((conclusion (lambda () conclude)))
+    (catch #t
+      (lambda ()
+        (call-with-values
+            (lambda ()
+              body ...)
+          (lambda results
+            (conclusion)
+            (apply values results))))
+      (lambda args
+        (conclusion)
+        (apply throw args)))))
```

though maybe it's used only internally and doesn't have to deal with the general case: I'm not an expert by any means, but my understanding (from a Racket mailing list thread at [5], continued at [6]) is that dealing with the general case may be something of an open research question.

As an example of the sort of problem, what if the `body`s evaluate normally, but an exception is raised in the dynamic extent of `(conclusion)`, causing `(conclusion)` to run again? One possible mitigation would be to `set!` a local variable before the normal `(conclusion)` and check it in the exception handler.

More generally, of course, `body ...` could escape by some mechanism other than a raising an exception.

If you were writing Racket, I would recommend `(call-with-continuation-barrier (λ () body ...))`—logically, `body ...` isn't re-entrant anyway—but I see from [7] that Guile's continuation barriers are barriers to the fibers' context switches.

The key difference between Guile and Racket is that Guile's Fibers are just a library, without any special privileges, and use the same control operators that are available to user code. I think that may be unique among the Concurrent ML implementations I'm aware of.

Why is that a problem? Well, in many ways I think it's quite cool! But a problematic aspect is that (from the discussion at [5]):


On 11/30/19 10:23, Matthew Flatt wrote:
You're trying to implement `amb` where a client can mix `amb` and
`dynamic-wind` and get sensible behavior, right?

The `dynamic-wind` operation certainly interferes with building new
control forms. Racket threads and other control forms that need to
interact a certain way with `dynamic-wind` end up being built in at the
same level as `dynamic-wind`.

At Sat, 30 Nov 2019 06:15:16 -0600, Alexis King wrote:
Is there any way to do that with Racket’s continuation machinery,

There's not a safe way. In many cases, Racket lets you write new things
that have the power of built-in through unsafe APIs --- and it turns
out that there are unadvertised procedures (provided by the primitive
`#%unsafe` module) for this particular case:

[...]

At Sat, 30 Nov 2019 06:15:16 -0600, Alexis King wrote:
Also, is this kind of thing discussed anywhere in the literature?

I don't know of any published work on this topic (so let me know if you
find something!). As you probably have seen already, our ICFP'07 paper
just points out that `dynamic-wind` causes problems, but doesn't try to
hold `dynamic-wind` itself responsible for those problems.

An opinion and some pointers to newsgroup discussions:

   http://okmij.org/ftp/continuations/against-callcc.html#dynamic_wind


On 11/30/19 21:38, Alexis King wrote:
> Yes, most of the relevant discussions I’ve found have been
> about Lisp’s `unwind-protect` and how it relates to Scheme’s
> `dynamic-wind`. It’s alluded to in Matthias’s original POPL ’88
> paper and mentioned explicitly in the following LASC paper, but
> neither address this particular issue. I also found Dorai
> Sitaram’s “Unwind-protect in portable Scheme”[1] and the
> related commentary by Kent Pitman[2] and Will Clinger[3], but
> though obviously related, both Sitaram and Clinger seem to
> imagine a system where the author of the program defines all
> control operations, so there isn’t as much of a problem. I’ve
> also been trying to find if any of these issues are discussed
> in any algebraic effects literature, but I haven’t found
> anything there yet, either.
>
> [1]: http://www.ccs.neu.edu/~dorai/uwcallcc/uwcallcc.html
> [2]: http://www.nhplace.com/kent/PFAQ/unwind-protect-vs-continuations-overview.html
> [3]: http://www.ccs.neu.edu/home/will/UWESC/uwesc.sch
>

Maybe it would be enough for this case for Fibers to provide variants of `dynamic-wind` and/or `call-with-continuation-barrier` that cooperate with the Fibers implementation. I don't know if that would be possible of not—in addition to my limited familiarity with Fibers, I have not read all of the related literature that Alexis, Matthew, and Matthias Felleisen discuss in [5] and [6]—again, I am not an expert!

I'm not sure how useful any of that is, but take it for whatever it's worth. My overall impression is that the Shepherd on Fibers is a big step in the right direction. Congrats on the great work!

-Philip

[1]: https://www.microsoft.com/en-us/research/uploads/prod/2019/04/fork-hotos19.pdf
[2]: https://www2.ccs.neu.edu/racket/pubs/icfp99-ffkf.pdf
[3]: https://github.com/racket/racket/tree/master/racket/src/rktio
[4]: https://docs.racket-lang.org/reference/subprocess.html
[5]: https://groups.google.com/g/racket-users/c/AAeEp_llxaY/m/uRviksPGAgAJ
[6]: https://groups.google.com/g/racket-dev/c/PyetqHSjtAA/m/slBdazdqAwAJ
[7]: https://github.com/wingo/fibers/wiki/Manual#32-barriers



reply via email to

[Prev in Thread] Current Thread [Next in Thread]