Re: IPC etc. (was: Future Direction of GNU Hurd?)


From: Jonathan S. Shapiro
Subject: Re: IPC etc. (was: Future Direction of GNU Hurd?)
Date: Fri, 26 Mar 2021 14:15:51 -0700

On Wed, Mar 24, 2021 at 3:12 PM William ML Leslie <william.leslie.ttg@gmail.com> wrote:
>> In real applications, large messages can be very important. In microkernels, it is not unusual for a large message payload to exceed the total available kernel virtual memory. KeyKOS and EROS behaved as William proposes. It was a huge pain in the ass that led to significantly increased message traffic. William's proposal also means that the duration of a kernel transaction cannot be bounded. Solving this was one of the big advances in Coyotos.
>
> Coyotos has a 64kb limit on the size of indirect strings, and these
> get truncated if any of the pages are not prepared.  This gets much
> more irritating in the async case, so being able to asynchronously say
> "have these pages loaded, and then send this message" becomes
> essential.

I think the "have pages loaded" statement is a little confusing. I do not now remember if the receive area is pre-probed or if the send is retried when a receive page fault occurs. Either way, the actual requirement is that the receive page must exist and be writable. This is a requirement in any copying IPC.
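
Purely to illustrate that requirement (the helper names below are invented; this is not the Coyotos source), a copying path can either probe the whole receive area up front or start copying and retry the send after the receive-side fault has been handled:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PAGE_SIZE 4096u

    /* Invented helper: is this receiver page mapped and writable? */
    extern bool page_present_and_writable(uintptr_t va);

    /* Strategy A: pre-probe.  If any destination page is missing, bail
     * out before copying anything; the receiver faults the page in and
     * the sender retries.  (Strategy B is the same check done lazily,
     * driven by the fault that occurs mid-copy.) */
    bool receive_area_ready(uintptr_t base, size_t len)
    {
      uintptr_t va = base & ~(uintptr_t)(PAGE_SIZE - 1);
      for (; va < base + len; va += PAGE_SIZE) {
        if (!page_present_and_writable(va))
          return false;
      }
      return true;
    }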

Concerning longer strings, my memory may no longer be correct and I don't have time to look at the code right now. I remember that we went back and forth on this - the Coyotos IPC mechanism already has a small state machine because of scheduler activations, so it would be possible to add states for "expecting more string payload". I'm going from memory here, but I think the reason we abandoned that is that it creates a denial-of-service issue. The receiver must temporarily be in an exclusive state with the sender. A bad sender could exploit this by sending the initial 64K and then failing to continue the protocol. A timeout would be required to handle this case. 64K was as high as I was willing to go as a non-preemptible operation, and it's probably too long.

In Coyotos, for strings longer than this, a memory region named by a capability should be used.
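
A rough sketch of what that looks like from the client side, with invented type and field names (this is not the actual Coyotos IDL): the IPC payload stays small and bounded, and the bulk data travels by reference to a region the receiver can map or copy on its own schedule.

    #include <stdint.h>

    typedef uint64_t cap_t;  /* stand-in for a capability slot; invented */

    /* Invented illustration: a "long string" request carries a capability
     * to a memory region plus an extent, rather than the bytes themselves. */
    struct bulk_payload {
      cap_t    region;  /* capability naming the shared memory region   */
      uint64_t offset;  /* where the payload begins within that region  */
      uint64_t length;  /* payload size; may far exceed the 64K limit   */
    };

The kernel transaction stays bounded because it only moves a few words and one capability; the receiver decides when (and whether) to touch the pages behind it.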

>>> Particularly important with async IPC, where loads of messages may be
>>> touching all sorts of pages.
>>
>> I wonder if you are mis-defining reliable. The question isn't whether the message goes in one piece. The question is whether the message goes. Coyotos solves this with a restartable string transfer approach.
>
> I'll double check the source; it's possible I've misunderstood the new
> string transfer mechanism.

You were correct.

>>> Another is a GC heap walk.  Most operating systems get very confused
>>> by GC and get to the point where they make no progress, because it
>>> sees that pages were recently touched and so decides they shouldn't be
>>> paged out.  Having the GC declare where its fingers are and where they
>>> are headed as the requirement for residency gives the power back to
>>> the process.
>>
>> Modern GCs use things like madvise() to alleviate this.
>
> For which we have no interface in Coyotos, yet.

Have a look at cap_Page.c and the OC_coyotos_AddressSpace_erase operation. It zeros the page. Pages that are known to be zero were not checkpointed, and this would behave the same way in Coyotos once checkpointing was implemented.

The current implementation of Coyotos is memory-only, so releasing the frame doesn't make sense, but it *does* seem to me, looking now, that a parameter could be added indicating that the page should be aggressively released from the in-memory working pool.
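
For comparison, the madvise() route William mentions is ordinary user-level code on a POSIX system; a minimal sketch:

    #include <sys/mman.h>
    #include <stddef.h>

    /* After a sweep, tell the kernel the region's contents are disposable.
     * MADV_DONTNEED discards the pages immediately; on Linux, MADV_FREE
     * lets them be reclaimed lazily under memory pressure.  Either way
     * the collector's recently-touched garbage stops looking like hot
     * working set to the pager. */
    void gc_release_region(void *start, size_t len)
    {
    #ifdef MADV_FREE
      if (madvise(start, len, MADV_FREE) == 0)
        return;
    #endif
      (void)madvise(start, len, MADV_DONTNEED);
    }

Something with those semantics is presumably what the "aggressively released from the in-memory working pool" parameter would map to.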

The necessary bottom-up recursion would need to be implemented by invoking the capabilities from user level - there is no good way to do a large-scale zero from the kernel level (I can see how to do it, but it would need to become part of the kernel background collector).

I think the right way to do this would be to add a new type of virtual copy space that implements this as a service.

>> CapROS (and presumably EROS, too) have a set of non-persistent
>> applications that most of the persistent processes depend on.  It
>> feels a little like how GNU shepherd and systemd pre-open sockets and
>> pretend the application is already available.
>
>
> These were for drivers only. They never worked to my satisfaction.
>

I'm not even sure what you settled on here.  FWIR we went back and
forth on orthogonal persistence a bunch of times.

It stopped being an issue in Coyotos when we focused on the initial, memory-only implementation.

Assuming you can reinstate the capabilities (which the Endpoint object would permit), there is no problem with a persistent process calling a non-persistent process.

The problem comes when the driver wants to send you a message back after a restart has occurred. Because the driver has been re-created, it no longer holds the reply cap, so a caller sitting in a "receive wait" state will never get woken up.

Just thinking out loud, I can now see two ways this could be handled:
  1. Make the driver Endpoint objects in both directions persistent, and have a registry and protocol for re-establishing the process capability that points to the driver process (on the send side) and the reply capability that points to the receiver (on the driver side). The driver side of this is the nuisance.
  2. Alternatively, register the Endpoints as before, and have a restart agent that uses them to perform a SEND to each direct driver client advising that the driver has been restarted and a connection should be re-built.
I'm sure neither of these is quite right, but I think either one could be made to work with some tinkering.
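
To make option 2 a bit more concrete, here is the shape of the notification the restart agent might SEND to each registered client (names invented, not a worked-out protocol):

    #include <stdint.h>

    typedef uint64_t cap_t;  /* stand-in for a capability slot; invented */

    /* One-way notice from the restart agent to a direct driver client. */
    struct driver_restart_notice {
      uint32_t opcode;       /* e.g. an invented OC_driver_restarted code */
      uint32_t generation;   /* bumped on each restart so stale or        */
                             /* duplicate notices can be ignored          */
      cap_t    new_endpoint; /* fresh entry capability to the re-created  */
                             /* driver, used to rebuild the connection    */
    };

On receipt, the client drops whatever stale reply capability it was waiting on and re-issues the request through new_endpoint.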


Jonathan
