epsilon-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [epsilon-devel] Questions about Epsilon and call-stack introspection


From: Basile Starynkevitch
Subject: Re: [epsilon-devel] Questions about Epsilon and call-stack introspection.
Date: Sat, 04 Jan 2014 17:41:34 +0100

On Sat, 2014-01-04 at 16:59 +0100, Luca Saiu wrote:
> Hello Basile.  Sorry for answering with a little delay, but I didn't
> want to be too tired when writing this.
> 
> As I told you, I was planning to write a blog post on this, but I'm very
> happy that now we're using the mailing list instead.  This way people
> can see that development is alive, and I can write informally without
> being slowed down by my unproductive perfectionism.

The big advantage of the list is that others (me included) can
contribute to it. AFAIK your blog is written by you only (this is ok for
me, but I won't call that a "blog").
> 
> On 2014-01-02 at 12:41, Basile Starynkevitch wrote:
> 
> > Happy new 2014 year to everyone.
> 
> Thanks.  Same on you.
> 
> > First, I am impatiently waiting for Epsilon to get totally rid of Guile.
> 
> I'll try to work on it this evening and tomorrow as well.
> 
> 
> The way I describe things below may or may not be accurate with respect
> to who proposed what for the first time; let's say it's just the way I
> organize things in my brain, when I ponder about these questions.
> Basile has known epsilon for a long time, and he's always given me
> friendly feedback.  In particular we spoke about all these questions at
> some length at a seminar of mine about epsilon last October, and it's
> not unlikely that he/we touched the same topics before as well.
> 
> Now that I think of it, Basile also advocated the idea of running epsilon
> on the bare metal, possibly before I did.  (However I've been planning
> to run epsilon on a small machine without a C runtime at least since
> March 2013; you can find a Usenet post of mine, written in my usual
> pompous style).

Well, IIRC, by "bare metal" I was thinking of generating a static ELF
for Linux, using raw syscalls (i.e. have some Epsilon0 primitives
translated to SYSENTER or SYSCALL machine instruction). This is not
exactly what I call bare metal (bare metal would imply in kernel mode,
loaded by GRUB; I'm thinking of Linux user-mode code statically linked
ELF...); targetting Linux user-mode is a lot easier than targetting
GRUB-loaded supervisor-mode code. (In particular, a hello-world program
just need to write(2) to stdout i.e. file descriptor 1; in
supervisor-mode you need to go to BIOS or know about VGA...).

> 
> > Then, I really think that Epsilon should be a language in which one can
> > code its own (e.g. copying generational Cheney like) garbage collector.
> > Very few languages have this ability (and it is a shame, see however
> > Scheme48 and its PreScheme). To make that possible, one has to have
> > primitive to inspect and update call frames on the stack.
> 
> Let me make this proposal a little more precise, the way I see it.
> 
> I start with "the minor stack change": primitives should have access to
> the stack in order to read and update boxed objects saved on the stack.
> 
> Pros: the ability of writing memory managers in epsilon itself, making
>       the "external" C runtime even smaller.  Particularly good for
>       small embedded systems, which indeed I would like to support;
> Cons: a relatively small semantic change.
> 
> The minor change entails a very small change to the semantics of
> epsilon0: namely, primitives should receive the current stack as an
> additional parameter.  The new parameter shall only be visible at the
> semantic level, and shall make the stack similar to the global state
> (uppercase gamma in the semantics of Chapter 2) in this respect;
> however, primitives shall still *not* return an updated stack, as part
> of the minor stack change.
> 
> Of course most primitives would not access the stack, to keep things
> clear; we might have just one new primitive taking advantage of the new
> provision.  I propose this:
> 
> New primitive:
> * control-stack:get
>   Zero parameters, two results: the current-thread stack as a buffer,
>   and the current stack height.
> 
> In practice it's unacceptable for efficiency reasons to require that all
> values be always stored on the stack; for this reason we need a new
> epsilon0 language form to selectively force that when required.
> Unfortunately it can't be simply a primitive, because it has effects on
> how the surrounding code can be compiled.
> 
> For efficiency's sake, control-stack:get shall return *the*
> current-thread stack, which shall be an ordinary epsilon buffer (at this
> time it's not), without making a copy of it.  Of course the result may
> then be copied, if needed.
> 
> New epsilon0 language form:
> * e0:call-with-saved-temporaries
>   One or more parameters: an epsilon0 procedure, and its actuals;
>   As many results as the procedure.
>   Formal semantics: identical to e0:call as per Chapter 2.
> 
> Rationale: for efficiency reasons, in compiled code as many temporary
> values as possible as kept in machine registers rather than on the
> stack; in a sophisticated implementation this may also true for some
> temporaries belonging to caller frames.  This complication is hidden
> from the semantics but is very real (at least, I will take advantage of
> this in the future).  Reality may actually be even more complicated:
> some memory-management systems, including most copying collectors,
> enforce some representation constraints on values: it's what I call
> "boxedness tags" in Chapter 3.  Now, it would be very nice to be able to
> temporarily break these constraints in an implementation, and use a more
> efficient representation in some places where efficiency matters (again,
> I'd like to take advantage of this in the future).

You could perhaps have a slightly different design. Knowing which local
variables are accessible at some given Epsilon0 source code handle is
trivial (your compiler already knows that). You could have the
convention that every such variable has a fixed offset on the stack (at
the given source code handle). Then you could just have a Epsilon0
construct or perhaps primitive (e0:spill-to-stack locvar0 locvar1 ...)
which would force the compiler to spill the local variables locvar0
locvar1 ... to the stack.

Of course you'll still need a primitive to fetch the n-th local from
frame #p and a metaprimitive to query the offset of some local
variable. 
> 
> Implementation note: e0:call-with-saved-temporaries shall:
>   - evaluate its actual parameters into new temporaries;
>   - save all temporaries which are currently in registers to the stack,
>     adding boxedness tags where absent if required by the implementation;
>   - call the procedure;
>   - at procedure return, restore still-alive temporaries from the stack
>     into registers, removing again boxedness tags where required;
>   - return results.
> 
> This design permits an efficient implementation, while still providing
> the user with a clean view of the stack when needed.
> 
> Simply writing into the buffer with buffer:set! updates the saved
> values.  It should be done within the extent of the procedure called by
> e0:call-with-saved-temporaries, in order to ensure that temporaries held
> in registers are also affected.
> 
> Of course this is highly unsafe in general.  But as usual, I reply that
> epsilon0 is not intended for humans, and epsilon1 is not intended for
> wimps.
> 
> At this point we can add stack introspection as well, which is one of
> Basile's main concerns: just knowing which stack frame belongs to which
> procedure is not a terribly delicate issue.  It can be done with one
> procedure:
> 
> New procedure:
> * control-stack:introspect
>   Two parameters: a stack, a stack height;
>   one result: a vector holding <program point, initial frame height,
>   final frame height> triples.
> 
> Somewhere in the global state, a compiler-generated table shall map each
> program point into a procedure name, a source locus and an alist mapping
> (at least) each alive local variable name into its frame offset.
> 
> control-stack:introspect might either be a trivial wrapper for a
> procedure with the same semantics, or it might be built on top of some
> lower-level primitives, in any way convenient for an efficient
> implementation.  (For example Basile knows how the ocamlopt runtime uses
> a global hash mapping each function return address saved on the stack to
> something similar to what I call program point).

I would believe that you'll need to have some operator (or primitives)
mapping the machine program counter (i.e. in each call frame) into
epsilon0 handles (which might be non-trivial if you do loop-unrolling at
the epsilon0 level, but I thought you'll only do that at epsilon1).
> 
> If I recall correctly, Basile proposed the minor stack change above
> (I've just made explicit the required design changes, as I see them)
> plus call stack introspection.  The primitive semantics change is no big
> deal.  On the other hand epsilon0 is a very minimalistic language so I'm
> not very happy at the idea of adding a new form to it; however this
> really seems necessary.  So be it.

Well, I believe that adding such a form is what would make your Epsilon0
attractive...

> 
> > Also, being able to update the call stack, including by inserting new
> > (or removing existing) call frames in the middle of it, would be very
> > interesting, requires deep support from the compiler (and the language)
> > and would enable an interesting implementation of continuations and
> > call/cc. Current Epsilon1's approach (whole program CPS transformation)
> > is not really funny (and not very original neither).
> 
> However, I responded to Basile, if we go to the trouble of letting the
> user explicitly write on the stack, I'd really like to do more: let's
> have the user update call frames as well.  This is another way of
> implementing continuations, different from a transform, possibly simpler
> and more efficient.  It can also be used for coroutines, and green
> threads/fibers, which is useful.  Or simply exceptions --- the one most
> important language feature missing from epsilon1, in my opinion ---
> again in an efficient manner.  Let's call this hypothetical "the major
> stack change".

Well, updating a call frame is "removing" the old frame and "inserting"
the new one... But IIUC you cannot define "insert a frame in the middle
of the call stack" from the "change call frame" primitive...

> 
> If it was Basile who first came out with the major stack change idea
> instead of me, I apologize to him once more; again, I've been pondering
> on this for quite some time, and the origin of each idea now is a little
> muddled in my head.

No need to apologize, and it is probably not my idea neither (even if in
my own old 1990 PhD thesis I had similar ideas, calling "contexts" then
what is today known as call frames in a continuation-approach of the
call stack). Danvy, Schank, Lenat or Pitrat probably thought about this
before us... (and also the late Jean Marc Fouet with his Gosseyn
machine)

> 
> Let me draft a design of the major stack change.
> 
> Pros: the ability of writing non-local control features, even in epsilon0;
>       All the pros of the minor stack change.
> Cons: a *big* semantic change in epsilon0, plus the one from the minor
>       stack change.
> 
> This entails a further change in the semantics of primitives: now
> primitives should also be able to change the current stack: in the new
> primitive semantics, the stack becomes both a parameter and a result.
> Of course this new updated stack coming out of a primitive shall become
> the stack of the primitive caller.  This change has profound
> implications on the semantics: once a procedure call terminates, it's
> not necessarily the case any more that control goes back to its caller.
> For example the proofs (and the results) in Chapter 4 break.

Well, I thought that your primitives had the (defining & limiting)
properties that they did not alter the call stack! Anything that touches
the stack was in my eyes a language operation, not a primitive...
> 
> I'd have only one primitive perform destructive stack changes:
> 
> New primitive:
> * control-stack:set!
>   Two parameters: the new stack, its height;
>   Zero results [or possibly one boolean result, à-la-setjmp?].
> 
> control-stack:set! replaces the stack with the given one, pushed its
> zero- [or one-?] bundle result, and goes on from there.

Nice.
> 
> When this primitive is called we just forget about its caller: in
> general it will change into something completely different, starting
> from the execution of the last part of e0:call-with-saved-temporaries.
> Then, I suppose, at some point it would execute the last part of
> e0:call-with-saved-temporaries, who reified the stack in the first
> place.
[...]
> 
> Indeed.  I don't claim to understand all the implications of
> control-stack:set!, or how to use it in general.  Just like call/cc, I'm
> sure that its details are deeper than they look.  Simple in a way, but
> very scary.


You really should contact Jacques Pitrat (to other readers, see
http://bootstrappingartificialintelligence.fr/ and
http://jacques.pitrat.pagesperso-orange.fr/ ...) he knows even better
than I do why writing the control stack is so important! Of course, you
and him need to adapt terminology!

-- 
Basile STARYNKEVITCH         http://starynkevitch.net/Basile/
email: basile<at>starynkevitch<dot>net mobile: +33 6 8501 2359
8, rue de la Faiencerie, 92340 Bourg La Reine, France
*** opinions {are only mine, sont seulement les miennes} ***





reply via email to

[Prev in Thread] Current Thread [Next in Thread]