[MIT-Scheme-devel] callbacks proposal


From:	Matt Birkholz
Subject:	[MIT-Scheme-devel] callbacks proposal
Date:	Fri, 14 Oct 2005 20:37:03 -0000
> From: address@hidden (Matt Birkholz)
> Date: Sun, 9 Oct 2005 11:54:58 -0700
> 
> [...]
> The siren of synchronous callbacks calls to me.

Here's my proposal for doing "synchronous callbacks" -- running Scheme
callback procedures while the toolkit (C-CALL primitive or GMainLoop)
waits.  Asynchronous callbacks (just enqueueing C data during the
callback, and dequeueing it as Scheme data later via normal
primitives) are fine for girly men. :)

Just for context, here's my whole FFI User Guide thus far.  Skip
to the good bits.  Search for "callback".


* Foreign Function Interface (FFI)

Calls C functions (via callout trampolines) and manipulates C data
structures (via compile-time offsets).  Also generates callback
trampolines.

** Examples of Extended Syntax

    (C-include "prhello" "prhello-cdecl")
    (C-generate-trampolines "#include <gtk/gtk.h>")

    (C-> alien "GdkEvent any type") => integer
    (C->= alien "GdkEvent any type" value)

    (C-enum "GdkEventType GDK_MAP") => 14
    (C-enum "GdkEventType" 14) => |GDK_MAP|

    (C-call "callout_trampoline" args...) => integer or float or unspecific

    (C-callback "callback_trampoline")
    (C-callback (lambda args ...))

** Overview

A Scheme-like declaration of C types and functions is loaded by the
C-include syntax.

        (C-include "prhello" "prhello-cdecl")

The loaded file might look like this.

        (extern (* GtkWidget)                   ;gtk+-2.4.0/gtk/gtkwindow.h
                gtk_window_new
                (type GtkWindowType))

        (typedef GtkWindowType                  ;gtk+-2.4.0/gtk/gtkenums.h
                 (enum
                  (GTK_WINDOW_TOPLEVEL)
                  (GTK_WINDOW_POPUP)))

There are several limitations on the C types that can be declared in
this file.  Struct, union, enum and pointer types are allowed, but
bit-field members are not supported.  Trampolines are generated only
for functions whose parameters and return values are primitive types
or pointers.  Pointers to structs or unions are supported, but struct
or union parameters and return types passed by value are NOT.

The C-generate-trampolines syntax creates e.g. a prhello.c file
containing a trampoline for each declared C function.  The .c file
must be compiled into a dynamically-loadable library or statically
linked into the microcode.  A script to generate the trampolines might
look like this:

        (C-include "prhello" "prhello-cdecl")
        (C-generate-trampolines)

The C-call syntax is used to call a callout trampoline.  Arguments to
the trampoline can be ONLY integers, floats, strings or aliens (see
Alien Data).

        (C-include "prhello" "prhello-cdecl")
        (let ((alien (make-alien '|GtkWidget|)))
          (C-call "gtk_window_new" alien
                  (C-enum "GtkWindowType GTK_WINDOW_TOPLEVEL"))
          (if (alien-null? alien) (error "could not open new window"))
          alien)

As shown above, a C function returning a pointer type must be called
with an extra alien argument.  The function's return value will
clobber the alien's memory address.
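The return-by-alien convention can be pictured with a rough C sketch.  This is only an illustration, not the generated trampoline code: the `alien` struct, `fake_window_new` (standing in for gtk_window_new) and `trampoline_window_new` are all made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical alien: just a box holding a C address. */
typedef struct { void *address; } alien;

/* Hypothetical stand-in for a C function that returns a pointer. */
static int the_window;
static void *fake_window_new(int type) {
    return type == 0 ? (void *)&the_window : NULL;  /* NULL means failure */
}

/* Sketch of a callout trampoline: rather than consing a fresh alien
   for the result, it overwrites the address held by the alien that
   was passed as the extra first argument. */
static void trampoline_window_new(alien *result, int type) {
    assert(result != NULL);
    result->address = fake_window_new(type);
}

/* Counterpart of the alien-null? test used in the Scheme example. */
static int alien_is_null(const alien *a) {
    return a->address == NULL;
}
```

The caller allocates the alien once and checks it afterwards, which is the pattern the Scheme example follows.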

The C-include syntax reads C type declarations AT SYNTAX-TIME so that
subsequent syntax can expand into Scheme constants.  Some examples:

        (C-> alien "GdkRectangle y")
        ==>
        (c-peek-int alien 4)

        (C->= alien "GdkRectangle width" 0)
        ==>
        (c-poke-int alien 8 0)

        (C-enum "GtkWindowType GTK_WINDOW_POPUP")
        ==>
        1

The C-callback syntax is used both to pass the address of a callback
trampoline and to register a Scheme callback procedure.  Here is how
this syntax might be used to pass a callback trampoline address and ID
to a C function:

        (C-call "g_signal_connect" window "delete_event" 
                (C-callback "delete_event")     ;e.g. &Scm_delete_event
                (C-callback                     ;e.g. 314
                  (lambda (window event)
                    (C-call "gtk_widget_destroy" window)
                    0)))

The delete_event callback trampoline would have been defined like
this:

        (callback gint
                  delete_event
                  (window (* GtkWidget))
                  (event (* GdkEventAny))
                  (ID gpointer))

The reserved parameter name "ID" identifies the callback ID argument.
The parameter name "CALLBACK" is also reserved, declaring the
corresponding argument to be an alien-function whose address is the
function pointer required by the C function.  The declaration of
g_signal_connect looks like this:

        (extern void
                g_signal_connect
                (object (* GtkObject))
                (name (* gchar))
                (CALLBACK GtkSignalFunc)
                (ID gpointer))

The integer ID and CALLBACK entry address are cast to the declared
types of these parameters to keep C's type checking happy.
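Here is a small C sketch of the casting involved, assuming the ID is a small integer smuggled through a pointer-typed "user data" parameter.  The names `gpointer_like`, `sketch_delete_event` and `connect_and_fire` are invented for the example and are not GLib or microcode functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef void *gpointer_like;   /* hypothetical stand-in for gpointer */
typedef int (*handler_fn)(void *widget, void *event, gpointer_like id);

static intptr_t last_id_seen = -1;

/* Stand-in for a callback trampoline: it recovers the integer ID
   from the pointer-typed argument. */
static int sketch_delete_event(void *widget, void *event, gpointer_like id) {
    (void)widget;
    (void)event;
    last_id_seen = (intptr_t)id;
    return 0;
}

/* Stand-in for a connect-style toolkit function.  A real toolkit
   would store the pair and call back later; this one fires at once. */
static int connect_and_fire(handler_fn callback, gpointer_like id) {
    assert(callback != NULL);
    return callback(NULL, NULL, id);
}
```

A call such as `connect_and_fire(sketch_delete_event, (gpointer_like)(intptr_t)314)` passes the ID through unchanged, as long as it fits in a pointer-sized integer.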

** C Declarations

The C-include syntax loads C type and function declarations at syntax
time, and binds the identifier C-INCLUDES in its syntax/load-time
environment.  A C-include form should have two string or symbol
subforms.

        (C-include "library name" "mumble-cdecl")

The "library name" is provided to the system linker when looking up a
C function.  The "mumble-cdecl" specifies the .scm file to load.  This
file and any included files are read with a case-sensitive reader.
Each top-level form must look like one of these:

        (include "filename")
        (typedef Name TYPE)
        (struct Name (member1 TYPE)...)
        (union Name (member1 TYPE)...)
        (enum Name (member1 [VALUE])...)
        (extern RETURN-TYPE Name (param1 TYPE)...)
        (callback RETURN-TYPE Name (param1 TYPE)...)

[VALUE] is optional, and can be any of:

        INTEGER
        (<< 1 INTEGER), which equals 2^INTEGER.

TYPE and RETURN-TYPE can be any of the following.

        NAME
        (* TARGET-TYPE)
        (struct NAME)
        (union NAME)
        (enum NAME)
        (struct (MEMBER TYPE)...)
        (union (MEMBER TYPE)...)
        (enum (MEMBER [VALUE])...)

RETURN-TYPE and TARGET-TYPE can also be the symbol VOID.

Types do not have to be declared before they are referenced.  A struct
type can contain a pointer type pointing to itself.

*** Similarity to C Syntax

The syntax of the C declaration file was intended to be a simple,
Schemely reflection of C's syntax.  The declarations are read from a
separate file so that the identifiers can be case-sensitive.

For example, the C declaration

    typedef struct _GdkEvent GdkEvent;

translates into

    (typedef GdkEvent (struct _GdkEvent))

and a struct declaration, e.g.

    struct _GdkEventAny
    {
      GdkEventType type;
      GdkWindow *window;
      gint8 send_event;
    };

translates into

    (struct _GdkEventAny
      (type       GdkEventType)
      (window     (* GdkWindow))
      (send_event gint8))

** Alien Data

A C data structure is represented by an alien containing the data
structure's memory address.  New primitives can read a char, int,
float, double, or pointer at a small (fixnum) offset from an alien's
memory address, and return a Scheme fixnum, bignum, flonum, or string.
The C-> and C->= syntax use these primitives to get and put C struct
members using compile-time constant offsets.

        (C-> alien "GdkRectangle y")
        ==>
        (c-peek-int alien 4)

        (C->= alien "GdkRectangle width" 0)
        ==>
        (c-poke-int alien 8 0)
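The offsets in these expansions are what C's offsetof would report.  The sketch below assumes a struct with the same member order as GdkRectangle (x, y, width, height), with int standing in for gint.  `peek_int` and `poke_int` are illustrative helpers in the spirit of c-peek-int and c-poke-int, not the actual primitives.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same member order as GdkRectangle, with int standing in for gint. */
typedef struct { int x, y, width, height; } rect;

/* Read an int at a byte offset from an address. */
static int peek_int(const void *address, size_t offset) {
    int value;
    assert(address != NULL);
    memcpy(&value, (const char *)address + offset, sizeof value);
    return value;
}

/* Write an int at a byte offset from an address. */
static void poke_int(void *address, size_t offset, int value) {
    assert(address != NULL);
    memcpy((char *)address + offset, &value, sizeof value);
}
```

Where int is 4 bytes, `offsetof(rect, y)` is 4 and `offsetof(rect, width)` is 8, matching the constants in the expansions above.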

** Managed Aliens

Scheme's aliens can represent memory and other system resource
allocations that should be freed when Scheme has no further use for
them.  These aliens and their "free" procedures are kept in a weak
table, so that their resources can be freed when they are garbage
collected.  They are marked as though already freed when restored in
a band.

The malloc procedure returns a managed alien that will automatically
free the malloced memory when it is garbage collected (if not before).

        (free (malloc '|GdkRectangle|))
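C has no weak tables, so the following is only a loose analogy: a fixed table pairs each resource with its "free" procedure, and a `reachable` flag stands in for what the garbage collector would discover.  All names here (`managed`, `managed_malloc`, `managed_free`, `sweep`) are invented for the sketch.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_MANAGED 8

/* One entry per managed resource. */
typedef struct {
    void *address;               /* NULL once freed (or never used) */
    void (*free_proc)(void *);   /* how to release the resource */
    int reachable;               /* stand-in for GC reachability */
} managed;

static managed table[MAX_MANAGED];
static int freed_count = 0;

static void counting_free(void *p) { free(p); freed_count++; }

/* Allocate memory and record its free procedure.
   Returns a table index, or -1 on failure. */
static int managed_malloc(size_t size) {
    for (int i = 0; i < MAX_MANAGED; i++) {
        if (table[i].address == NULL) {
            void *p = malloc(size);
            if (p == NULL) return -1;
            table[i] = (managed){ p, counting_free, 1 };
            return i;
        }
    }
    return -1;
}

/* Explicit free; freeing twice is harmless. */
static void managed_free(int i) {
    assert(i >= 0 && i < MAX_MANAGED);
    if (table[i].address != NULL) {
        table[i].free_proc(table[i].address);
        table[i].address = NULL;
    }
}

/* After a "GC": release whatever is no longer reachable. */
static void sweep(void) {
    for (int i = 0; i < MAX_MANAGED; i++)
        if (!table[i].reachable) managed_free(i);
}
```

Clearing `address` on free is what makes "if not before" safe: a resource freed explicitly is skipped by the later sweep.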

** Alien Function Implementation

The C-call syntax applies call-alien to an alien-function structure,
which caches a trampoline's entry point.  The structure also contains
all the information needed to load the trampoline on demand (its name and
library).  Once call-alien has an entry address, it can invoke the
trampoline (via the C-CALL primitive).  The trampoline gets its
arguments off the Scheme stack, converts them to C values, calls the C
function, conses a result, and returns it to Scheme.  As a special
case, a function returning a pointer type expects an extra first
argument -- an alien to be updated with the returned pointer address.
(This eliminates lots of consing AND provides for safe handling of
malloced memory and other OS resources that must be freed.)
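The load-on-demand caching might look roughly like this in C.  The `alien_function` record, `call_alien` and `link_lookup` are hypothetical; `link_lookup` stands in for the system linker with a single known symbol, so the sketch does not depend on any real dynamic loader.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*entry_fn)(int);

static int twice(int n) { return 2 * n; }

/* Hypothetical stand-in for the system linker. */
static int lookups = 0;
static entry_fn link_lookup(const char *library, const char *name) {
    lookups++;
    if (strcmp(library, "prhello") == 0 && strcmp(name, "twice") == 0)
        return twice;
    return NULL;
}

/* Hypothetical alien-function record: enough to load the trampoline
   on demand, plus a cache for its entry point. */
typedef struct {
    const char *library;
    const char *name;
    entry_fn entry;          /* NULL until the first successful call */
} alien_function;

/* Resolve on first use, then reuse the cached entry.
   Returns 0 on success, -1 if the symbol cannot be found. */
static int call_alien(alien_function *f, int arg, int *result) {
    assert(f != NULL && result != NULL);
    if (f->entry == NULL) {
        f->entry = link_lookup(f->library, f->name);
        if (f->entry == NULL) return -1;
    }
    *result = f->entry(arg);
    return 0;
}
```

Calling the same `alien_function` twice should hit the linker only once; the second call goes straight through the cached entry.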

* Callback Trampolines

A callback function definition is like a normal extern, but produces
trampoline code that works in reverse.  The trampoline copies the
callback arguments into Scheme ints, floats, strings and aliens, then
applies the associated Scheme closure.  The entire threaded Scheme
runtime might run, esp. if the callback handler blocks on an i/o port
-- yielding the current thread.  Other threads can run, but timer
interrupts should not yield to the toolkit (via calls to RETURN-TO-C).
The toolkit is stuck until Scheme continues back to the trampoline.
(Still, MY callbacks will be coded to run uninterrupted, without
engaging in i/o, and without signaling any errors. :-)  When the handler
returns, the trampoline converts its Scheme value (e.g. an alien) to the
callback's declared C return type (e.g. (* GtkWidget)).

The Scheme callback handler is identified by a fixnum ID, which is
assigned when the handler is registered.  Once registered, the closure
is pinned -- the procedure
and its environment cannot be garbage collected.  The closure will
remain available to handle calls from the C world until it is
de-registered.  A band restore releases all registered callbacks.
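As a sketch of the registration life cycle, here is a tiny C table in which the slot index doubles as the callback ID.  In C nothing is garbage collected, so "pinned" just means the table holds the pointer.  The function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CALLBACKS 16

typedef int (*handler)(int);

/* Slot index doubles as the callback ID; NULL marks a free slot. */
static handler registry[MAX_CALLBACKS];

/* Register a handler; returns its ID, or -1 if the table is full. */
static int register_callback(handler h) {
    assert(h != NULL);
    for (int id = 0; id < MAX_CALLBACKS; id++) {
        if (registry[id] == NULL) {
            registry[id] = h;
            return id;
        }
    }
    return -1;
}

/* Find the handler for an ID; NULL if unknown or de-registered. */
static handler lookup_callback(int id) {
    if (id < 0 || id >= MAX_CALLBACKS) return NULL;
    return registry[id];
}

static void deregister_callback(int id) {
    if (id >= 0 && id < MAX_CALLBACKS) registry[id] = NULL;
}

/* Analog of a band restore: every registration is released. */
static void release_all_callbacks(void) {
    for (int id = 0; id < MAX_CALLBACKS; id++) registry[id] = NULL;
}

static int add_one(int n) { return n + 1; }
```

A trampoline handed a stale ID gets NULL from `lookup_callback` and has to decide what to return to the toolkit; the sketch leaves that policy open.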

* Callback Implementation

Callback trampolines edit the interpreter's continuation (stack),
pushing two micro-continuations: RC_END_OF_COMPUTATION, then
RC_CALLBACK_APPLY.  When the trampoline re-enters the interpreter
(recursively), it will cons callback arguments, apply the handler, and
halt, returning to the trampoline.  The trampoline uses the
interpreter's val_register to construct the appropriate C value and
returns to the toolkit.

The trampoline must re-enter the interpreter before it conses callback
arguments.  If the construction of a callback argument caused a GC
abort, the interpreter would throw (longjmp) to an earlier state of
the interpreter (if any).  The trampoline needs to re-enter the
interpreter (setjmp) and have it run a bit of (restartable) C code to
cons arguments (with possible longjmps), then a bit of a Scheme apply
(more longjmps), some value validation (possible error abort), and
finally a halt.  Outside the interpreter, the trampoline can translate
the Scheme val_register (no GCs) and return a value to C.
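The restart-after-abort idea can be shown in miniature with setjmp and longjmp.  This is a toy under obvious assumptions: the "heap" is a counter, "GC" just resets it, and `interpret_step`, `cons_words` and `cons_callback_arguments` are made-up names, not microcode functions.

```c
#include <assert.h>
#include <setjmp.h>

static jmp_buf interpreter_state;  /* where aborts land */
static int free_words = 0;         /* pretend heap space */
static int gc_count = 0;
static int consed = 0;

/* Allocate, or abort to the interpreter if the heap is exhausted. */
static void cons_words(int n) {
    assert(n > 0);
    if (free_words < n)
        longjmp(interpreter_state, 1);  /* GC abort: abandon this frame */
    free_words -= n;
    consed += n;
}

/* Restartable step: it keeps no partial results across an abort,
   so running it again from the top is safe. */
static void cons_callback_arguments(void) {
    consed = 0;
    cons_words(3);
    cons_words(2);
}

/* Re-entering the "interpreter": establish the jump target first,
   then run the step; after an abort, collect and run it again. */
static int interpret_step(void) {
    if (setjmp(interpreter_state) != 0) {
        gc_count++;
        free_words = 8;                 /* "garbage collect" */
    }
    cons_callback_arguments();
    return consed;
}
```

Because the setjmp happens before any consing, an abort lands back inside `interpret_step` rather than in some earlier interpreter state, which is the reason the trampoline re-enters the interpreter first.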

The bit of C code that must run inside a recursive call to Interpret
is generated as the second part of the two part callback trampoline.
The first part is a C function with the signature required by the
toolkit.  It does little more than push the C arguments and an entry
point onto the obstack (like a C-flavored stack environment :-).  The
entry point on the obstack is the address of the second part of the
trampoline.  The first part does all of this without GCing, then calls
Interpret().

The interpreter, via pop_return, dispatches off the RC_CALLBACK_APPLY
return code, reads the entry point off the obstack and calls it -- the
second part of the callback trampoline.  The second part finds the
callback handler in the fixed-objects array, builds an application
frame for it (including a list of callback arguments), pushes an
RC_INTERNAL_APPLY and continues to pop_return.  The list of arguments
is carefully constructed to keep it anchored (made reachable to the
GC) in the application frame.  The RC_INTERNAL_APPLY continues to the
RC_END_OF_COMPUTATION and Interpret() returns to the trampoline.
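The two-part shape can be sketched in plain C.  Here the obstack is a small array of frames, `interpret` stands in for the dispatch on RC_CALLBACK_APPLY, and `handlers` stands in for the table of registered handlers.  Every name is hypothetical and GC is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Toy "obstack": a LIFO of frames, each holding the entry point of
   part two plus the raw C arguments. */
typedef struct frame {
    int (*entry)(const struct frame *);
    int callback_id;
    int c_arg;
} frame;

static frame obstack[8];
static size_t obstack_top = 0;

typedef int (*scheme_handler)(int);
static scheme_handler handlers[4];   /* stand-in for registered handlers */

/* Part two: runs "inside the interpreter"; it finds the handler and
   applies it to the converted argument. */
static int part_two(const frame *f) {
    assert(handlers[f->callback_id] != NULL);
    return handlers[f->callback_id](f->c_arg);
}

/* Stand-in for Interpret(): pop the frame, call its entry point,
   then "halt" with the resulting value. */
static int interpret(void) {
    assert(obstack_top > 0);
    frame f = obstack[--obstack_top];
    return f.entry(&f);
}

/* Part one: has the signature the toolkit expects.  It only pushes
   its arguments and part two's address, then enters the interpreter. */
static int part_one(int c_arg, int callback_id) {
    assert(obstack_top < 8);
    obstack[obstack_top++] = (frame){ part_two, callback_id, c_arg };
    return interpret();   /* the value register, converted back to C */
}

static int negate(int n) { return -n; }
```

Since frames are pushed and popped in LIFO order, a callback that itself triggers another callback nests naturally in this sketch.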

** Callouts with Callbacks

Callbacks are assumed to be possible during a callout.  The
C-CALL primitive invokes a callout trampoline that converts Scheme
arguments to C arguments.  The process requires no GC, but may abort
with errors, so it looks like a normal primitive EXCEPT that at the
point it calls the toolkit it is no longer restartable.  It also needs
to seal the stack against a possible GC in a callback.  When the
toolkit returns, the trampoline may find that the original arguments
(i.e. aliens that need to be updated) have been moved by the GC.

The primitive continuations of prmcon.[hc] allow a primitive to be
split into two restartable pieces, with GCs possible between them.
The first part may be restarted, like a normal primitive (by GC
interrupts or errors/traps), UNTIL it calls suspend_primitive to seal
the stack with a RC_PRIMITIVE_CONTINUE micro-continuation (return code
and associated stack frame(s)).  From that point until the end of the
first part, the primitive should not GC.  Any aborts will cause the
second part of the primitive to run.

The suspend_primitive function takes an array of pointers to Scheme
objects (i.e. a stack-allocated array of temporary Scheme variables)
and includes them in a frame of the RC_PRIMITIVE_CONTINUE
micro-continuation.  The second part of the primitive gets a copy of
this array (in the heap!?) when it (re)starts, and can access the
original arguments to the primitive too.

For C-CALL, the first part of the primitive would validate the Scheme
arguments, then suspend itself.  With the stack sealed, it can call
the C function even if this may provoke a GC (during a callback).
When the C function returns, the primitive places the resulting C
value on the obstack.  It cannot cons results or a GC interrupt might
restart the entire first part of the primitive.

The second part of the C-CALL primitive must (re)get its arguments off
the stack, and it must get the C result off the obstack.  Eventually,
it can update aliens (with "OUT" arguments), cons Scheme objects like
bignums and return like a normal primitive.  The second part does not
call the C function, so it can restart without re-calling the C
function.  It can handle GCs and other interrupts, signal errors, etc.
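The point of the split is that only the second part is ever re-run.  Here is a toy C model of that property, with a counter standing in for pending GC interrupts and a static slot standing in for the result on the obstack.  The names `part_one`, `part_two`, `c_call` and `sealed` are illustrative only.

```c
#include <assert.h>

static int c_calls = 0;     /* how many times the C function ran */
static int saved_result;    /* stand-in for the result slot on the obstack */
static int sealed = 0;      /* set once part one has suspended */

static int c_function(int n) { c_calls++; return n * n; }

/* Part one: validate, seal, call out, stash the raw C result.
   Nothing is consed after the seal. */
static int part_one(int arg) {
    if (arg < 0) return -1;   /* still restartable: nothing has happened */
    sealed = 1;               /* analog of suspend_primitive */
    saved_result = c_function(arg);
    return 0;
}

/* Part two: builds the final result.  Returns 1 when done, or 0 when
   an "interrupt" forces it to be restarted. */
static int part_two(int *interrupts_pending, int *value) {
    assert(sealed);
    if (*interrupts_pending > 0) {
        (*interrupts_pending)--;
        return 0;
    }
    *value = saved_result;
    return 1;
}

/* Driver: part one runs once; part two is retried until it finishes. */
static int c_call(int arg, int interrupts) {
    int value = 0;
    if (part_one(arg) != 0) return -1;
    while (!part_two(&interrupts, &value)) {
        /* restart part two only; the C function is not called again */
    }
    return value;
}
```

However many times part two restarts, `c_calls` stays at one.  Note that the single `saved_result` slot is exactly the kind of shared state that interleaved C-CALLs would trample.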
A thread switch, however, in the middle of a C-CALL, may cause the
interpreter to continue with the second part of an earlier C-CALL
primitive application, which would use the later result as if it was
the earlier...