Initial thoughts.

g-wrap-dev
From:	Rob Browning
Subject:	Initial thoughts.
Date:	Wed, 29 Oct 2003 17:27:37 -0600
User-agent:	Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3 (gnu/linux)
What follows is a summary of a number of things that I've been
thinking about off and on with respect to g-wrap over the past year or
so.  Some of these items may be more reasonable than others, but I
mention them all since even the less reasonable bits may lead to a
good discussion.

Also, I encourage you to start a new thread/subject when replying to
any particular point if you like.  That might help keep the respective
discussions clearer.

As it stands, I feel like g-wrap is fairly flexible, but not all that
friendly.  So the question is, what happens next (or what should
happen next).  Bear in mind (as one backdrop for much of this) that
I'm also quite interested in the possibility of adding some form of
compilation to guile itself one of these days, so some of my thoughts
may be tinged a bit by that.

In particular:


  * With respect to shared lib issues, I'm not at all opposed to
    trying to simplify things, or rearrange how things are currently
    designed.  However, there seems to be a bit of a minefield where
    shared libs, versioning, partitioning into separate libraries, and
    (backward) compatibility are concerned, especially when you throw
    in libraries with sub-library dependencies, and runtime,
    dlopen-style dynamic linking (all of which affect g-wrap directly
    :/).

    As an example, right now g-wrap is probably broken.  Imagine
    you're debian, and you want to package the latest g-wrap for both
    stable and unstable.  Also imagine stable has guile 1.4 and
    unstable has guile 1.6.  Two different major versions of guile
    have different library sonames, and so are not binary compatible.
    However, if you build g-wrap on both systems, you'll get two sets
    of g-wrap libraries with the same sonames, even though they're
    linked against two different, incompatible versions of libguile
    and are, by extension, incompatible themselves.  This is somewhat
    bad.  This can probably be fixed, and I've been involved in some
    fairly in-depth conversations on the issue on both guile-devel and
    debian-devel, but some of the cures might be almost as bad as the
    disease.  Here are some of the relevant discussions:

    http://mail.gnu.org/archive/html/guile-devel/2002-12/msg00074.html
    http://mail.gnu.org/archive/html/guile-devel/2002-12/msg00061.html
    http://lists.debian.org/debian-devel/2002/debian-devel-200212/msg00995.html

    Honestly, right now, off the top of my head, I don't recall where
    all this eventually ended up (i.e. what we might need to do on the
    g-wrap side, and what might still need to be done on the guile
    side).  Also, there may or may not still be some outstanding
    issues with respect to dynamic-link (via libtool dlopen) and its
    lack of a versioned "lt_dlopen".

    Anyway, on both counts, I think we're probably going to want to
    look and see what the current state of affairs is before we can
    decide how we want to proceed.


  * One thing that's somewhat awkward in g-wrap right now is the way
    that you have to construct the C code that's eventually going to
    end up in the relevant wrappers.  Right now, you just return a
    tree of strings (and possibly a few "magic" symbols) and that
    string tree is eventually flattened and dumped to the generated C
    files.  In addition to being awkward to construct and read, the
    string tree approach leaves you with an output representation
    that's more or less opaque.

    One alternative I've considered (which fits in with some random
    speculations I've had wrt guile itself) is whether or not it would
    be helpful to introduce a "sexp representation for C" (CSE) to
    g-wrap.  i.e. to define a sexp grammar that allows you to easily
    represent all (or maybe just a relevant subset) of C, that's easy
    to render to C at output time, and that you can then use to
    replace code like this:

      (define (scm->c-ccg c-var scm-var typespec status-var)
        (let* ((sv scm-var)
               (wct-var wct-var-name)
               (type-check-code
                (list
                 "SCM_FALSEP(" sv ") "
                 "  || gw_wcp_is_of_type_p(" wct-var ", " sv ")"))
               (scm->c-code
                (list
                 "if(SCM_FALSEP(" sv ")) " c-var " = NULL;\n"
                 "else " c-var " = gw_wcp_get_ptr(" sv ");\n")))

          (list "if(!(" type-check-code "))" `(gw:error ,status-var type ,sv)
                "else {" scm->c-code "}\n")))

    with something perhaps like this:

      (define (scm->c-ccg c-var scm-var typespec status-var)
        `(if (not (or ("SCM_FALSEP" ,scm-var)
                      ("gw_wcp_is_of_type_p" ,wct-var-name ,scm-var)))
             (gw:error ,status-var type ,scm-var)
             (if ("SCM_FALSEP" ,scm-var)
                 (set! ,c-var NULL)
                 (set! ,c-var ("gw_wcp_get_ptr" ,scm-var)))))

    In truth, there would almost certainly be some tricky bits here,
    but the basic idea of a CSE representation is the main thing I was
    thinking about.  Aside from being easier to deal with when editing
    a scheme file, it also opens up the possibility of manipulating
    the C code in interesting ways before output.
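
    As a rough illustration of the rendering step, here is a minimal
    sketch of a renderer for a tiny CSE subset.  Everything in it is
    an assumption made for illustration: the procedure names
    (join-strings, cse-expr->c, cse-stmt->c), the convention that a
    list headed by a string is a function call, and the choice to
    support only if, not, or, and set!.  It is not g-wrap's design and
    it ignores forms like gw:error entirely.

```scheme
;; Hypothetical sketch: render a tiny CSE subset to a C string.
;; Only `if', `not', `or', `set!', and calls of the form
;; ("fn" arg ...) are handled.

(define (join-strings strs sep)
  ;; Concatenate STRS, placing SEP between neighbours.
  (cond ((null? strs) "")
        ((null? (cdr strs)) (car strs))
        (else (string-append (car strs) sep
                             (join-strings (cdr strs) sep)))))

(define (cse-expr->c e)
  ;; Render an expression form to a C expression string.
  (cond ((symbol? e) (symbol->string e))
        ((string? e) e)                 ; bare strings pass through
        ((number? e) (number->string e))
        ((not (pair? e)) (error "unsupported CSE expression" e))
        ((string? (car e))              ; ("fn" arg ...) is a call
         (string-append (car e) "("
                        (join-strings (map cse-expr->c (cdr e)) ", ")
                        ")"))
        ((eq? (car e) 'not)
         (string-append "!(" (cse-expr->c (cadr e)) ")"))
        ((eq? (car e) 'or)
         (string-append "("
                        (join-strings (map cse-expr->c (cdr e)) " || ")
                        ")"))
        (else (error "unsupported CSE expression" e))))

(define (cse-stmt->c s)
  ;; Render a statement form to a C statement string.
  (cond ((and (pair? s) (eq? (car s) 'if))
         (string-append "if (" (cse-expr->c (cadr s)) ") { "
                        (cse-stmt->c (caddr s))
                        " } else { "
                        (cse-stmt->c (cadddr s))
                        " }"))
        ((and (pair? s) (eq? (car s) 'set!))
         (string-append (cse-expr->c (cadr s)) " = "
                        (cse-expr->c (caddr s)) ";"))
        (else (string-append (cse-expr->c s) ";"))))

;; Usage: "arg" and "ptr" stand in for scm-var and c-var.
(display (cse-stmt->c
          '(if ("SCM_FALSEP" "arg")
               (set! "ptr" NULL)
               (set! "ptr" ("gw_wcp_get_ptr" "arg")))))
(newline)
;; prints: if (SCM_FALSEP(arg)) { ptr = NULL; } else { ptr = gw_wcp_get_ptr(arg); }
```

    Because the input stays a plain list until the final rendering
    call, any transformation of the C code could in principle be an
    ordinary list-to-list procedure applied before cse-stmt->c.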

    Along these lines, I've actually written a simple grammar that
    should represent all of C (at least according to the ANSI C
    grammar), along with an associated renderer, but it's just a toy
    right now.

    (And of course, if guile ever had an "inline C" syntax like this,
     aside from being able to embed "fast bits of C" inside scheme
     files (presuming you're willing to compile those files), there's
     also the possibility of adding support for precise GC to that C
     code automatically, etc...)


  * On another topic, that of automatic wrapper generation, one of the
    tricky bits is how you get *reliable* API information.  GTK has
    avoided this problem by (quite nicely) providing an easily
    parsable spec.  However, in cases where such information isn't
    readily available, one thing people often consider is parsing the
    headers themselves.  I believe SWIG does this, but I've always
    been wary of that approach because unless the parser you're using
    to parse the headers is *identical* to the one you're going to
    eventually use for compilation, you can't be sure they'll
    interpret the headers the same way (using the same search paths,
    same __foo__ extensions, same defines, etc.).  After thinking a
    bit, I realized that you could alleviate much of the uncertainty by
    just requiring the user to preprocess any header before analysis,
    with the same compiler and the same options that they intend to
    eventually use during compilation.  Of course you'll still have to
    have some way to indicate *which* functions you're interested in
    wrapping, since you'll be likely to pull in a whole bunch of
    irrelevant prototypes during the preprocessing.
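
    One possible way to cut down the irrelevant prototypes, sketched
    below under stated assumptions, is to keep only the lines of the
    preprocessed output that originate from the header of interest.
    The sketch assumes GCC-style linemarkers (lines of the form
    # 12 "foo.h" 1) in the preprocessor output; the procedure names
    (find-char, linemarker-file, lines-from-header) are hypothetical,
    and the header name must match the linemarker's file name exactly.
    Selecting individual functions within that header would still
    need a separate mechanism.

```scheme
;; Hypothetical sketch: keep only the lines of preprocessed output
;; that come from one header, using GCC-style linemarkers such as
;;   # 12 "foo.h" 1

(define (find-char str ch start)
  ;; Index of the first CH in STR at or after START, or #f.
  (let loop ((i start))
    (cond ((>= i (string-length str)) #f)
          ((char=? (string-ref str i) ch) i)
          (else (loop (+ i 1))))))

(define (linemarker-file line)
  ;; Return the file name given by a linemarker line, or #f if LINE
  ;; is not a linemarker.
  (and (> (string-length line) 1)
       (char=? (string-ref line 0) #\#)
       (char=? (string-ref line 1) #\space)
       (let* ((q1 (find-char line #\" 0))
              (q2 (and q1 (find-char line #\" (+ q1 1)))))
         (and q2 (substring line (+ q1 1) q2)))))

(define (lines-from-header lines header)
  ;; LINES is a list of strings (preprocessed output, one per line).
  ;; HEADER is the exact file name to keep.  Linemarkers themselves
  ;; are dropped from the result.
  (let loop ((rest lines) (current #f) (kept '()))
    (if (null? rest)
        (reverse kept)
        (let ((file (linemarker-file (car rest))))
          (cond (file (loop (cdr rest) file kept))
                ((and current (string=? current header))
                 (loop (cdr rest) current (cons (car rest) kept)))
                (else (loop (cdr rest) current kept)))))))

;; Usage with a hand-written stand-in for preprocessor output:
(define sample-output
  '("# 1 \"app.c\""
    "# 1 \"/usr/include/stdio.h\" 1"
    "int printf(const char *, ...);"
    "# 1 \"foo.h\" 1"
    "int foo_open(const char *path);"
    "# 2 \"app.c\" 2"
    "int main(void);"))

(lines-from-header sample-output "foo.h")
;; => ("int foo_open(const char *path);")
```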

    While considering the above, someone suggested I might want to
    look at CIL (http://manju.cs.berkeley.edu/cil/).  I checked it out
    and talked with the author for a bit.  It sounds like it might
    well allow arbitrarily sophisticated analysis, including the
    fairly simple extraction of prototypes of interest and definitions
    of arbitrarily complex types.

    To some extent CIL ties in with my speculations about CSE above.
    If you have some way to translate from C to a syntax that's easier
    to work with (like sexps), it may be a lot easier to do many kinds
    of fancy analysis and manipulation (precise gc, redundant type
    check elimination, etc.).


  * I have also spoken to one of the people working on OpenMCL (a
    good, and now free, implementation of Common Lisp).  They use a
    modified version of the ffigen project to generate their C FFI.
    They can translate from C headers to a sexp C API rep that they
    can then easily manipulate from lisp.  I believe the OpenMCL tool
    differs from the older ffigen in large part by using gcc code for
    the parsing instead of lcc.

      http://openmcl.clozure.com/Doc/interface-translation.html
      http://www.ccs.neu.edu/home/lth/ffigen/

    Some notable points about my conversation with the OpenMCL
    developer:

      - they would be very interested in working with us on breaking
        out their parser as a standalone .h->sexp converter and
        adjusting the sexp API syntax if necessary to be both Common
        Lisp and scheme friendly.  He may also have some interest from
        other Common Lisp groups.

      - it would be nice to port their parser to gcc-3.3; it's
        currently based on gcc-2.95 code (or was back when I spoke to
        him).

      - he claims the biggest problem in most cases is (as I suspected
        above) trying to make sure you get the right set of compiler
        options (i.e. -DFOO, -Ibar, etc.) when generating the API
        spec.  If you don't, you may generate a bad spec.
  
    If we did decide we were interested in working on this project,
    I've wondered whether or not CIL (above) might be able to provide
    a more comprehensive foundation for the C->sexp translation, but I
    haven't had time to investigate any further.

  * I also looked at and speculated a bit about using libffi a while
    ago, but it sounds like Andreas is way ahead of me there :>

  * Here's one other page of at least some tangential interest if
    you've never looked at it before:
    http://gcc.gnu.org/frontends.html.  In particular, the ksi project
    was fun to play around with.  It's a fairly thin front end for gcc
    that lets you write gcc parse trees in a lisp-like syntax as your
    code.  Given gcc's internal tight coupling to the C ABI it's not
    all that clear ksi would be any better as a target backend for a
    scheme compiler than just plain C, but it's still somewhat
    interesting.

-- 
Rob Browning
rlb @defaultvalue.org and @debian.org; previously @cs.utexas.edu
GPG starting 2002-11-03 = 14DD 432F AE39 534D B592  F9A0 25C8 D377 8C7E 73A4