pika-dev

Re: [Pika-dev] Re: willing to help


From: Tom Lord
Subject: Re: [Pika-dev] Re: willing to help
Date: Fri, 12 Dec 2003 13:17:23 -0800 (PST)


    >> Jose A. Ortega Ruiz

    >> i'm writing to offer my help, as a developer, on the Pika
    >> project, in case that you think, of course, that i can actually
    >> be of any help!

I'm quite embarrassed because I lost track of your original message
before getting a chance to reply to it.

    > From: Matthew Dempsky <address@hidden>

    > We just talked on IRC about this,  [....]

Thanks, Matthew, for doing that.

    >> i've taken a cursory look at the pika's docs and source code,
    >> and would appreciate any hint about the right path for studying
    >> the latter in detail. and, of course, if you can point me to
    >> something concrete to start hacking, i'm willing to try my hand
    >> at it :)

Just by way of some orientation:

The implementation plan is always subject to change in the face of a
better idea, but currently is (in broad outline) this:


1) Get a basic "Scheme run-time system" written in C

   [This is the step that is going on now.]

   The run-time system implements the basic value types (numbers,
   pairs, vectors, characters, strings, etc.) including most of the
   primitive procedures that operate on these and a garbage collector.

   Among the primitive procedures being provided are `read' and
   `write' -- the idea being that at a later stage we can use these
   to read bytecompiled code produced by another Scheme implementation
   used for bootstrapping.

   Some procedures, particularly higher order procedures (e.g., `map')
   are being skipped for now -- simply to avoid the issues of
   environment and continuation allocation (including tail-call
   issues).


2) Implement a simple interpreted VM.

   The VM I have in mind will be a "stackless VM" (it won't use the
   C stack for interpreted procedure calls).
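
   As an illustration of what "stackless" means here, the following is
   a minimal sketch of a driver loop (a trampoline) in which an
   interpreted call is recorded in the VM state instead of becoming a
   C function call.  All names (`struct vm', `step_fn', `count_up',
   `vm_run') are hypothetical and are not taken from Pika's source;
   the actual VM may be organized quite differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; not taken from Pika's source. */
struct vm;
typedef void (*step_fn)(struct vm *);

struct vm {
  step_fn next;   /* next step to run, or NULL to halt */
  long arg;       /* argument of the pending call */
  long result;    /* accumulated result */
};

/* An "interpreted procedure" that calls itself arg times.  Instead of
   recursing in C, it records the call in the VM and returns. */
static void count_up(struct vm *m)
{
  if (m->arg == 0) {
    m->next = NULL;          /* halt */
    return;
  }
  m->arg -= 1;
  m->result += 1;
  m->next = count_up;        /* the "call": schedule the next step */
}

/* Driver loop: every interpreted call returns here, so the C stack
   depth stays constant however many interpreted calls are made. */
static long vm_run(step_fn entry, long arg)
{
  struct vm m = { entry, arg, 0 };
  assert(arg >= 0);
  while (m.next != NULL)
    m.next(&m);
  return m.result;
}
```

   Calling `vm_run(count_up, 1000000)' performs a million interpreted
   calls while the C stack holds only `vm_run' and one step function
   at any moment.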

   It has "the usual" three ordinary registers: environment,
   continuation, and pc.

   It also has an infinite number of "tuple registers" used to pass
   parameters to a procedure (or receive them in a continuation).
   (I.e., there's a param[0] register, a param[1] register, etc.  Also
   a number_of_params register, a procedure_to_call register, and a
   params_tail register: a list of parameters as might be passed to
   `apply'.)
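
   The register set described above could be sketched in C roughly as
   follows.  This is only an illustration: `value_t', `struct vm_regs'
   and `vm_set_param' are hypothetical names, the "infinite" tuple
   registers are modelled as an array grown on demand, and having
   `vm_set_param' update number_of_params is an assumption of the
   sketch, not something stated in the plan.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef void *value_t;       /* hypothetical stand-in for a Scheme value */

struct vm_regs {
  /* the three ordinary registers */
  value_t env;               /* environment */
  value_t cont;              /* continuation */
  size_t pc;                 /* program counter */

  /* the parameter-passing registers */
  value_t procedure_to_call;
  size_t number_of_params;
  value_t *param;            /* param[0], param[1], ...; grown on demand */
  size_t param_capacity;     /* allocated length of param[] */
  value_t params_tail;       /* list of extra parameters, as for `apply' */
};

/* Store v in param[n], growing the array if needed.
   Returns 0 on success, -1 if memory could not be allocated. */
static int vm_set_param(struct vm_regs *r, size_t n, value_t v)
{
  if (n >= r->param_capacity) {
    size_t cap = r->param_capacity ? r->param_capacity : 4;
    while (cap <= n)
      cap *= 2;
    value_t *p = realloc(r->param, cap * sizeof *p);
    if (p == NULL)
      return -1;
    for (size_t i = r->param_capacity; i < cap; i++)
      p[i] = NULL;           /* unused registers read as NULL */
    r->param = p;
    r->param_capacity = cap;
  }
  r->param[n] = v;
  if (n + 1 > r->number_of_params)
    r->number_of_params = n + 1;
  return 0;
}
```

   A zero-initialized `struct vm_regs' is a valid empty register set;
   the caller frees `param' when the VM is torn down.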

   The machine has ~10 instructions, some of which are fairly complex
   macro instructions (for example, all of the param[N] values for a
   given procedure call are loaded by a single instruction).

   The instruction set is "structured" in a sense similar to Java:
   not all possible sequences are legal programs;  it's a statically
   verifiable property of a given set of instructions whether or not
   it is legal.   In essence, certain parts of the state of the VM
   (such as the contour of the environment) are fixed for a given
   instruction in a program and statically computable.

   Although I don't expect the first-cut of this interpreter to be
   especially fast, there is one optimization I'd like to build in
   from the start:   namely that environments and continuations will:
   be allocated in a stack-like fashion; de-allocated in a stack-like
   fashion if they are not "captured" by the running program; and
   copied to the heap if and when they are first captured.
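
   One way to picture this optimization is the sketch below: frames
   live in a fixed-size stack, popping an uncaptured frame is just a
   decrement, and capturing a frame copies it (and any stack-resident
   ancestors) to the heap exactly once.  Everything here is an
   assumption made for illustration: the names (`struct frame',
   `frame_push', `frame_pop', `frame_capture'), the fixed slot count,
   the forwarding pointer, and the use of malloc in place of Pika's
   garbage-collected heap.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

enum { FRAME_SLOTS = 4, STACK_FRAMES = 64 };

struct frame {
  struct frame *parent;      /* enclosing environment / continuation */
  struct frame *forward;     /* heap copy, once captured; else NULL */
  int on_heap;               /* nonzero for heap-resident frames */
  long slot[FRAME_SLOTS];    /* stand-in for the frame's contents */
};

struct frame_stack {
  struct frame frames[STACK_FRAMES];
  size_t top;                /* number of frames currently in use */
};

/* Stack-like allocation: take the next free frame, or NULL if full. */
static struct frame *frame_push(struct frame_stack *s, struct frame *parent)
{
  if (s->top == STACK_FRAMES)
    return NULL;
  struct frame *f = &s->frames[s->top++];
  memset(f, 0, sizeof *f);
  f->parent = parent;
  return f;
}

/* Stack-like de-allocation for frames that were never captured
   (a captured frame's heap copy is unaffected by the pop). */
static void frame_pop(struct frame_stack *s)
{
  assert(s->top > 0);
  s->top--;
}

/* Capture: copy f and its stack-resident ancestors to the heap.
   A second capture of the same frame returns the same copy.
   Returns NULL if memory could not be allocated. */
static struct frame *frame_capture(struct frame *f)
{
  if (f == NULL || f->on_heap)
    return f;
  if (f->forward != NULL)
    return f->forward;
  struct frame *parent = frame_capture(f->parent);
  if (f->parent != NULL && parent == NULL)
    return NULL;
  struct frame *copy = malloc(sizeof *copy);
  if (copy == NULL)
    return NULL;
  *copy = *f;
  copy->on_heap = 1;
  copy->parent = parent;
  copy->forward = NULL;
  f->forward = copy;
  return copy;
}
```

   In this sketch the caller frees heap frames by hand; in a real
   run-time system they would presumably be reclaimed by the garbage
   collector, and the running program would switch to the heap copy
   after a capture so that later updates are shared.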

   The instruction set is intended to be "extensible" in the sense that
   during either interpretation or translation of VM programs, a
   statically recognizable application of a known primitive procedure
   may be treated specially.

   The instruction set is also intended to permit a "pretty fast" 
   direct interpreter for it.

   The instruction set is also intended to permit a "pretty good"
   naive JIT.

   The instruction set is also intended to permit a flow-graph,
   suitable for optimizing compilation, to be easily extractable from
   a VM program.

   Finally, the instruction set is intended to be both a very simple
   target for naive translation of Scheme programs, and a target that
   rewards certain kinds of optimization during translation of Scheme
   programs (e.g., lambda-lifting).


3) Implement a simple Scheme Compiler Targeting the VM

   The plan is to be able to run this compiler on another Scheme
   implementation, and begin to be able to load and run compiled code
   on the Pika VM.


4) Write a simple Scheme->C compiler Targeting the Run Time System

   The plan here is to be able to complete Pika to the point of being
   a stand-alone R5RS by filling in missing parts with Scheme code, 
   compiling them to C using another system, and linking up the full
   R5RS Pika.

   At this point, Pika should be able to host both the simple compiler
   from (3) and the compiler from (4).


5) Add the High-level Module System

   There is quite a lot of activity going on these days among Scheme
   implementors that _may_ wind up producing a portable standard for
   a high-level module system which supports separate compilation.
   Therefore, rather than designing one eagerly, I'm postponing that
   work until it's time to implement it.


6) Start adding features and writing applications.

   My personal goal is to use Pika to work on a sort-of Emacs-like
   application framework (a description of which I'll leave for some
   other message).


A few guiding philosophies that come to mind are:

  ~ interactive use (hacking a running program, emacs-style)
    should be very well supported

  ~ good debugging support for interpreted code is critical

  ~ the Pika extended Scheme should _not_ be restricted to
    facilitate compilation -- rather, it should contain subsets
    to which a compiler might be restricted.

    For example, Pika will have (almost) MIT-Scheme-like first
    class environments -- but I don't expect compilers to be able
    to compile programs that use them.


  ~ Unicode support will not be an afterthought.

  ~ Unicode support will not be "special cased".  For example, some
    systems restrict themselves to UTF-8 processing, restrict
    themselves to a subset of valid Unicode character sequences, or
    restrict themselves to particular canonicalization forms.  In
    contrast, Pika will be an environment in which all of the text
    processes defined in, for example, the Unicode Consortium's
    technical report series, will find convenient and natural
    expression in terms of Pika's character-like types and string-like
    types.


-t



