pika-dev
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Pika-dev] Problem with Pika's macro expander?


From: Tom Lord
Subject: Re: [Pika-dev] Problem with Pika's macro expander?
Date: Tue, 24 Feb 2004 10:03:37 -0800 (PST)



    > From: Matthew Dempsky <address@hidden>

    > I've been working on Pika's macro expander (as specified in Tom's
    > notes) over the past few days, and I think there may be an issue in
    > how it handles hygiene in macros similar to the R5RS definition of
    > LETREC [....]

For the record: you're looking at the DEFINE-SYTNAX definition of
LETREC given in R5RS ("Derived Expression Types").


    > LETREC first generates a list of temporary identifiers [....]

    > However, the hygiene renaming step is not until later [....]

    > But realize, at this point we have a list of bindings of TEMP in the
    > expander's environment -- in the reference implementation I'm working
    > on, BINDING-OF returns the actual binding (thus equivalent to the
    > level of EQ? while Tom only specifies to the level of EQV?), so once
    > whatever form is responsible for doing the renaming is reached (in my
    > case LAMBDA), it has a list of EQ? bindings for the formals so there's
    > no way for it to correctly replace the bindings with symbols in the
    > rest of the form.

Your diagnosis (that hygiene renaming takes place too late) is right.
I think that the EQ?/EQV?-ness question is a red herring.

    > Maybe if I added some sort of BINDING-REFERENCE type that BINDING-OF
    > actually returns (by creating a new one refering to the actualy
    > BINDING), and then these would be non-EQ? and differentiable and the
    > exposed function EQV? would check if the references refer to EQ?
    > bindings... this just seems like an `icky' solution.

No, I think that's close to right.   You don't need to change what
BINDING-OF returns but you do want to create a new HYGENIC-BINDING-OF
that returns a different (and newly constructed) binding.

Let me just make sure I understand correctly the issue.  Let's suppose
we want to (using the R5RS definition but Pika's BINDING-OF trick)
expand this:

   (letrec ((var1 init1)
            (var2 init2))
       body)

We'll go through the steps:

   =>

   (letrec "gtn"
     (var1 var2)                        ; list of variables bound by letrec
     ()                                 ; temp vars for each of those
     ((var1 init1) (var2 init2))        ; the actual init forms
     body)                              ; the body

   =>

  (letrec "gtn"
    (var2)
    ([newtmp])                          ; one new temp var
    ((var1 init1) (var2 init2))
    body)
  ;
  ; By [newtmp] I mean that the expansion contains a binding object
  ; whose name is newtmp.   It reifies the binding of newtmp in the
  ; context of the definition of LETREC

  =>

  (letrec "gtn"
    ()
    ([newtmp] [newtmp])                 ; both temp names created
    ((var1 init1) (var2 init2))
    body)

  =>

  (let ((var1 <undefined>)              ; the "reduction to LET" step
        (var2 <undefined>))
    (let (([newtmp] init1)
          ([newtmp] init2))
      (set! var1 [newtmp])
      (set! var2 [newtmp])
      body))

And, OOPS, since we used [newtmp] twice, both VAR1 and VAR2 will be 
given the value of INIT2.

This happens because this pattern/template from the definition of LETREC:

        ((letrec "gtn" 
           (x y ...)
           (temp ...)
           ((var1 init1) ...)
           body ...)

         (letrec "gtn"
           (y ...)
           (newtemp temp ...)
           ((var1 init1) ...)
           body ...))

expands (in essense -- I've oversimplified it) to this transformer:

        (let ((y...         (cdr (list-ref exp 2)))
              (temp...      (cdr (list-ref exp 3)))
              (var1-init... (list-ref exp 4))
              (body...      (cddddr (cdr exp)))
              (_newtemp_    (binding-of newtemp)))

          `(letrec "gtn" 
              ,y... 
              ,temp...
              ,var1-init...
              ,@ body...))

and, since `newtmp' is not closed over anywhere in the definition of
the transformer, the expression:

        (binding-of newtemp)

has an EQV?-sense constant value (the top-level binding of NEWTMP in
effect at the point of the DEFINE-SYNTAX defining LETREC).

Later on, this subexpression of the expansion of LETREC:


    (let (([newtmp] init1)
          ([newtmp] init2))
      (set! var1 [newtmp])
      (set! var2 [newtmp])
      body)

expands (in effect) to:

    ((lambda ([newtmp] [newtmp]) 
       (set! var1 [newtmp])
       (set! var2 [newtmp])
       body)
     init1
     init2)

which, after hygiene-renaming, comes out to:

    ((lambda (GENSYM:1 GENSYM:2) 
       (set! var1 GENSYM:2)
       (set! var2 GENSYM:2)
       body)
     init1
     init2)


If that LET expression were slightly different, say:

    (let (([newtmp-a] init1)
          ([newtmp-b] init2))
      (set! var1 [newtmp-a])
      (set! var2 [newtmp-b])
      body)

it would expand to:

    ((lambda ([newtmp-a] [newtmp-b]) 
       (set! var1 [newtmp-a])
       (set! var2 [newtmp-b])
       body)
     init1
     init2)

which would get hygiene-renamed to:

    ((lambda (GENSYM:1 GENSYM:2) 
       (set! var1 GENSYM:1)
       (set! var2 GENSYM:2)
       body)
     init1
     init2)

and we'd be perfectly happy.

I think we can solve this by performing the hygiene-renaming early --
in the transformer where the possibly-binding-use of NEWTMP is
created.

Currently, the notes specify a binding:

     +----------------------------------------+--------+
     |  native      |   import      |   uses  |  name  |
     |  definition  |   definition  |   list  |   |    |
     +-----|---------------|-------------|----+---|----+
           |               |             |        V
           V               V             |       a symbol
      a locale          a binding        V       or alpha-id
      or transformer    or #f           a list
      or symbol                         of bindings
      or alpha id
      or #f          

I suggest changing that to:

     +----------------------------------------+--------+
     |  native      |   import      |   uses  |  name  |
     |  definition  |   definition  |   list  |   |    |
     +-----|---------------|-------------|----+---|----+
           |               |             |        V
           V               V             |       a symbol
       a locale           ...            V       or alpha-id
       or transformer                   ...      or pair of symbols
       or symbol                                 ^^^^^^^^^^^^^^^^^^
       or alpha id    
       or #f
       or a binding
       ^^^^^^^^^^^^

and add:

   + (hygenic-binding-of id)    ; syntax

     (define-syntax hygenic-binding-of
       (syntax-rules ()
         ((hygenic-binding-of id)
          (let ((b1 (binding-of id))
                (b2 (make-binding (cons id (generate-unique-identifier id)))))
            (set-binding-native-definition! b2 b1)
            b2))))


So that the pattern/template from LETREC

        ((letrec "gtn" 
           (x y ...)
           (temp ...)
           ((var1 init1) ...)
           body ...)

         (letrec "gtn"
           (y ...)
           (newtemp temp ...)
           ((var1 init1) ...)
           body ...))

can be translated to a transformer as:

        (let ((y...         (cdr (list-ref exp 2)))
              (temp...      (cdr (list-ref exp 3)))
              (var1-init... (list-ref exp 4))
              (body...      (cddddr (cdr exp)))
              (_newtemp_    (hygenic-binding-of newtemp)))

          `(letrec "gtn" 
              ,y... 
              ,temp...
              ,var1-init...
              ,@ body...))


At expansion time, we can now encounter bindings like this one:

     returned by hygenic-binding-of:
     +----------------------------------------+--------+
     |  native      |   import      |   uses  |  name  |
     |  definition  |   definition  |   list  |   |    |
     +-----|---------------|-------------|----+---|----+
           |               |             |        V
           |               V             | (newtemp . gensym-newtemp-1)
           |               #f            V
           |                             ()
           |
           V
        value of
        (binding-of newtemp)
        from the transformer


If we encounter a free use of such a binding, then recursively treat 
it as the binding stored as the native-definition of that binding.

If we encounter a binding use of such a binding, then replace the
binding occurence with the CDR of the name (GENSYM-NEWTEMP-1 in this
case) and replace nested occurences with the synthetic identifier:


     synthetic identifier
     +----------------------------------------+--------+
     |  native      |   import      |   uses  |  name  |
     |  definition  |   definition  |   list  |   |    |
     +-----|---------------|-------------|----+---|----+
           |               |             |        V
           |               V             |    gensym-newtemp-1
           |               #f            V
           |                             ()
           |
           V
        newtemp

Just as currently stated: If we encounter a variable use of the
synthetic identifier, treat that as a use of the name
(GENSYM-NEWTEMP-1), if we enounter a literal use, use the native
definition (NEWTEMP).


In effect, our LETREC example will be reduced to LET's like this:

  (let ((var1 <undefined>)
        (var2 <undefined>))
    (let (([[newtmp] . gensym-newtmp-1] init1)  ; hygenic-binding-of
          ([[newtmp] . gensym-newtmp-2] init2))
      (set! var1 [[newtmp] . gensym-newtmp-1])
      (set! var2 [[newtmp] . gensym-newtmp-2])
      body))

the inner let will expand first to:

    ((lambda (gensym-newtmp-1 gensym-newtmp-2)
       (set! var1 [newtemp . gensym-newtmp-1])   ; synthetic identifier
       (set! var1 [newtemp . gensym-newtmp-2])
       body)
     init1
     init2)

and finally to:

    ((lambda (gensym-newtmp-1 gensym-newtmp-2)
       (set! var1 gensym-newtmp-1)
       (set! var1 gensym-newtmp-2)
       body)
     init1
     init2)



    > Also, to correctly implement LET-SYNTAX and to generate alpha-ids, at
    > the very least we need to maintain a list of how repeated identifiers
    > are in the lexical environment so we can rename symbols to alpha-ids
    > (and probably need to just construct and maintain an entire
    > environment at expand-time, but I'll raise those issues later -- class
    > is almost over. :-)

Ok.

-t




reply via email to

[Prev in Thread] Current Thread [Next in Thread]