mit-scheme-devel

Re: [MIT-Scheme-devel] Keywords


From: Joe Marshall
Subject: Re: [MIT-Scheme-devel] Keywords
Date: Fri, 19 Mar 2010 12:03:35 -0700

>> From: Taylor R Campbell <address@hidden>
>> Date: Tue, 16 Mar 2010 02:23:51 -0400
>>
>> What advantage does a disjoint data type have over writing (foo
>> 'bar: baz 'quux: zot)?  [It] strikes me as needless complication to
>> the language.

I agreed with you for very many years, but a few years back I changed
my mind.  I'll try to show you why.

>> Using keyword objects rather than (non-keyword) symbols as the
>> arguments to keyword parameters makes sense in Common Lisp only
>> because of its package system.

It certainly arose that way.

The package system started out as a hack
to allow different unrelated Lisp applications to share the same image
without interfering with each other.  The problem is that symbols are
heavily overloaded in old lisp systems.  In the earliest lisp, a
symbol had an associated `property list' which stored everything you
needed to know about a symbol.  This even included the symbol's
`value' property and possibly its `subr' property (function cell).
In this model, a symbol is actually a named set of mappings.  The name
itself had no ontological significance (it was stored as the `pname'
property); it was just a way of getting ahold of the mapping set.

Two different systems would likely make different uses of the
properties of a symbol, so trying to load two different systems into
the same image wouldn't work.  The root cause of this problem is very
tricky to understand, and I'll get to it in a moment, but the
`obvious' problem appears to be one of aliasing:  both System `A' and
System `B' refer to symbol `foo', but they aren't referring to the
*same thing*.  The obvious solution is to give each system a private
symbol.  Symbols were interned in the obarray, so a hack that
allowed you to use multiple obarrays and switch between them was
created.  The `multiple obarray hack' was the precursor to the
`package system'.
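
To make the collision concrete, here is a minimal sketch in Common Lisp.
The package names SYSTEM-A and SYSTEM-B and the property name KIND are
made up for illustration.  It shows two `systems' clobbering each other
through one shared symbol, and then the same two systems kept apart by
interning the name in separate packages (the modern form of the
multiple obarray hack).

```lisp
;; Hypothetical package names standing in for two unrelated systems.
(defpackage :system-a (:use))
(defpackage :system-b (:use))

(defun shared-symbol-collision ()
  "Both systems keep a KIND property on the one shared symbol FOO."
  (setf (get 'foo 'kind) "widget")   ; System A's bookkeeping
  (setf (get 'foo 'kind) "gadget")   ; System B overwrites it
  (get 'foo 'kind))                  ; System A now reads B's data

(defun private-symbols-no-collision ()
  "Each system interns its own FOO, so the property lists are separate."
  (let ((a-foo (intern "FOO" :system-a))
        (b-foo (intern "FOO" :system-b)))
    (setf (get a-foo 'kind) "widget")
    (setf (get b-foo 'kind) "gadget")
    (list (eq a-foo b-foo) (get a-foo 'kind) (get b-foo 'kind))))

;; (shared-symbol-collision)       ; => "gadget"
;; (private-symbols-no-collision)  ; => (NIL "widget" "gadget")
```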

Although the value and function cells of symbols are firmly
entrenched, there has been a shift away from the practice of using
property lists.  When a symbol is used only as a key in a table, there
is less need for collision avoidance.  On the other hand, there is an
immediate problem that the symbol 'foo' in package A is *not* the
symbol 'foo' in package B, even though they look the same.  The
keyword package comes to the rescue.  Keywords are good as `symbolic
keys' because they *don't* have values, functions, or properties.  You
can use them freely without worrying about collisions.
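
A small sketch of the `looks the same but isn't' problem, again with
made-up package names (DEMO-A, DEMO-B).  A table keyed by one package's
COLOR symbol cannot be searched with the other package's COLOR, while a
keyword key is the same object no matter who writes it.

```lisp
;; Hypothetical packages; neither uses any other package.
(defpackage :demo-a (:use))
(defpackage :demo-b (:use))

(defun symbol-key-lookup ()
  "Store under DEMO-A::COLOR, then look up with DEMO-B::COLOR."
  (let ((table (make-hash-table :test #'eq)))
    (setf (gethash (intern "COLOR" :demo-a) table) "red")
    ;; Same print name, different symbol: the lookup misses.
    (gethash (intern "COLOR" :demo-b) table)))

(defun keyword-key-lookup ()
  "Store under :COLOR, then look up with the keyword named COLOR."
  (let ((table (make-hash-table :test #'eq)))
    (setf (gethash :color table) "red")
    ;; Interning in the KEYWORD package yields the very same object.
    (gethash (intern "COLOR" :keyword) table)))

;; (symbol-key-lookup)   ; primary value NIL
;; (keyword-key-lookup)  ; primary value "red"
```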

So keywords arose as a solution to a problem that was an unintended
effect of a solution to an unrelated problem.

-----

Earlier I mentioned that the root cause of the problem was tricky to
understand.  The cause isn't really one of aliasing unrelated symbols;
the cause is a misunderstanding of naming.  (Unfortunately, I haven't
found a good authority on the issues of naming, so I'm going to fly by
the seat of my pants here.)

Abstractly, we want to be able to refer to objects by names.  We do
this by establishing a `context' where names are associated with
values.  The minimal requirement is that the `name' can be an
arbitrary sort of thing (like the word "foo") and that the `context'
is what provides the mapping.  We also desire that the mapping is
relatively static unless told otherwise.

We do this *all the time* in computer science.

The root cause of `symbol collision' is that early lisps were
confused.  A `symbol' in these lisps is not a name at all, but a very
complex sort of beast.  At a very early phase in interpreting a
program, a symbolic token (like what a user types) is mapped to a
`symbol object' via the obarray.  Further interaction with the symbol
is mediated through the `property list'.  A picture of this would sort
of look like this:


                      obarray               plist
    symbolic token ------------>   Symbol ---X----> Value
                                             ^
                                             |
                                          Symbol

So a symbol *itself* is really a function from another symbol to a
value.  This part of the picture:


                    plist
                -----X----> Value
                     ^
                     |
                   Symbol

But that is a terrible idea.  What you *want* is the *other* half of
the picture:

                    obarray
  symbolic token ------------>  Symbol >---


And the `symbolic token -> Symbol' mapping via the obarray can be
considered an implementation detail of the reader.  We just want this
part:

               Symbol >----

which, when we later supply a context becomes

                        context
               Symbol >---------> Value

The keyword package in Common Lisp accomplishes this by essentially
erasing the components of a symbol that are causing the problems and
using the remaining shell as a stand-in for the conceptual symbol.
(This last step is a beautiful kludge.  The original idea of a symbol
had all these bells and whistles, but if we plug these holes and slap
on some Bondo and paint it, it'll look enough like a platonic solid to
pass.)

Ok, now back to Scheme.

Scheme programmers have long recognized that symbols really are just
names that can be associated with any sort of object whatsoever by
simply making a context to perform the association.  Hash tables and
alists are specifically provided for this.
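
As a rough illustration of `the context provides the mapping', here is
a sketch with alists.  It is written in Common Lisp only to match the
macro later in this message; the Scheme version with assq is analogous.
The names LOOKUP, *CONTEXT-1* and *CONTEXT-2* are invented for the
example.

```lisp
(defun lookup (name context)
  "Return the value NAME is associated with in CONTEXT (an alist), or NIL."
  (cdr (assoc name context)))

;; Two independent contexts that both give a meaning to the name FOO.
(defparameter *context-1* '((foo . 1) (bar . 2)))
(defparameter *context-2* '((foo . "something else")))

;; (lookup 'foo *context-1*)  ; => 1
;; (lookup 'foo *context-2*)  ; => "something else"
;; (lookup 'bar *context-2*)  ; => NIL
```

The symbol FOO carries no value of its own here; each alist decides
what it means.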

Many have asked, ``What's wrong with just quoting symbols?''  I'll
agree that there is nothing that can be accomplished with keywords
that cannot be accomplished with the right amount of quoting.  And
aren't symbols *supposed* to be the thing you use to name random
things?  Isn't that their raison d'etre?  Well, yes, that is true.

But to see why it is worth having keywords, you need to look at
situations where the behavior of a keyword is different from the
behavior of the equivalent symbol.  That involves meta-programming.
A keyword is self-evaluating *and* it has a readable syntax.  You can
use keyword literals in a program.  A symbol, on the other hand, is
interpreted as an identifier in a program.  You *cannot* embed a
symbol literal in a program *except* by using QUOTE.  QUOTE is a
meta-operation that tells the evaluator to not act on the list
structure embedded in the code.
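
The difference is easy to see by handing forms to EVAL directly.  This
is a sketch in Common Lisp; EVAL-OUTCOME and the symbol
UNDEFINED-DEMO-NAME are made up for the example, and I am assuming the
implementation signals UNBOUND-VARIABLE for an unbound symbol, as the
common ones do.

```lisp
(defun eval-outcome (form)
  "Evaluate FORM; return :UNBOUND if it names a variable with no value."
  (handler-case (eval form)
    (unbound-variable () :unbound)))

;; A keyword evaluates to itself, so it can appear bare in code.
;; (eval-outcome :color)                ; => :COLOR

;; A bare symbol is treated as an identifier and looked up.
;; (eval-outcome 'undefined-demo-name)  ; => :UNBOUND

;; To get the symbol itself, the form handed to EVAL must contain QUOTE.
;; (eval-outcome ''color)               ; => COLOR
```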

Let's imagine for a moment a version of Scheme where literal
numbers were not self-evaluating but instead were evaluated by calling
UNHASH on the numeric value.  So you could refer to previous typeout
directly by number.  (A nasty thought, but imagine.)

=> "foo"
; Value 1: "foo"

=> "bar"
; Value 2: "bar"

=> (string-append 1 2)
; Value 3: "foobar"

=> (string-length 3)
; Value 4: 6

Now you'd have to quote numbers if you wanted to use them to really
mean themselves:

=> (+ 4 '4)
; Value 5: 10
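
For the curious, the imaginary evaluator can be mocked up in a few
lines.  This is only a toy sketch in Common Lisp, not anything MIT
Scheme does: the history vector, TOY-EVAL, TOY-REPL and the little
primitive table are all invented here, and only the forms used in the
transcript above are supported.

```lisp
(defvar *history* (make-array 0 :adjustable t :fill-pointer 0)
  "Values typed out so far; value N lives at index N-1.")

(defun unhash (n)
  "Return the Nth value previously typed out (1-based)."
  (aref *history* (1- n)))

(defparameter *primitives*
  (list (cons 'string-append
              (lambda (&rest strings) (apply #'concatenate 'string strings)))
        (cons 'string-length #'length)
        (cons '+ #'+))
  "The handful of procedures the toy evaluator knows about.")

(defun toy-eval (form)
  (cond ((integerp form) (unhash form))            ; numbers are NOT literals
        ((stringp form) form)                      ; strings still are
        ((and (consp form) (eq (first form) 'quote))
         (second form))                            ; QUOTE protects a number
        ((consp form)
         (apply (cdr (assoc (first form) *primitives*))
                (mapcar #'toy-eval (rest form))))
        (t (error "Toy evaluator cannot handle ~S" form))))

(defun toy-repl (forms)
  "Evaluate FORMS in order, recording each value; return all the values."
  (setf (fill-pointer *history*) 0)
  (dolist (form forms)
    (vector-push-extend (toy-eval form) *history*))
  (coerce *history* 'list))

;; (toy-repl '("foo" "bar" (string-append 1 2) (string-length 3) (+ 4 '4)))
;; => ("foo" "bar" "foobar" 6 10)
```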

Now we turn to metaprogramming.  We want to write a program that
writes a program that manipulates numbers.  If we want to use a
numeric literal, we have to remember to put the correct amount of
quoting around it.  This is because the QUOTE we use in our
meta-program protects us from the meta-level evaluation, but not the
target-level evaluation.
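
The same effect shows up with symbols in ordinary Lisp.  Here is a
minimal sketch of a program that writes a program (Common Lisp; the
function names are invented for the example).  The generator that
forgets the target-level quote works for a keyword but emits a variable
reference for a symbol.

```lisp
(defun make-tagger-form (tag)
  "Return source for a function that pairs TAG with its argument.
TAG is spliced in bare, with no quoting for the target level."
  `(lambda (x) (list ,tag x)))

(defun make-quoted-tagger-form (tag)
  "As above, but TAG is quoted for the target level as well."
  `(lambda (x) (list ',tag x)))

;; With a keyword, the bare splice is fine:
;; (make-tagger-form :yes)  ; => (LAMBDA (X) (LIST :YES X))
;; (funcall (eval (make-tagger-form :yes)) 1)  ; => (:YES 1)

;; With a symbol, the bare splice emits a reference to a variable YES:
;; (make-tagger-form 'yes)  ; => (LAMBDA (X) (LIST YES X))

;; so the generator has to add a second level of quoting:
;; (make-quoted-tagger-form 'yes)  ; => (LAMBDA (X) (LIST 'YES X))
;; (funcall (eval (make-quoted-tagger-form 'yes)) 1)  ; => (YES 1)
```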

Here is a real-world example of a very hairy macro.  I am defining a
`workspace-predicate' which returns a list of workspaces that satisfy
specified criteria.  At the macro call site, I invoke

(cmctl-ro-get-workspace-predicate :returns (:ws-id :user :path)
                                  :product product-var
                                  :user user-var
                                  :vpb-changes vpb-changes-p-var ;; ****
                                  :description-search desc-var)

The line marked with stars causes this code to be invoked:
    (trinary-query vpb-changes? vpb-changes vpb-changes-p)

Which causes this code to be emitted in the final macro expansion:
    (LET ((VPB-CHANGES VPB-CHANGES-P-VAR)
          (#:NOT-RESULT (NOT VPB-CHANGES-P)))
       (ECASE VPB-CHANGES
         ((NIL) T)
         (:YES (NOT #:NOT-RESULT))
         (:NO #:NOT-RESULT)))

The definition of the `trinary-query' submacro is this:
    (macrolet ((trinary-query (query-arg? query-arg query-result-var)
                 (let ((not-result-var '#:not-result))
                   `(when ,query-arg?
                      `((let ((,',query-arg ,,query-arg)
                              (,',not-result-var (not ,',query-result-var)))
                          (ecase ,',query-arg
                            ((nil) t)           ; don't care
                            (:yes (not ,',not-result-var))
                            (:no ,',not-result-var))))))))

Notice in particular the first binding form in the innermost let
binding:
    (let ((,',query-arg ,,query-arg)

This establishes a binding of a variable whose name and value are
taken across meta-levels and whose original symbolic name is supplied
two meta-levels up.  The weird ,', and ,, prefixes control the level
of quoting that is used.  Each level of expansion causes a level of
quoting to be `stripped'.

Now notice the two ecase clauses:
     (:yes (not ,',not-result-var))
     (:no ,',not-result-var))))))))

The tokens `:yes' and `:no' don't need quoting or unquoting or any
careful attention to what level of expansion they will be quoted at.
They are just literal symbolic tokens that get no special treatment by
the evaluator just because they are symbols.
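
If the real macro is too hairy to follow, here is a much smaller
macro-writing macro with the same shape.  It is a sketch in Common
Lisp; DEFINE-TAGGED-GETTER, GET-IT and V are names invented for this
example and have nothing to do with the workspace code.  It uses ,',
to carry a variable name down two levels, ,, to carry a value form down
one level, and a keyword that needs neither.

```lisp
(defmacro define-tagged-getter (macro-name var value-form)
  "Define a macro MACRO-NAME whose expansion binds VAR to the value of
VALUE-FORM (computed when MACRO-NAME is expanded) and returns a tagged list."
  `(defmacro ,macro-name ()
     `(let ((,',var ,,value-form))     ; name: two levels; value: one level
        (list :value ,',var))))        ; :value needs no quoting at any level

(define-tagged-getter get-it v (+ 1 2))

;; The call above defines, in effect:
;;   (defmacro get-it ()
;;     `(let ((,'v ,(+ 1 2))) (list :value ,'v)))
;;
;; (macroexpand-1 '(get-it))  ; => (LET ((V 3)) (LIST :VALUE V))
;; (get-it)                   ; => (:VALUE 3)
```

Had the tag been an ordinary symbol rather than :value, the template
would have needed its own quoting to survive into the final expansion
as a literal.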

As I said before, one could do without keywords and just use symbols,
and then you'd have to think a bit harder about which ones to quote
and unquote at which level, but you could do it.  It is just a
convenience.  But then, LET is simply a trivial convenience over using
LAMBDA.

>> What advantage does a disjoint data type have...

Actually, being disjoint is irrelevant.  I don't care if keywords
count as true under symbol?.  What is important is that they don't
count as identifiers under EVAL or macro expansion.
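
Common Lisp happens to be an example of exactly this arrangement, as
the following sketch shows (the helper names are made up).  Keywords
there are ordinary symbols as far as SYMBOLP is concerned, yet the
evaluator treats them as constants rather than variable references.

```lisp
(defun symbol-like-p (object)
  "True if OBJECT answers true to SYMBOLP, keyword or not."
  (symbolp object))

(defun identifier-like-p (object)
  "True if OBJECT is a symbol the evaluator would look up as a variable,
that is, a symbol that is not a constant form."
  (and (symbolp object) (not (constantp object))))

;; (symbol-like-p :yes)      ; true:  keywords are not a disjoint type
;; (symbol-like-p 'yes)      ; true
;; (identifier-like-p :yes)  ; false: a keyword is a constant form
;; (identifier-like-p 'yes)  ; true:  a plain symbol names a variable
;; (symbol-value :yes)       ; => :YES
```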



