Re: [tern] more notes -- call this RFC two?

From: david nicol
Subject: Re: [tern] more notes -- call this RFC two?
Date: 28 Nov 2002 07:03:49 -0600

Good morning!

On Thu, 2002-11-28 at 04:24, Luke Palmer wrote:

> I think we should have a clearer identifier than just "means"
> somewhere in the middle of a line.  It would make parsing easier, and
> it would certainly be easier to read.  Perhaps:
>   macro PAT means PAT
> Then uninitialized people :)  will know it's a macro and what it, er,
> means. 

Nothing ever has to parse TERN except the TERN parser.  Since the
TERN parser gets loaded with nothing more than blocking rules (how
the input nests, what matches) and is designed to identify single
keywords and treat them differently, a single keyword is sufficient.

Embedded TERN will look sufficiently unlike the host language that
there won't be any confusion.

> > in "means language" the text to the left of the "means" is the
> > pattern and the text to the right is the replacement text.
> Will the pattern be like a Prolog unification pattern, or something
> simpler than that? 

(googles for Prolog unification patterns)
Simpler, because we're not building a knowledge set -- or are we?  More
complex because we have to keep track of tagged symbols, and then we
have to include those tags in the list of rules to check.  I really
don't know Prolog.

> > Each expression is run through all available pattern matches in the
> > order they have been defined, repeatedly, until nothing matches any
> > more.  The set of patterns to try is generated intelligently by
> > listing patterns by their triggering keywords, that's a sensible
> > optimization -- checking the while pattern against an expression that
> > does not contain the keyword "while" is a waste of time.
> I say this is too specific to specify now.  Design an algorithm that
> will handle that inherently  (I've got one cookin' in my brain).

Super!  Given the whole tension between concrete and abstract, I think
it's good to specify things as soon as you have a clear idea of how
they work.  We're in a "notes" phase, even if Savannah is going to
archive this discussion until the end of time --  I agree, the
implementation details are best left for the later chapters of the final
manual, and we tell the beginners that we have a magically efficient
way to check all the rules all the time.
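
For the notes, here is roughly what "run every applicable rule, in
definition order, until nothing matches" could look like.  This is only a
sketch in Python, not a TERN design: the rule table, the regex patterns
and the two example rules (unless, until) are all made up for
illustration, and the keyword check is the naive version of the
optimization described in the quoted paragraph.

```python
import re

# Hypothetical rule list, kept in definition order.
# Each entry is (trigger keyword, compiled pattern, replacement text).
RULES = []

def add_rule(keyword, pattern, replacement):
    RULES.append((keyword, re.compile(pattern), replacement))

def rewrite(expr, max_passes=100):
    """Apply rules in definition order, repeatedly, until nothing matches."""
    for _ in range(max_passes):
        words = set(re.findall(r"\w+", expr))
        changed = False
        for keyword, pattern, replacement in RULES:
            # Skip rules whose trigger keyword was not in the expression
            # at the start of this pass.
            if keyword not in words:
                continue
            new_expr = pattern.sub(replacement, expr, count=1)
            if new_expr != expr:
                expr, changed = new_expr, True
        if not changed:
            return expr
    raise RuntimeError("rules did not reach a fixed point")

# Two illustrative rules, not part of any real rule set.
add_rule("unless", r"\bunless \((.*?)\)", r"if (!(\1))")
add_rule("until", r"\buntil \((.*?)\)", r"while (!(\1))")
```

Calling rewrite("until (done) { unless (quiet) { say } }") takes one pass
to rewrite both keywords and a second pass to confirm nothing is left.
The pass limit is there because nothing stops a rule set from looping
forever.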

> > I think it's pretty clear that we need to name types explicitly to
> > prevent confusion between what is supposed to be TERN rewrite syntax and
> > what is supposed to get matched.  So variable elements on the left
> > hand side are (type, name) pairs, instead of having their types implied.
> And can you possibly expect to parse this?
>     my var $var = new type Type

I'm not sure what language that's supposed to be in --- setting up
a preliminary set of rules for what is a VARIABLE is a project; at this
level we're discussing what the language the rules are in looks like.

> You said you managed to build a hierarchical list of syntax?  How about
> something like:
>     =[ NAME ; ->[ TYPE ; 'new' ] ]
> That has the advantage of matching both:
>     my $var = new Foo;
> and
>     my $var = Foo->new;
> which, according to Perl, are the same thing.
> It has the disadvantage of being ugly.

At least two possibilities here:

        1: both cases need to get matched

        2: because we're interpreting all the way to compilation,
           equivalent expressions will get rewritten and matched
           after they're rewritten.  The patterns are subject to
           all the rules in effect at the time they are considered,
           so if we have

        BAREWORD one BAREWORD two means two->one ;

           as part of the assumptions of the environment, for the
           TERN implementation of Perl (the compiler) they will both
           get normalized to where the later rule can recognize them.
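
A sketch of possibility 2, in Python rather than TERN, with every name
invented for the example: one normalizing rewrite turns the indirect
form into the arrow form, so the later rule only needs to know one
shape.  I have restricted the normalization to the word "new" to keep it
short; the general BAREWORD BAREWORD rule would need real tokenizing.

```python
import re

# Assumed normalization, modeled on "BAREWORD one BAREWORD two means two->one"
# but limited to the method name "new" for this sketch.
INDIRECT_NEW = re.compile(r"\bnew\s+([A-Z]\w*)")

# A later rule that only recognizes the arrow form.
CONSTRUCTOR = re.compile(r"my\s+(\$\w+)\s*=\s*(\w+)->new")

def normalize(src):
    """Rewrite 'new Foo' into 'Foo->new'; leave everything else alone."""
    return INDIRECT_NEW.sub(r"\1->new", src)

def match_constructor(src):
    """Return (variable, class) if src declares a variable via a constructor."""
    m = CONSTRUCTOR.search(normalize(src))
    return (m.group(1), m.group(2)) if m else None
```

Both match_constructor("my $var = new Foo;") and
match_constructor("my $var = Foo->new;") come back as ("$var", "Foo").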

Context is another dimension that needs to appear. Some patterns
only are available in certain contexts.  Let's reserve CONTEXT as a
TERN primitive for terms in patterns that specify when rules are
valid.  If the TERN primitives are all shouts, we could have MEANS and
CONTEXT and the first thing I'd do is declare

                means MEANS MEANS;

to bring the noise down.

The question then arises of what the parsed language's universal object
will be called. OBJECT is good, or we could enshrine the perl
documentation and have THINGY.  Or compromise and have OBJECTTHINGY.

How to recognize a simple variable in Perl:

        $ BAREWORD b means b-isa-SCALAR 
        @ BAREWORD b means b-isa-ARRAY
        % BAREWORD b means b-isa-HASH
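
To make those three rules concrete, a small Python sketch (the function
name and the output shape are assumptions of mine, and the regex is far
too naive for real Perl, where $h{a} is an element of %h):

```python
import re

# Sigil -> tag, following the three rules above.
SIGIL_TAGS = {"$": "SCALAR", "@": "ARRAY", "%": "HASH"}

# A sigil immediately followed by a bareword.
VARIABLE = re.compile(r"([$@%])([A-Za-z_]\w*)")

def tag_variables(src):
    """Return (bareword, tagged name) pairs for each simple variable found."""
    return [
        (name, f"{name}-isa-{SIGIL_TAGS[sigil]}")
        for sigil, name in VARIABLE.findall(src)
    ]
```

For "my ($count, @items, %seen);" this yields count-isa-SCALAR,
items-isa-ARRAY and seen-isa-HASH, in that order.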

Let's restrict ourselves to one rule per line, and insist on
line-oriented source code, and do away with the trailing semicolon.  That
makes a TERN rule more like a C macro definition.  It also means that
we can't embed them inside literals, which is not a problem.  Literals
go away on the first pass through the source code.

Maybe for working with Perl we could start by deparsing in order to have
the source normalized somewhat, instead of accounting for all the WTDI.
But then we might not be any more advanced than the Inline::PERL module.

> > Using dashes to append qualifiers to type names will work to chain
> > qualifications -- if things of type "OBJECT" turn out to be implemented
> > as 
> > 
> >     OBJECT means THINGY-isa-Object ;
> > 
> > or if OBJECT is the fundamental type in our OO universe, we don't care.
> > 
> >     BOX would become THINGY-isa-Object-isa-Box and some later
> > code that would trigger this rule might look like
> > 
> >     my $button = new Box $x1,$y1,$x2,$y2;
> > 
> >     draw $button;
> > 
> > The parsing pass identifies the
> > 
> >             my VARIABLE = new TYPE_NAME
> I don't understand this "tag" thing.  What do you mean?  How is
> THINGY-isa-Object different from any other identifier?

It's three tokens in TERN.  THINGY is the reserved universal class,
OBJECT is derived from that, possibly to differentiate OBJECTs from
INTRINSICs and LITERALs.  It's an open question: what is a good
class hierarchy for analyzing any (procedural programming) language?
Maybe we can defer that into rules and declarations rather than
offering a variegated bestiary up front.  I'm partial to one reserved
word ("means") and one operator (the dash) and seeing how far we can
get with those.  Oh yeah -- ran into CONTEXT a few paragraphs back.

THINGY-isa-Object is the definition of OBJECT, according to the rule

        OBJECT means THINGY-isa-Object

The conjunction "-isa-" has special meaning to TERN, which is that it
declares object-oriented class inheritance for purposes of matching
patterns.

I don't think isa is the only relationship we'll want to track.  hasa
is the other big one.

I think we can allow arbitrary absolute base classes to be defined,
with unity rules:

        THINGY means THINGY

means that THINGY is from then on valid as a tag type.  The TERN
parse tree will be able to add tags to nodes.
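
Here is one way the isa bookkeeping might go, sketched in Python.  The
table, the function names and the Box example are hypothetical; the only
things taken from the notes are the unity rule for a base class and the
dash-chained tag string.

```python
# Hypothetical tag table: each class name maps to its parent.
# A unity rule (THINGY means THINGY) is stored as a name that is its own parent.
PARENTS = {}

def declare(name, parent=None):
    """Record 'name isa parent'; with no parent, declare a base class."""
    PARENTS[name] = parent if parent is not None else name

def chain(name):
    """Expand a name into its full tag string, root first."""
    parts = [name]
    while PARENTS[parts[-1]] != parts[-1]:
        parts.append(PARENTS[parts[-1]])
    return "-isa-".join(reversed(parts))

def isa(name, ancestor):
    """True if ancestor appears anywhere in name's chain (including itself)."""
    return ancestor in chain(name).split("-isa-")

declare("THINGY")             # THINGY means THINGY
declare("Object", "THINGY")   # OBJECT means THINGY-isa-Object
declare("Box", "Object")
```

With those three declarations, chain("Box") gives
THINGY-isa-Object-isa-Box, and a pattern that asks for an Object would
accept a Box but not the other way around.  hasa would presumably want
its own table.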

> > Since we're breaking down the program this far, restoring it to Perl
> > seems silly when we could just as easily write it out as C or as NASM,
> > if we succeed in interpreting Perl code into a flat space with named
> > entry points and fully expanded code
> This is the main reason I'm actually participating in this project.
> There needs to be a Perl compiler.  There I<needs> to be.  (One that
> doesn't just copy the interpreter and the bytecode)
> Luke

And I worried that I should have deleted that conclusion since the
paragraphs I wrote above it didn't seem to be leading to it in any
direct way.

Well-defined goals are important.  OTOH, experience has demonstrated
that one often does not get where one defined one's goals to be before
starting journeys.  This leads some people to be reluctant to set goals
because explicit goalsetting seems countereffective.  These people 
keep the therapy industry humming along.

Goals include:

        framework for language design.

        self-interpreting:  the TERN inTERNals (ha ha) should be
        written in TERN before we figure how to shoehorn all those
        weird edge cases from Perl into context-sensitive rewrite rules.

        expand source code into simpler primitives

        provide a macro language for Perl

        provide early-bound multiple dispatch for Perl; or rather,
        deliver an extended perl that has that feature

        provide a framework for provision of absolutely everyone's
        little pet projects from the initial perl6 RFC burst

        by abstracting blocking syntax out of the pattern matching,
        provide a common framework in which multiple scripting languages
        (perl, python, ruby, etc.) can share objects more effectively

        use the GNU Compiler System intermediate format (which
        supposedly exists but needs to be researched) instead of writing
        to i386 or as or C or C++ or Parrot VM or ...

        By rewriting all control flow operations into conditionals and
        goto-label operations, make all back ends similar

        a way to ease translation between Python and Perl


        abstract syntactical correctness to a higher level, allowing
        reinterpretation towards sense, in ambiguous situations

        provide even later binding scripting, in which blocks that
        do not have control flow pass through them are not checked for
        syntax beyond their blocking

        provide an intermediate Perl in which all subroutines are
        rewritten into Inline C subroutines, which are then run,
        in effect providing a caching JIT

        fame, fortune, and lecture engagements for all involved
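
On the goal of rewriting all control flow into conditionals and
goto-label operations, this is the kind of flattening I have in mind,
sketched in Python.  The output syntax (label lines, "if ... goto") and
the label naming are invented for the example and are not a proposed
intermediate format.

```python
import itertools

# Counter used to make label names unique across loops.
_labels = itertools.count(1)

def lower_while(cond, body):
    """Rewrite 'while (cond) { body }' into labels, one conditional, and gotos.

    cond is the condition text; body is a list of already-lowered statements.
    """
    n = next(_labels)
    top, end = f"L{n}_top", f"L{n}_end"
    return [
        f"{top}:",
        f"if (!({cond})) goto {end}",  # leave the loop when the test fails
        *body,
        f"goto {top}",                 # otherwise go around again
        f"{end}:",
    ]
```

Something like lower_while("$i < 10", ["$i = $i + 1"]) comes out as five
flat lines: a top label, the negated test with a jump to the end label,
the body, a jump back to the top, and the end label.  Every back end
then only has to understand labels, tests and jumps.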

