[tern] greetings everyone -- rewriting, rewriting, rewriting.

From: david
Subject: [tern] greetings everyone -- rewriting, rewriting, rewriting.
Date: Thu, 21 Nov 2002 17:33:32 -0600
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20021003

I've changed the introductory message from the mailing list to be
more positive. It is better to define TERN in terms of what it may become than to define it in opposition to something, especially something peachy like the perl6 project.

It is better to define ANYTHING in terms of what it is rather than what it is not. That's
generally true.

The idea here is to develop a general purpose rewriting system that can be used to prototype procedural computer language features. Ideally, the target of the rewrites
might end up as something low-level enough that TERN could become a
competitively efficient compiler system. If this doesn't happen at first that is okay.

We (including you, since you joined the mailing list or are reading this in the archive) are developing a general purpose language prototyping environment based on the
linguistics or AI concept of "rewriting."

The TERN rewriting system will hopefully allow any procedural computer language to be described and implemented.

What is rewriting? Rewriting is when you have a statement, and you transform it through "rewrite rules" until it is different -- simpler, usually -- and then you work with the result of the rewrite rule set. It is what makes the e-mail transfer agent "sendmail"
so tricky to configure.
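
A minimal sketch of that idea, in Python purely for illustration. The two rules and the name `rewrite` are made up for this example and are not part of any TERN design; the point is only that rules are applied over and over until the statement stops changing.

```python
import re

# Hypothetical rule set: (pattern, replacement) pairs that each make
# the statement a little simpler.
RULES = [
    (re.compile(r"\((\w+) \+ 0\)"), r"\1"),   # (x + 0)  ->  x
    (re.compile(r"\((\w+) \* 1\)"), r"\1"),   # (x * 1)  ->  x
]


def rewrite(statement, rules=RULES, max_passes=100):
    """Apply every rule repeatedly until a full pass changes nothing."""
    for _ in range(max_passes):
        previous = statement
        for pattern, replacement in rules:
            statement = pattern.sub(replacement, statement)
        if statement == previous:
            return statement
    raise RuntimeError("rewrite rules did not settle")


print(rewrite("((a * 1) + 0)"))
```

Here the first pass can only fire the second rule, which exposes a match for the first rule on the next pass, so the result is reached in stages rather than in one step.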

Chomskian linguists maintain that the human mind interprets
human language through rewrite rules, eventually mapping speech to "deep structures" which represent reality. The TERN project will attempt to eventually deliver a system for experimenting with computer language features by altering rewrite rules.

To get there we need to pass several milestones, and answer several questions. Such as "what is reality?" which sounds far, far too heavy, but in following the metaphor of Chomskian theory (as alleged in the previous paragraph) we know that reality is what the deep structures map to. For TERN's purposes, that reality will be a virtual machine with a flat coding architecture, key-value mapping, and a flexible scalar data type, and stacks. Or some other small set of features. The important thing about the target
virtual machine is that the feature set is very small.
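
To make "very small" concrete, here is a sketch of one possible machine of that shape, in Python. The instruction names (`push`, `load`, `store`, `add`, `jump`, `jump_if_zero`) are assumptions chosen for this example, not a proposed TERN instruction set. Python values stand in for the flexible scalar, a dict for the key-value mapping, and a list of tuples for the flat code.

```python
def run(program):
    """Execute a flat list of (op, *args) tuples; return the key-value store."""
    stack = []    # the operand stack
    store = {}    # the key-value mapping
    pc = 0        # index into the flat code
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "push":
            stack.append(args[0])
        elif op == "load":
            stack.append(store[args[0]])
        elif op == "store":
            store[args[0]] = stack.pop()
        elif op == "add":
            right = stack.pop()
            left = stack.pop()
            stack.append(left + right)
        elif op == "jump":
            pc = args[0]
        elif op == "jump_if_zero":
            if stack.pop() == 0:
                pc = args[0]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return store


# Sum n + (n-1) + ... + 1 for n = 3, using only the primitives above.
PROGRAM = [
    ("push", 3), ("store", "n"),
    ("push", 0), ("store", "total"),
    ("load", "n"), ("jump_if_zero", 15),                        # index 4: loop test
    ("load", "total"), ("load", "n"), ("add",), ("store", "total"),
    ("load", "n"), ("push", -1), ("add",), ("store", "n"),
    ("jump", 4),
]

print(run(PROGRAM))
```

Jumping to index 15, one past the last instruction, ends the run. Because `add` just uses `+`, the same instruction concatenates strings, which is about as far as this sketch goes toward a "flexible" scalar.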

Then there are the blocking primitives, which are the syntax of the language being implemented with the rewrite rules. All the Algol languages use curly braces, except for Pascal, which uses matched "BEGIN/END" tokens because it was optimized to make it easy to grade. Python uses indentation. We need to come up with a generic way to describe blocking -- collecting strings of tokens into blocks -- or at least a standard place to plug the blocking action into the TERN process. I've been reading XML documentation for a contract I'm working on, and thinking that XML makes a perfectly good stupidest possible, internal-use-only intermediate code for parsing things into before looking at
them in any greater detail.
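
One way the blocking step could look, sketched in Python. The function names `block` and `to_xml` and the `<block>`/`<tok>` element names are invented for this example. The delimiters are parameters, which is one guess at what "a standard place to plug the blocking action in" might mean; indentation-based blocking would need a different front end.

```python
from xml.sax.saxutils import escape


def block(tokens, open_tok="{", close_tok="}"):
    """Collect a flat token list into nested lists, one list per block."""
    root = []
    stack = [root]                     # innermost open block is stack[-1]
    for tok in tokens:
        if tok == open_tok:
            child = []
            stack[-1].append(child)
            stack.append(child)
        elif tok == close_tok:
            if len(stack) == 1:
                raise SyntaxError("close token without a matching open")
            stack.pop()
        else:
            stack[-1].append(tok)
    if len(stack) != 1:
        raise SyntaxError("block opened but never closed")
    return root


def to_xml(tree):
    """Serialize the nested lists as a deliberately dumb XML string."""
    parts = []
    for node in tree:
        if isinstance(node, list):
            parts.append("<block>" + to_xml(node) + "</block>")
        else:
            parts.append("<tok>" + escape(node) + "</tok>")
    return "".join(parts)


curly = block("if x { a ; { b } }".split())
pascal = block("if x BEGIN a END".split(), "BEGIN", "END")
print(curly)
print(to_xml(pascal))
```

Both styles end up in the same nested shape, so whatever comes after blocking does not need to know which delimiters the source language used.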

Once we have grouped our input into blocks of expressions of tokens, or something else, then we are faced with the question of what these blocks instruct our virtual machine to perform. This is answered by applying rewrite rules until there is nothing left but low-level primitives (down to the target level), and then these are processed by giving them to the virtual machine that understands the language that they have been
rewritten to.
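
A sketch of "rewrite until only primitives are left", in Python. The primitive set and the two higher-level operations (`incr`, `add_to`) are hypothetical, picked only to show a rule whose output itself needs another rewrite.

```python
# Operations the (hypothetical) target machine understands directly.
PRIMITIVES = {"push", "load", "store", "add"}

# Hypothetical rewrite rules: each maps one non-primitive operation to a
# list of lower-level operations, which may themselves need rewriting.
RULES = {
    "incr": lambda name: [("add_to", name, 1)],
    "add_to": lambda name, amount: [
        ("load", name), ("push", amount), ("add",), ("store", name),
    ],
}


def lower(ops):
    """Rewrite a list of (op, *args) tuples until only primitives remain."""
    ops = list(ops)
    while True:
        out = []
        changed = False
        for op, *args in ops:
            if op in PRIMITIVES:
                out.append((op, *args))
            elif op in RULES:
                out.extend(RULES[op](*args))
                changed = True
            else:
                raise ValueError(f"no rewrite rule for {op!r}")
        if not changed:
            return out
        ops = out


print(lower([("incr", "n")]))
```

The loop only terminates if the rules never expand back into each other, so a real rule set would need some check for cycles; this sketch assumes there are none.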

It is also possible to defer understanding of what something is supposed to mean until the knowledge is required. It is also possible to tag additional meaning (such as "only ever used in numeric operations" or "this will always be a method call on an object of type socket") onto tokens and do some bindings earlier, or later, than is done with
other languages.  How to do this is still up in the air.

How This All Interacts With Perl:

Early TERN implementations will be written in Perl, as "source filters" that interpret Perl and produce an equivalent subset of Perl as a target virtual language. Slightly later implementations might produce Inlineable C blocks. Or simply long C programs.

The idea of "compiler as a set of rewrite rules" provides a flexible compiler paradigm
and TERN might become a front-end to the GCC compiler system.

Once you have a compiler that uses an external set of explicit rewrite rules rather than procedural compilation, modifying aspects of the modeled languages may be easy. Want a
new feature? Write a rewrite rule to provide it.

How We Got Here

Consider the problem of implementing coroutines in Perl. A coroutine is a special kind of subroutine that saves state within it between calls. In OO systems, you can set this kind of thing up explicitly by initializing a new object of some kind and then repeatedly calling a result-generating method of that object. This works fine and there is no problem with it, but sometimes intrepid individuals like Damian Conway or Uri Guttman would actually find it preferable to just throw "yield $x" in there instead of "return $x" and have the state of all variables local to the routine saved somehow and have execution pick up at the next statement following the "yield" the next time the routine is called.

"Action at a distance!" the critics wail, those that understand the implications at least, and they are right. Well, you only have AAAD problems if there's one stash per routine, because a routine might get tickled from two unrelated threads. So the AAAD problems can be resolved by keeping a set of stashes keyed by the source of the invocation.

The problem is, in order to get all that information in Perl, you have to wrap your routine in a closure, including an exit/entry point at each yield statement. In effect, you have to provide the initialize-and-later-generate paradigm, but hide it all. And there were some other negative ramifications of using closures that I don't recall right now, but the end result was that in order to provide a "yield" (via source filter) that works correctly, the source filter has to understand and reimplement a considerable set of blocking and
conditional statements.

So if you have to parse and rewrite to provide one feature, why not make the general problem Parse And Rewrite and see what other problems can be solved from that?

My ride is waiting so I have to run along, but that's more or less what it's about; awaiting comments and critiques. Eventually we'll all have enough spare time to do EVERYTHING.
