
[Cfengine-develop] New Parser/Lexer PoC


From: Luke A. Kanies
Subject: [Cfengine-develop] New Parser/Lexer PoC
Date: Mon, 3 Mar 2003 21:42:07 -0600 (CST)

Okay, just to see what it looked like, I mocked up a new lexer and parser.

I'm no wizard with either, and only just learned lex and yacc (the little
I know about them) last month or so, but I figured I'd give it a go.

Before you look at them, there are a couple of key points to make about
them:

First and foremost, there is absolutely no code in here.  This is _just_ a
mockup to see what it might take to correctly parse the syntax.  I'm not
even going to say that flex/bison/whatever will correctly parse these,
because they're not to that point.

Second, this was not written from the perspective of trying to provide
equivalent functionality to the current parser (which was probably the
logical thing to do); instead, it was written from the perspective of
someone just interested in doing a clean-room reimplementation of a
parser.  As such, there are probably lots of problems with it, especially
related to the subtleties in the current parser.

Third, this mockup was specifically designed to not contain any
information about the code, just about the format of the language.  That's
usually my goal in a parser, and I think it's a reasonable one.

Fourth and maybe most importantly, I can nearly _guarantee_ you that these
won't work, either by rejecting valid files or by letting invalid syntax
through (which are really the only two ways for a parser to fail, right?).
This is because the parser may have to know more about the code, and
especially about the different actions, than this mockup does.
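
To make those two failure modes concrete, here is a minimal sketch of how
one might measure them.  Everything in it is hypothetical: check_parser,
toy_accepts and the sample strings are made up for illustration, and none
of it is taken from cflex2.l, cfparse2.y, or cfengine itself.

```python
# Hypothetical harness: all names and sample inputs are illustrative only.
def check_parser(accepts, valid_inputs, invalid_inputs):
    """Return (false_rejects, false_accepts) for a recognizer function."""
    # Failure mode 1: a valid input that the parser refuses.
    false_rejects = [s for s in valid_inputs if not accepts(s)]
    # Failure mode 2: an invalid input that the parser lets through.
    false_accepts = [s for s in invalid_inputs if accepts(s)]
    return false_rejects, false_accepts


def toy_accepts(text):
    # Deliberately naive recognizer: anything ending in ';' is "valid".
    return text.strip().endswith(";")


valid = ["a = b;", "x = y ;"]
invalid = ["a = b", "= ;"]
false_rejects, false_accepts = check_parser(toy_accepts, valid, invalid)
```

Run against a corpus of real configuration files plus some deliberately
broken ones, the same idea would show how far a mockup grammar is from the
current parser's behaviour.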

Given that, I think they're a valuable learning tool.  For one, I found
what I consider to be nine (9!) different syntaxes in cfengine, for what
looks like 29 different types of actions.  That's a lot of uniqueness, and
I'm not convinced it's totally necessary.  Obviously, any changes at this
point would break compatibility, but I would very much like to see the
number of allowed syntaxes reduced; this would significantly reduce
the complexity of the language and the associated learning curve.

Maybe if an inline module interface were added, the existing actions could
slowly be ported over to be inline modules, and have their syntaxes
changed to a standard one at the same time.

This raises the question, though:  Is it possible to reduce the number of
syntaxes to just one?  I think it's possible to get pretty close; I could
go through the list, but I think some of the syntaxes aren't different
enough to warrant separation, and are currently separate only because the
parser needs them to be.  If we could get it down to one, though, then
we'd _really_ have a simple, consistent language!

These also make it somewhat obvious how hard some aspects of the cfengine
language are to parse.  For instance, setting the actions off by braces
would simplify things considerably, as would having an explicit statement
terminator like ';'.
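
As a rough illustration of why explicit delimiters help, here is a minimal
sketch of a parser for a made-up syntax in which each section is wrapped
in braces and each setting ends in ';'.  The syntax, the section and key
names, and the parse function are all assumptions for the sake of the
example; this is not cfengine's syntax and not what the mockups implement.

```python
import re

PUNCT = {"{", "}", ";", "="}


def parse(text):
    """Parse 'name { key = value; ... }' sections into nested dicts."""
    # With explicit delimiters, whitespace and line breaks carry no meaning,
    # so tokenizing is a single regular expression.
    tokens = re.findall(r"[{};=]|[^\s{};=]+", text)
    pos = 0

    def found():
        return tokens[pos] if pos < len(tokens) else "end of input"

    def expect(value):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != value:
            raise SyntaxError(f"expected {value!r}, found {found()!r}")
        pos += 1

    def word():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in PUNCT:
            raise SyntaxError(f"expected a word, found {found()!r}")
        pos += 1
        return tokens[pos - 1]

    sections = {}
    while pos < len(tokens):
        name = word()
        expect("{")
        settings = {}
        # The closing brace tells us unambiguously where the section ends.
        while pos < len(tokens) and tokens[pos] != "}":
            key = word()
            expect("=")
            settings[key] = word()
            expect(";")  # the terminator marks the end of each setting
        expect("}")
        sections[name] = settings
    return sections


result = parse("files { path = /etc/passwd; mode = 644; }")
```

The parser never has to guess where a section or a setting ends, which is
the part that gets hard when the only cues are line breaks and the names
of the actions themselves.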

So, with no further ado, you can find the lexer and parser here:

http://madstop.com/~luke/cfengine/cflex2.l
http://madstop.com/~luke/cfengine/cfparse2.y

They aren't terribly well documented, either.  I can do so if people think
this is a worthwhile experiment to explore, but I don't want to spend much
more time on this if it's just going to be thrown away (which is what I
expect to happen).

Enjoy!

Luke

-- 
Measure with a micrometer.  Mark with chalk.  Cut with an axe.



