Re: [Help-bash] Announcement: Started project to parse bash script to AS
From: Andy Chu
Subject: Re: [Help-bash] Announcement: Started project to parse bash script to AST.
Date: Sat, 22 Jul 2017 10:13:34 -0700
Hi Mike,
You might be interested in my project Oil, which has a well-tested bash
parser. It parses the program up front in a single pass, rather than
interleaving parsing and execution like bash and every other shell does. I
gave a summary in this comment (it was partially inspired by the bash AOSA
book chapter):
https://news.ycombinator.com/item?id=14550523
The blog goes into detail:
http://www.oilshell.org/blog/
The first 10 or 20 posts are all about parsing, including what algorithms
are used, and testing it on about a million lines of real bash code found
in the wild.
I've also been in contact with the author of this shell formatter, which
does a similar thing, for a different purpose: https://github.com/mvdan/sh
My parser is hand-written, but I would be interested in specifying it in a
meta-language more suitable than Yacc. Yacc is not the right tool for it.
For one, top-down parsing suits the shell better than bottom-up.
What I found is that if the lexer is relatively sophisticated, then the
parser doesn't need to be very powerful.
I had come across the Marpa parsing algorithm in my search. The general
idea I got was that it was more powerful and sophisticated than I need, and
maybe it trades off memory for speed. But I could be wrong about that.
However, my parser is also linear time, and it needs just a single token of
lookahead. (The parser asks for tokens in one of 13 lexical modes.)
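To illustrate the point about lexical modes, here is a minimal sketch in
Python. It is not Oil's actual code: the two modes (OUTER and DQ), the token
names, and the Lexer and Parser classes are all made up for this example,
and it handles only a tiny subset of shell syntax. The parser is top-down,
keeps a single token of lookahead, and tells the lexer which mode to use
each time it asks for a token.

```python
import re
from enum import Enum


class Mode(Enum):
    OUTER = 1  # unquoted command text
    DQ = 2     # inside double quotes


# Each mode has its own ordered list of (token type, regex) rules.
# These modes and token names are illustrative assumptions only.
RULES = {
    Mode.OUTER: [
        ("SPACE", re.compile(r"[ \t]+")),
        ("DQUOTE", re.compile(r'"')),
        ("VAR", re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")),
        ("LIT", re.compile(r'[^ \t"$]+')),
    ],
    Mode.DQ: [
        ("DQUOTE", re.compile(r'"')),
        ("VAR", re.compile(r"\$[A-Za-z_][A-Za-z0-9_]*")),
        ("LIT", re.compile(r'[^"$]+')),  # spaces are literal text here
    ],
}


class Lexer:
    def __init__(self, text):
        self.text = text
        self.pos = 0

    def read(self, mode):
        """Return the next (type, value) token, lexed under the given mode."""
        if self.pos >= len(self.text):
            return ("EOF", "")
        for tok_type, pattern in RULES[mode]:
            m = pattern.match(self.text, self.pos)
            if m:
                self.pos = m.end()
                return (tok_type, m.group())
        raise SyntaxError(f"unexpected character at offset {self.pos}")


class Parser:
    def __init__(self, text):
        self.lexer = Lexer(text)
        self.tok = None  # the single token of lookahead

    def advance(self, mode):
        self.tok = self.lexer.read(mode)

    def parse_command(self):
        """command := word (SPACE word)*"""
        words = []
        self.advance(Mode.OUTER)
        while self.tok[0] != "EOF":
            if self.tok[0] == "SPACE":
                self.advance(Mode.OUTER)
            else:
                words.append(self.parse_word())
        return ("Command", words)

    def parse_word(self):
        """word := (LIT | VAR | dquoted)+"""
        parts = []
        while self.tok[0] in ("LIT", "VAR", "DQUOTE"):
            if self.tok[0] == "DQUOTE":
                parts.append(self.parse_dquoted())
            else:
                parts.append(self.tok)
                self.advance(Mode.OUTER)
        return ("Word", parts)

    def parse_dquoted(self):
        """dquoted := DQUOTE (LIT | VAR)* DQUOTE, with the body lexed in DQ mode."""
        parts = []
        self.advance(Mode.DQ)  # switch modes after the opening quote
        while self.tok[0] != "DQUOTE":
            if self.tok[0] == "EOF":
                raise SyntaxError("unterminated double quote")
            parts.append(self.tok)
            self.advance(Mode.DQ)
        self.advance(Mode.OUTER)  # back to the outer mode
        return ("DQ", parts)


if __name__ == "__main__":
    print(Parser('echo "hi $USER" done').parse_command())
```

The parser never backtracks and never looks past self.tok. The space inside
the quotes comes back as part of a LIT token rather than as a word
separator, only because the parser asked for that token in DQ mode.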
I'd be interested in hearing more about the Marpa algorithm and how it
relates to the bash use case.
Also if you want to try it, you can download and run it with "osh -n
foo.sh". It will give you a pretty-printed AST as in this post:
https://www.oilshell.org/blog/2017/01/21.html
I think the instructions for setup might be a little bit out of date -- if
so let me know:
https://github.com/oilshell/oil/wiki/Contributing
I'm about to make a release which will have easier instructions (configure,
make, etc.)
Andy
On Fri, Jul 21, 2017 at 8:16 PM, Mike Mestnik <address@hidden>
wrote:
> Off list, please cc me on reply.
>
> [1]Abstract Syntax Tree, to be used to beautify or minify bash source.
> It's a work in progress but already the [2]code passes a handful of
> example [3]tests. I'd appreciate a collection of small, less than 5
> lines, examples that express the various edge cases supported by the
> bash shell. If the maintainer of an online bash script testing site
> wouldn't mind running a DB query for me that would be awesome. I'm
> also looking for a volunteer or two that can assist with development
> in Perl.
>
> Thank you.
>
> 1. https://en.wikipedia.org/wiki/Abstract_syntax_tree
> 2. https://github.com/cheako/MarpaX-Languages-Bash-AST/blob/9952b3e0f5a3b80aab2f9801c9e6a745350ffb83/lib/MarpaX/Languages/Bash/AST.pm
> 3. https://travis-ci.org/cheako/MarpaX-Languages-Bash-AST/builds/256283379#L331
>
>