help-bash

Re: [Help-bash] Announcement: Started project to parse bash script to AS


From: Mike Mestnik
Subject: Re: [Help-bash] Announcement: Started project to parse bash script to AST.
Date: Sat, 22 Jul 2017 19:33:16 -0500

I started my project because I was looking for a beautifier that would
break long lines, combine short lines, and generally do much more than
one can do with a handful of regexes, as
https://github.com/shri314/beautify_bash does.  Oil might satisfy my
requirements; I didn't think I would find an AST generator that would
keep comments.

Despite the name, my project is written in pure Perl; it doesn't use
Marpa.  The name was chosen because that's where some of the other AST
generators are.  I did make a good attempt at using Marpa's scanless
interface, but found that it wouldn't easily (for me at least) handle
the most basic tasks like...

  * Parsing out a comment; it's only good if you want to /dev/null
comments.  I also need a lossless syntax tree.
  * Splitting n words on n-1 delimiters.  It can only handle preceding
or trailing delimiters... at least I hope it can do that.
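
To make those two requirements concrete, here is a minimal sketch in
Python (not the Perl code from my project, and not anything Marpa
provides).  The token classes, the '|' delimiter, and the function
names tokenize and split_list are all assumptions made up for this
illustration; real shell lexing is far more involved.

```python
import re

# Hypothetical, simplified token classes (assumptions for this sketch):
# a comment runs to end of line, whitespace is kept, '|' is the delimiter,
# and anything else is a word.
TOKEN_RE = re.compile(
    r"(?P<comment>#[^\n]*)"
    r"|(?P<space>\s+)"
    r"|(?P<delim>\|)"
    r"|(?P<word>[^\s|#]+)"
)


def tokenize(text):
    """Return (kind, lexeme) pairs that cover every character of text."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"cannot tokenize at offset {pos}")
        tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


def split_list(tokens):
    """Enforce the shape word (delim word)*: n words, exactly n-1 delimiters."""
    words = []
    expect = "word"
    for kind, lexeme in tokens:
        if kind not in ("word", "delim"):
            continue  # comments and whitespace stay in `tokens`, untouched
        if kind != expect:
            raise ValueError(f"expected {expect}, got {kind} {lexeme!r}")
        if kind == "word":
            words.append(lexeme)
        expect = "delim" if kind == "word" else "word"
    if expect == "word":
        raise ValueError("empty list or trailing delimiter")
    return words


source = "a | b|c  # keep me\n"
tokens = tokenize(source)
words = split_list(tokens)
rebuilt = "".join(lexeme for _, lexeme in tokens)
```

Because every character ends up in some token, joining the lexemes
reproduces the input exactly, comment included, which is what I mean
by lossless.  A leading or trailing '|' is rejected rather than
silently accepted.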

Having failed to do that, I stopped any effort to use Marpa.  I
started writing a huge elsif tree, but was later shown the deterministic
finite automaton (DFA) and started using my own variant of that.
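
For readers who have not met the idea, a table-driven scanner looks
roughly like the Python sketch below.  It is a generic textbook-style
illustration, not the variant used in my Perl code; the state names,
character classes, and the scan function are assumptions invented for
the example.

```python
def char_class(ch):
    """Map a character to one of four hypothetical input classes."""
    if ch == "#":
        return "hash"
    if ch == "\n":
        return "newline"
    if ch in " \t":
        return "blank"
    return "other"


# (state, input class) -> next state.  A missing entry means the current
# token ends just before this character.
TRANSITIONS = {
    ("start", "hash"): "comment",
    ("start", "newline"): "newline",
    ("start", "blank"): "blank",
    ("start", "other"): "word",
    ("comment", "hash"): "comment",
    ("comment", "blank"): "comment",
    ("comment", "other"): "comment",
    ("blank", "blank"): "blank",
    ("word", "other"): "word",
    ("word", "hash"): "word",  # '#' inside a word does not start a comment
}


def scan(text):
    """Split text into (state, lexeme) tokens by following the table."""
    tokens = []
    state, start, i = "start", 0, 0
    while i < len(text):
        nxt = TRANSITIONS.get((state, char_class(text[i])))
        if nxt is None:
            # No transition: emit the token and rescan this character
            # from the start state.
            tokens.append((state, text[start:i]))
            state, start = "start", i
        else:
            state = nxt
            i += 1
    if state != "start":
        tokens.append((state, text[start:]))
    return tokens


line = "ls a#b  # note\n"
tokens = scan(line)
```

The whole of the "elsif tree" collapses into the TRANSITIONS table, so
adding a token kind means adding rows rather than nesting another
branch.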

I've been working from observation, and thus far I've seen bash do a
number of things I wish it wouldn't, like failing on both of these:

address@hidden:~$ case in in  a) ;; esac) ;;
bash: syntax error near unexpected token `)'
address@hidden:~$ case in in  a) ;; esac | a) ;;
bash: syntax error near unexpected token `)'

I forget what the other one was, but when I discovered it my reaction
was that if it were corrected there would be only a 0.001% chance
that anyone has such code and depends on the current "do nothing"
behaviour.  I remember bringing it up on freenode, but I can't find a
log for that channel.  It didn't seem like anyone around at the time
was keen on changing anything related to bash... as if there was never
going to be another version.

On Sat, Jul 22, 2017 at 12:13 PM, Andy Chu <address@hidden> wrote:
> Hi Mike,
>
> You might be interested in my project Oil, which has a well-tested bash
> parser.  It parses the program up front in a single pass, rather than
> interleaving parsing and execution like bash and every other shell does.  I
> gave a summary in this comment (it was partially inspired by the bash AOSA
> book chapter):
>
> https://news.ycombinator.com/item?id=14550523
>
> The blog goes into detail:
>
> http://www.oilshell.org/blog/
>
> The first 10 or 20 posts are all about parsing, including what algorithms
> are used, and testing it on about a million lines of real bash code found in
> the wild.
>
> I've also been in contact with the author of this shell formatter, which
> does a similar thing, for a different purpose: https://github.com/mvdan/sh
>
> My parser is hand-written, but I would be interested in specifying it in a
> meta-language more suitable than Yacc.  Yacc is not the right tool for it.
> For one, top-down parsing suits the shell better than bottom-up.
>
> What I found is that if the lexer is relatively sophisticated, then the
> parser doesn't need to be very powerful.
>
> I had come across the Marpa parsing algorithm in my search.  The general
> idea I got was that it was more powerful and sophisticated than I need, and
> maybe it trades off memory for speed.  But I could be wrong about that.
> However, my parser is also linear time, and it needs just a single token of
> lookahead.   (The parser asks for tokens in one of 13 lexical modes.)
>
> I'd be interested in hearing more about the Marpa algorithm and how it
> relates to the bash use case.
>
> Also if you want to try it, you can download and run it with "osh -n
> foo.sh".   It will give you a pretty-printed AST as in this post:
>
> https://www.oilshell.org/blog/2017/01/21.html
>
> I think the instructions for setup might be a little bit out of date -- if
> so let me know:
>
> https://github.com/oilshell/oil/wiki/Contributing
>
> I'm about to make a release which will have easier instructions (configure,
> make, etc.)
>
> Andy
>
>
>
> On Fri, Jul 21, 2017 at 8:16 PM, Mike Mestnik <address@hidden>
> wrote:
>>
>> Off list, please cc me on reply.
>>
>> [1]Abstract Syntax Tree, to be used to beautify or minify bash source.
>> It's a work in progress but already the [2]code passes a handful of
>> example [3]tests.  I'd appreciate a collection of small, less than 5
>> lines, examples that express the various edge cases supported by the
>> bash shell.  If the maintainer of an online bash script testing site
>> wouldn't mind running a DB query for me that would be awesome.  I'm
>> also looking for a volunteer or two that can assist with development
>> in Perl.
>>
>> Thank you.
>>
>> 1. https://en.wikipedia.org/wiki/Abstract_syntax_tree
>> 2. https://github.com/cheako/MarpaX-Languages-Bash-AST/blob/9952b3e0f5a3b80aab2f9801c9e6a745350ffb83/lib/MarpaX/Languages/Bash/AST.pm
>> 3. https://travis-ci.org/cheako/MarpaX-Languages-Bash-AST/builds/256283379#L331
>>
>


