[Help-bash] Internal parsing & flow


From: cwh0803
Subject: [Help-bash] Internal parsing & flow
Date: Wed, 8 Jan 2014 00:28:26 -0500
User-agent: SquirrelMail/1.4.21

All-

I'm knee-deep in writing an embedded shell for a proprietary application;
my design is heavily inspired by Bash. To this end, I'd like to mimic some
of the parsing and processing that Bash does to provide a familiar
environment with predictable results.

I've waded through the source, trying to align code with comments and
discussions from the Bash Hackers Wiki, the bash man page, and an
assortment of other sources.

To the point:

The documents I've read indicate that the string read from the prompt or a
file is passed through quote processing and then the brace, tilde,
parameter, command, and arithmetic expansion processors before being
handed to the lex/yacc parser. Based on the rules of the grammar, it would
seem the input is broken into WORDs, which are collected into higher-order
constructs (like the flavors of command).
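
As a sanity check on the ordering I'm trying to reproduce, here is a small
experiment (a sketch only; the variable names are my own). It shows only
observable behavior, namely that brace expansion is attempted before
parameter expansion, and says nothing about where this happens in the
source:

```shell
#!/usr/bin/env bash
# Variable names here are illustrative, not taken from the Bash source.
n=3

# Brace expansion runs before parameter expansion, so at brace-expansion
# time the text is still the literal "{1..$n}", which is not a valid
# sequence expression and is left alone. $n is substituted afterwards.
seq_result=$(echo {1..$n})

echo "result: $seq_result"   # prints "result: {1..3}", not "result: 1 2 3"
```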

Question 1: The code would seem to indicate the various expansion
processors are performed on WORD_LISTs (I'm looking at
subst.c:expand_word_list_internal) -- how did the string read from the
user/script get broken into WORDs? -- I thought this happened *after* the
expansion processors in the 'word splitting' phase...
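
To make the 'word splitting' behavior I have in mind concrete, here is a
minimal experiment at the shell level. count_args is a hypothetical helper
of my own, and this only demonstrates what a user observes, not how the
source implements it:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper: it prints the number of
# arguments it was called with.
count_args() { echo "$#"; }

var="one two three"

# Unquoted: the result of the expansion is split on IFS (default
# space, tab, newline), giving three separate arguments.
unquoted=$(count_args $var)

# Quoted: the expansion result is not split, giving one argument.
quoted=$(count_args "$var")

echo "unquoted: $unquoted"   # prints "unquoted: 3"
echo "quoted: $quoted"       # prints "quoted: 1"
```

In both calls the command line as typed has exactly two words; the
splitting applies only to the text produced by the unquoted expansion.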

Along those same lines: Question 2: What is the definition of a WORD from
the lexer's perspective? I've gotten turned around a few times trying to
find the spec for WORD within the lexer [this may be a function of minimal
Lex experience]...

Question 3: Where are the flags of a WORD_DESC assigned? My understanding
is that these flags guide the semantics of later processing, so it is
imperative these flags be accurate...

Question 4: Is 'word-splitting' done by the lex/yacc parser? If not, where
is it implemented?

Question 5: Is expand_word_list_internal the entry point to the various
expansion routines? Where are escaped characters and quoting handled, as
this function seems to cover only the other expansions...

Question 6: (Related to Q1) If, for example, the input command line is
'echo "Hello World"', I'm expecting two WORD (or WORD_DESC) objects -- one
for 'echo', the other for the de-quoted 'Hello World'. If the original
line was split on whitespace, there were, at one point then, three WORD
(or WORD_DESC) objects -- where did [1] and [2] get merged?
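
For what it's worth, this is the behavior I can observe from the outside
(a sketch; the variable names are my own). Typed directly, the line yields
two words; a three-way split only appears when the same text arrives
through an unquoted expansion, where the quote characters are just data:

```shell
#!/usr/bin/env bash
# Case 1: the quotes are part of the command text itself.
# The positional parameters end up as: echo | Hello World
set -- echo "Hello World"
direct_count=$#      # 2
direct_second=$2     # Hello World (quotes removed)

# Case 2: the same characters stored in a variable and expanded
# unquoted. The embedded double quotes are literal characters here,
# so splitting on whitespace gives: echo | "Hello | World"
line='echo "Hello World"'
set -- $line
expanded_count=$#    # 3

echo "direct: $direct_count, expanded: $expanded_count"
```

This doesn't tell me where the source does the work, only that the result
I need to mimic for the typed line is two words, with no visible merge
step.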

OK; that'll do for now. Hopefully I'm not too far off base in my
understanding, and a few nudges in the right direction are all I'll need.

Thanks so much!

Carl



