From: Martin D Kealey
Subject: Supporting structured data (was: Re: bug-bash Digest, Vol 238, Issue 2)
Date: Wed, 7 Sep 2022 17:19:03 +1000

Some things do indeed come down to personal preference, where there are no
right answers. Then Chet or his successor gets to pick.

Keep in mind that most or all of my suggestions are gated on not being in
backwards-compatibility mode, and that compat mode itself would be
lexically scoped. With that in mind, I consider that we're free to *stop*
requiring existing antipatterns that are harmful to comprehension or
stability.

I would choose to make parsing numeric expressions happen at the same time
as parsing whole statements, not as a secondary parser that's always
deferred until runtime. This would improve unit testing and debugging,
starting with bash -n being able to complain about syntax errors in
expressions. (Yes, that precludes $(( x $op y )) unless you're in compat
mode.)
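
As a concrete illustration of today's deferred parsing (a small sketch;
behaviour as observed with current bash releases, and the variable names
are just examples):

    # "bash -n script" checks only the shell grammar; the arithmetic inside
    # $(( )) is not parsed until the word is expanded, so this malformed
    # expression passes -n silently and only fails at run time:
    echo $(( 1 + ))

    # The dynamic idiom that parse-time checking would rule out outside
    # compat mode, since the operator is only known at run time:
    op='+' x=2 y=3
    echo $(( x $op y ))    # expands to $(( x + y )) and prints 5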

On Mon, 5 Sept 2022 at 19:55, Yair Lenga <yair.lenga@gmail.com> wrote:

> Personally, I think adopting Javascript/Python like approach (${a.b.c} )
> is preferred over using Perl approach ( ${a{b}{c}} ), or sticking with the
> existing bash approach. The main reason is that it is better to have a
> syntax similar/compatible with current/future directions, and not the past.
>

By having var['key'] and var.key as synonyms, Javascript already sets the
precedent of allowing multiple ways to do the same thing.

But if you look closely, there's a difference in Javascript between
var['key'] and var[key], which cannot be replicated at a syntactic level in
Bash. Instead, we have to rely on a run-time lookup of 'var' to determine
whether it's an associative map or a normal array.
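
A small bash illustration of that run-time dependence (the variable names
here are just examples): the same unquoted subscript is an arithmetic
expression for one declaration and a literal string for the other.

    key=0
    declare -a arr=(zero one)        # indexed array
    declare -A map=([key]=value)     # associative array

    echo "${arr[key]}"   # subscript is arithmetic: key evaluates to 0 -> "zero"
    echo "${map[key]}"   # subscript is the literal string "key"       -> "value"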

That leads to bugs involving delayed and randomized surprises, whereas an
unexpected syntax goes "bang" right away, when the coder is looking. (It
might surprise them, but it won't surprise their customers.)

I believe that "helping people to avoid writing bugs" trumps "matching the
syntax suggested by common practice in other languages", and so I conclude
that it's preferable, from a code resilience point of view, to have a
syntactic difference between a numeric expression used for indexing and a
string used as a key in a map lookup.

IMHO ${var[$key]} is incompatible with *good* "future directions";
${var[stuff]} should be reserved for numeric indexing and slicing (once
backward compat is turned off).
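
To make that split concrete, these are the two jobs that [] does today
(made-up variable names), which the proposal would separate:

    declare -a nums=(10 20 30)
    i=1
    echo "${nums[$i]}"     # numeric indexing: what [] would keep doing

    declare -A opts=([colour]=red)
    key=colour
    echo "${opts[$key]}"   # string-keyed lookup: what would move to the
                           # dotted forms discussed below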

> Bottom line - IMHO - no point in joining a losing camp (Perl), or having a
> "bash" camp.
>

If we need to institute a "bash camp" to improve resiliency, I wouldn't
lose sleep over it.

I'm not particularly wedded to the Perl var{key} syntax, and indeed for
"fields" I would prefer to avoid it, as the javascript syntax *is* nicer
and more widely understood.

But I think that *only* having "var.key" (and reserving "var[expression]"
for numeric indexing, as above) would lead to weirdness in other ways.
Consider if we allow ${var.KEY} where KEY is a shell word (minus unquoted
'.').

I agree, it's quite nice to write ${var.$var_holding_key} or even
${var.${var_holding_key:-default_key}}.

But then when we want to deal with more complex keys, we get things like
${var.""} for an empty key, or ${var."$key.$with/$multiple:$parts"}.

Those might look obvious enough to anyone who's used the shell long enough,
but let's be honest, shell quotes are already hard for newcomers to
understand, and nested quoting is a nightmare.

But it's a horrendous mess when you want to assign:
var."some/complex/key"=$newvalue. IMO that over-extends the syntax for
assignment, making the rest of the parser intolerably complicated.

So may I suggest a compromise syntax: take the ${var.field} notation from
Javascript, and the ${var.$key} form as above, but also extend it to allow
${var.{WORD}} (which mimics the pattern of allowing both $var and ${var})
with the WORD treated as if double-quoted. Then we can write
${var.{complex.$key/$expansion}} and var.{complex.$key/$expansion}=value,
which are much more reasonable propositions for parsing and reading.
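
For comparison, a rough sketch of today's closest equivalent of that last
pair, using an associative array with a quoted subscript (the variable
names are made up):

    declare -A var
    key=some expansion=thing
    var["complex.$key/$expansion"]=value              # today's assignment
    printf '%s\n' "${var[complex.$key/$expansion]}"   # today's lookup

The proposed var.{complex.$key/$expansion}=value form would replace exactly
this, leaving the bracketed subscript free to mean numeric indexing only.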

That leaves [] for strictly numeric indexing and slicing, so we don't have
to resort to a run-time lookup to decide whether the subscript should have
been parsed as a numeric range expression after the parse has already
happened.

And it leaves space in the syntax of dot-things to add operators we
haven't considered yet; perhaps operators to make globbing and
word-splitting opt-in rather than opt-out?

-Martin

PS: I mention using var[expression] for *slicing*; I want to be able to
write var[start..end] or var[start:count] and be sure that ${var[x]} and
${var[x..x]} and ${var[x:1]} all give the same thing, save for perhaps
giving an empty list if the element is unset.

That is unlike ${var[@]:x:1}, which gives an unwelcome surprise if ${var[x]} is
unset. (This is one of the antipatterns inherited from ksh that we should
avoid; I would even argue for disabling it when not in compat mode.)
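
A minimal reproduction of that surprise, as recent bash behaves (the array
name is arbitrary): slicing with @ starts at the first set index at or
after the offset, so a hole yields the next element instead of nothing.

    arr=(a b c d)
    unset 'arr[2]'
    echo "${arr[2]}"        # empty: the element really is unset
    echo "${arr[@]:2:1}"    # prints the next set element ("d", index 3),
                            # not an empty result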

PPS: I'm under no illusions that it will take a LOT of work to move Bash
this far. But we'll never get there if we keep taking steps in the opposite
direction, piling on ever more stuff that has to be accounted for in
"compat" mode.

