From: Philippe Cerfon
Subject: Re: How does quote removal work with alternative forms of parameter expansion?
Date: Mon, 27 May 2024 03:54:49 +0200

Hey Lawrence

On Mon, May 27, 2024 at 2:01 AM Lawrence Velázquez <vq@larryv.me> wrote:
>         For the four varieties of parameter expansion that provide
>         for substring processing (see Section 2.6.2), within the
>         string of characters from an enclosed "${" to the matching
>         '}', the double-quotes within which the expansion occurs
>         shall have no effect on the handling of any special characters.

Okay that's also new in the draft.

So that supposedly means that if there's a parameter expansion with #,
##, % or %% within(!) double quotes, then the enclosing (outer) double
quotes (and not any inner quotes) have no effect on any inner special
characters, like the pattern matching notation meta-characters * ? [
etc., but *also* on other quoting characters like ' " \ and $', right?
And conversely, it probably means that these outer double quotes *do*
have an effect on all of those if it's not the #, ##, % or %% forms but
the :+ - = ? forms, right?

What a cryptic way to define that.

So if *hypothetically* tilde expansion were to happen in the pattern
of the #, ##, % and %% forms (which it doesn't), then by that
paragraph it would indeed also happen within the outer " ... ".
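
To make the point about the #, ##, % and %% forms concrete, here is a minimal sketch, assuming bash (other shells may differ in corner cases). The variable names are made up for illustration. It only shows that quoting *inside* the pattern decides whether * is a wildcard, regardless of the outer double quotes:

```shell
# Sketch, assuming bash; "var" and the result names are hypothetical.
var='a*b*c'

# Unquoted * in the pattern is a wildcard, even though the whole
# expansion sits inside outer double quotes.
short="${var#a*}"     # shortest prefix matching a* is just "a"
long="${var##a*}"     # longest prefix matching a* is the whole string

# Quoting the * inside the pattern makes it a literal asterisk.
sq="${var#'a*'}"      # removes the literal prefix a*
bs="${var#a\*}"       # same, using a backslash

printf '[%s]\n' "$short" "$long" "$sq" "$bs"
# prints [*b*c], [], [b*c], [b*c] (one per line)
```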

> > What I still don't get are the :- :+ etc.:
> >
> > For the the current POSIX says:
> > "word shall be subjected to tilde expansion, parameter expansion,
> > command substitution, and arithmetic expansion." so no quote removal,
> > but the most recent draft already corrected that as an error and says:
> > "word shall be subjected to tilde expansion, parameter expansion,
> > command substitution, arithmetic expansion, and quote removal."
>
> That only applies to unquoted parameter expansions.  Tilde expansion
> and quote removal do not occur in double-quoted parameter expansions.

They could have just written that there, too, instead of requiring
people to cross-reference it with a section four chapters earlier, from
which it also follows only indirectly. ^^


> $ cat /tmp/foo.sh
> unset foo
> printf '%s %s\n' "${foo-~}" "${foo-'bar'}"
> $ for sh in bash dash ksh mksh zsh; do "$sh" /tmp/foo.sh; done
> ~ 'bar'
> ~ 'bar'
> ~ 'bar'
> ~ 'bar'
> ~ 'bar'
>
> Section 2.2.3 should clarify this.
>
>         For parameter expansions other than the four varieties that
>         provide for substring processing, within the string of
>         characters from an enclosed "${" to the matching '}', the
>         double-quotes within which the expansion occurs shall
>         preserve the literal value of all characters, with the
>         exception of the characters double-quote, backquote,
>         <dollar-sign>, and <backslash>.

Ah. I think this is what I was looking for.
So that describes what happens if one uses the :- + ? = forms, right?
And it basically says that " ` $ and \ preserve their special meaning,
just the same as they would when they appear directly within "..."
with no parameter expansion, right?
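
As a rough check of that reading, here is a small sketch, assuming bash, with hypothetical variable names. Inside a double-quoted - form, $ and \ (and command substitution) keep their usual double-quote behavior, while ' and ~ are just literal characters:

```shell
# Sketch, assuming bash; "foo" is unset on purpose so that the word is used.
unset foo
bar=expanded

dollar="${foo-$bar}"         # $ still introduces a parameter expansion
escaped="${foo-\$bar}"       # \ still escapes $, giving a literal $bar
cmdsub="${foo-$(echo hi)}"   # command substitution still runs
single="${foo-'$bar'}"       # single quotes are literal, $bar is expanded
tilde="${foo-~}"             # no tilde expansion happens here

printf '%s\n' "$dollar" "$escaped" "$cmdsub" "$single" "$tilde"
# prints: expanded, $bar, hi, 'expanded', ~ (one per line)
```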


> This requires that, for example, the single quotes in "${foo-'bar'}"
> be treated literally rather than as quoting characters.

But again: all that only in the + - = ? cases - and not in # ## % %%.
>
>         If any unescaped double-quote characters occur within the
>         string, other than in embedded command substitutions, the
>         behavior is unspecified.
>
> This leaves the behavior of "${foo-"bar"}" up to implementations.

Ah, that explains what I was wondering about when replying to Koichi's
mail, where I noticed the differing behavior between bash and dash:
a (non-backslash-escaped) " cannot portably be used in such a case.

So any "${var-word}" where word contains " is generally a boo boo (if
trying to be portable). But using \" would be ok.

I guess it would then also be reasonable (and allowed) behavior if an
implementation said that something like "${var-"foo"}" is a syntax
error, because it kind of splits the parameter expansion into three
strings?
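
For the portable variant with \", a minimal sketch (tried with bash in mind; the variable names are hypothetical). The unescaped form is deliberately left out, since its result is unspecified:

```shell
# Sketch: backslash-escaped double quotes inside a double-quoted - form.
unset var
quoted="${var-\"foo\"}"   # the \" pairs become literal " characters

printf '%s\n' "$quoted"
# prints: "foo"
```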


>         The backquote and <dollar-sign> characters shall follow the
>         same rules as for characters in double-quotes described in
>         this section.
>
> This requires that command substitutions, parameter expansions, and
> arithmetic expansions work as they usually do within double quotes.

I assume that the above text, where it says that <dollar-sign> is
handled specially does *not* include $', right?
In other words, a $' ... ' within the word of a ${var-word} expansion,
but only if that whole expansion itself is enclosed in " ", would be
taken literally and not as dollar-single-quotes quoting.

Perhaps that should be added to the text in POSIX?

>
>         The <backslash> character shall follow the same rules as
>         for characters in double-quotes described in this section
>         except that it shall additionally retain its special meaning
>         as an escape character when followed by '}' and this shall
>         prevent the escaped '}' from being considered when determining
>         the matching '}' (using the rule in Section 2.6.2).
>
> This requires that backslashes work as they usually do within double
> quotes, except that they also escape closing braces.  This means
> that a backslash is not removed unless followed by a dollar sign,
> backquote, backslash, newline, or closing brace.

I see. Makes sense now.
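
A small sketch of the backslash rule as I understand it, assuming bash; the names are made up for illustration:

```shell
# Sketch, assuming bash; "foo" is unset so that the word is used.
unset foo

brace="${foo-a\}b}"    # \} gives a literal } and does not end the expansion
other="${foo-a\qb}"    # \q is not special, so the backslash is kept
double="${foo-a\\b}"   # \\ collapses to a single backslash

printf '%s\n' "$brace" "$other" "$double"
# prints: a}b, a\qb, a\b (one per line)
```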


> > But above with ##, single quotes *did* prevent expansion. Why not here:
> >
> > x="${1:-'$bar'}"  |  bash x.sh ""   |   'expanded'  => why not quote
> > removal, why expansion?
>
> I don't know.  Bourne compatibility, perhaps.

I'd have said this makes sense now:
In the # ## % %% cases, the word is handled like its own independent
word where quoting is performed, so there the ' ... ' would be single
quotes quoting, and thus removed.
But in the :- + ? = cases, the ' ' are, by the rules you cited,
literal, and the $bar is of course expanded within the outer " ... ".

So now, it seems to be perfectly reasonable?
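
Side by side, the contrast looks roughly like this. It is a sketch assuming bash, with hypothetical variable names:

```shell
# Sketch, assuming bash.
bar=expanded
val='$bar-and-more'   # the value starts with the four literal characters $bar
empty=''

# Substring form: '...' quotes inside the pattern, so $bar is NOT expanded
# and the literal prefix $bar is removed.
stripped="${val#'$bar'}"

# Default-value form: ' is literal inside the outer double quotes,
# so $bar IS expanded and the single quotes stay in the result.
dflt="${empty:-'$bar'}"

printf '%s\n' "$stripped" "$dflt"
# prints: -and-more, 'expanded' (one per line)
```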


> > And what's even more weird, above, there is no quote removal (of the
> > single quotes), but below there IS:
> > $ printf '%s\n' "${unset:-$'$bar'}"
> > expanded
>
> I don't understand this specific behavior, but these quotes are not
> removed in POSIX mode.

That's the specialty that Koichi mentioned, extquote.


So long story short I would summarize as follows:
1) With # ## % %%
- The word (RHS of # ## % %%) does its own unquoting/quote removal
step, independent of any outer " " that surrounds the parameter
expansion.
- The various quoting styles within word work as normal, e.g. '$bar'
will cause the literal $bar, while "$bar" will cause bar to be expanded.
- Any special character (for pattern matching notation, like * ? [ )
that was directly quoted (like in \* or '*' or "*") loses its
special meaning. Similarly, if such a character results from a double
quoted variable expansion like in wo_"$var"_rd, it also loses its
special meaning, but not if it results from a bare unquoted variable
expansion within the word part.
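
The last point of 1) can be sketched like this, assuming bash; the names s and pat are hypothetical:

```shell
# Sketch, assuming bash.
s='a*b'
pat='*'

# Bare $pat inside the pattern: the resulting * acts as a wildcard,
# so the longest match of a* removes the whole string.
as_pattern="${s##a$pat}"

# "$pat" inside the pattern: the resulting * is literal,
# so only the literal prefix a* is removed.
as_literal="${s##a"$pat"}"

printf '[%s]\n' "$as_pattern" "$as_literal"
# prints: [], [b] (one per line)
```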

2) With :+ :- + - etc. forms:
- The word part (the RHS of the :+ etc.) behaves as if it were
directly within a double-quoted string (with no surrounding ${var:+
}), with the exception that \ also escapes } as a literal character.
Other than that, \ escapes " $ ` \ to their literal value. An unescaped
(by \) $ introduces expansions/substitutions. The quoting characters '
and $' are taken literally.
- Within the word part (the RHS of the :+ etc.), a " cannot portably be
used unless preceded by an escaping backslash.

Does that seem right?

Thanks for your help guys :-)

Philippe.


