help-bash

From: Lawrence Velázquez
Subject: Re: How does quote removal work with alternative forms of parameter expansion?
Date: Sun, 26 May 2024 22:44:40 -0400
User-agent: Cyrus-JMAP/3.11.0-alpha0-480-g515a2f54a-fm-20240515.001-g515a2f54

On Sun, May 26, 2024, at 9:54 PM, Philippe Cerfon wrote:
> So that supposedly means that if there's a parameter expansion with #,
> ##, % or %% within(!) double quotes, that enclosing (outer) double
> quotes (and not any inner quotes) have no effect on any inner special
> characters, like the pattern matching notation meta-characters * ? [
> etc. but *also* other quoting characters like ' " \ and $', right?

Right.


> And conversely, it probably means, that these outer double quotes *do*
> have an effect on all those if it's not the #, ##, % or %% forms, but
> the :+ - = ? forms, right?

Right.
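
A minimal sketch of that contrast, assuming a POSIX-style shell such as
bash or dash. The variable names (val, missing) are made up for
illustration and do not come from the thread:

```shell
# Hypothetical variables, chosen only for illustration.
val='*star'
unset -v missing

# Substring form: the inner single quotes still quote the asterisk,
# so the pattern is a literal "*" even inside the outer double quotes.
literal_star="${val#'*'}"

# Same form with an unquoted asterisk: it acts as a glob, and the
# shortest prefix it matches is the empty string, so nothing is removed.
glob_star="${val#*}"

# Default-value form: the outer double quotes turn the inner single
# quotes into literal characters.
dq_default="${missing-'bar'}"

# Without outer double quotes, the single quotes quote as usual and
# are removed.
bare_default=${missing-'bar'}

printf '%s\n' "$literal_star" "$glob_star" "$dq_default" "$bare_default"
```

The four lines printed should be star, *star, 'bar' (with the quotes),
and bar.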


> So if *hypothetically* tilde expansion were to happen in the pattern
> of the #, ##, % and %% forms (which it doesn't), then by that
> paragraph it would indeed also happen within the outer " ... ".

Tilde expansion does occur in those forms, whether unquoted or
double-quoted.

        $ cat ./foo.sh
        foo=/var/root/whatever
        printf '%s %s\n' ${foo#~root} "${foo#~root}"
        $ for sh in bash dash ksh mksh zsh; do "$sh" ./foo.sh; done
        /whatever /whatever
        /whatever /whatever
        /whatever /whatever
        /whatever /whatever
        /whatever /whatever
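
The example above depends on a "root" account whose home is /var/root.
A variant that should not depend on the local accounts, sketched under
the assumption that a bare ~ in the pattern expands to $HOME; the HOME
value and the variable names here are invented for illustration:

```shell
# HOME is set to a made-up directory so the result does not depend on
# the account running the script.
HOME=/tmp/example-home
path=$HOME/notes.txt

# The tilde in the pattern is expanded to $HOME in both spellings,
# so the leading home directory is stripped either way.
unquoted=${path#~}
quoted="${path#~}"

printf '%s %s\n' "$unquoted" "$quoted"
```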


>> > For that, the current POSIX says:
>> > "word shall be subjected to tilde expansion, parameter expansion,
>> > command substitution, and arithmetic expansion." so no quote removal,
>> > but the most recent draft already corrected that as an error and says:
>> > "word shall be subjected to tilde expansion, parameter expansion,
>> > command substitution, arithmetic expansion, and quote removal."
>>
>> That only applies to unquoted parameter expansions.  Tilde expansion
>> and quote removal do not occur in double-quoted parameter expansions.
>
> They could have also just written that there, too, instead of requiring
> people to cross-link that with 4 chapters earlier, from where it also
> follows only indirectly. ^^

Feedback is welcome :)

https://www.opengroup.org/austin/


>> Section 2.2.3 should clarify this.
>>
>>         For parameter expansions other than the four varieties that
>>         provide for substring processing, within the string of
>>         characters from an enclosed "${" to the matching '}', the
>>         double-quotes within which the expansion occurs shall
>>         preserve the literal value of all characters, with the
>>         exception of the characters double-quote, backquote,
>>         <dollar-sign>, and <backslash>.
>
> Ah. I think this is what I was looking for.
> So that describes what happens if one uses the :- + ? = forms, right?
> And it basically says that " ` $ and \ preserve their special meaning,
> just the same as they would have when they directly appear within "..."
> with no parameter expansion, right?

Yes, exactly (with the addition of the "\}" escape).


>> This requires that, for example, the single quotes in "${foo-'bar'}"
>> be treated literally rather than as quoting characters.
>
> But again: all that only in the + - = ? cases - and not in # ## % %%.

Yes.  In the draft standard, this whole paragraph begins with "For
parameter expansions other than the four varieties that provide for
substring processing".


>>         If any unescaped double-quote characters occur within the
>>         string, other than in embedded command substitutions, the
>>         behavior is unspecified.
>>
>> This leaves the behavior of "${foo-"bar"}" up to implementations.
>
> Ah, that explains what I was wondering when replying to Koichi's mail,
> where I noticed the differing behavior between bash and dash:
> (non-backslash-escaped) " cannot portably be used in such a case.
>
> So any "${var-word}" where word contains " is generally a boo boo (if
> trying to be portable).

Right.


> But using \" would be ok.

Right, since the behavior of "...\"..." is defined.
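
A small sketch of the portable spelling, assuming a POSIX-style shell;
the variable names are hypothetical:

```shell
# "missing" is a made-up name for a variable that is deliberately unset.
unset -v missing

# Each \" inside the word yields a literal double-quote character,
# exactly as it would anywhere else inside "...".
greeting="${missing-say \"hi\"}"

printf '%s\n' "$greeting"
```

This should print say "hi", including the double quotes.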

If you feel like going off into the weeds:
https://www.gnu.org/software/autoconf/manual/autoconf-2.72/html_node/Shell-Substitutions.html#index-_0024_007bvar_002dvalue_007d


> I guess it would then also be possibly reasonable behavior (and
> allowed), if an implementation says that something like "${var-"foo"}"
> is a syntax error, cause it kind of splits the parameter expansion
> into three strings?

It would certainly be conformant, yes.  Whether it'd be "reasonable"
would be a judgment call, of course.


>>         The backquote and <dollar-sign> characters shall follow the
>>         same rules as for characters in double-quotes described in
>>         this section.
>>
>> This requires that command substitutions, parameter expansions, and
>> arithmetic expansions work as they usually do within double quotes.
>
> I assume that the above text, where it says that <dollar-sign> is
> handled specially does *not* include $', right?
> In other words, a $' ... ' within the word of a ${var-word} expansion,
> but only if that whole expansion itself is enclosed in " " would be
> taken literally and not as dollar-single-quotes quoting.
>
> Perhaps that should be added to the text in POSIX?

Earlier in Section 2.2.3, it already says (describing behavior
within double quotes):

        The <dollar-sign> shall retain its special meaning introducing
        parameter expansion (see Section 2.6.2), a form of command
        substitution (see Section 2.6.3), and arithmetic expansion
        (see Section 2.6.4), but shall not retain its special meaning
        introducing the dollar-single-quotes form of quoting (see
        Section 2.2.4).

So "[the] backquote and <dollar-sign> characters shall follow the
same rules as for characters in double-quotes" already precludes
"${foo-$'bar'}" from working because "...$'...'..." doesn't work
(where by "work" I mean "treat $'...' specially").
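
For contrast, the expansions that do keep their meaning inside the word
can be sketched like this, assuming a POSIX-style shell. The variable
names are invented, and the sketch deliberately leaves out $'...',
which some shells do not support at all:

```shell
# Hypothetical variables for illustration only.
unset -v missing
name=world

# Inside the double-quoted default-value form, $ and ` still introduce
# parameter expansion, both command substitution forms, and arithmetic
# expansion, just as they would directly inside "...".
result="${missing-$name `printf cmd` $(printf sub) $((6 * 7))}"

printf '%s\n' "$result"
```

This should print world cmd sub 42.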


> I'd have said this makes sense now:
> In the # ## % %% cases, the word is handled like its own independent
> word where quoting is performed, so there the ' ... ' would be single
> quotes quoting, and thus removed.
> But in the :- + ? = cases, the ' ' are, by the rules you cited,
> literal, and the $bar is of course expanded within the outer " ... ".
>
> So now, it seems to be perfectly reasonable?

Yeah.  One way to think about it is that in the - + ? = forms,
there isn't much of a need to treat the word differently from
anything else enclosed in double quotes.  But in the # ## % %%
forms, the word is used for pattern matching, so it wouldn't make
sense to treat it as if it were in double quotes.  That would
hamstring all the special matching characters.


> So long story short I would summarize as follows:
> 1) With # ## % %%
> - The word (RHS of # ## % %%) does its own unquoting/quote removal
> step, independent from any outer " " that surrounds the parameter
> expansion.
> - The various quoting styles within word work as normal, e.g. '$bar'
> will yield the literal $bar while "$bar" will cause bar to be expanded.
> - Any special character (for pattern matching notation, like * ? [ )
> that was directly quoted (like in \* or '*' or "*") loses its
> special meaning. Similarly, if such a character results from a
> double-quoted variable expansion like in wo_"$var"_rd, it also loses its
> special meaning, but not if it results from a bare unquoted variable
> expansion within the word part.
>
> 2) With :+ :- + - etc. forms:
> - The word part (the RHS of the :+ etc.) behaves as if it were
> directly within a double-quoted string (and no surrounding ${var:+
> }), with the exception that \ also escapes } as a literal character.
> Other than that, \ escapes " $ ` \ to their literal value. An unescaped
> (by \) $ introduces expansions/substitutions. The quoting characters '
> and $' are taken literally.
> - Within the word part (the RHS of the :+ etc), a " cannot portably be
> used unless preceded by an escaping backslash.
>
> Does that seem right?

Looks good to me.
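
Point 1 of that summary can be checked with a short sketch, assuming a
POSIX-style shell. The file name and variable names are hypothetical:

```shell
# Hypothetical values for illustration only.
file='report.txt.bak'
glob='.*'
dollar='$bar.log'

# Unquoted expansion inside the pattern: the * from $glob stays a
# wildcard, so the shortest suffix matching .* (".bak") is removed.
from_unquoted="${file%$glob}"

# Double-quoted expansion inside the pattern: the * is literal, the
# text ".*" is not a suffix of the value, and nothing is removed.
from_quoted="${file%"$glob"}"

# Single quotes inside the pattern quote as usual, so '$bar' is the
# literal text $bar and is stripped from the front.
from_single="${dollar#'$bar'}"

printf '%s\n' "$from_unquoted" "$from_quoted" "$from_single"
```

The three lines printed should be report.txt, report.txt.bak, and .log.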


> Thanks for your help guys :-)

No worries, happy to help.


-- 
vq


