
Re: RFC: changing printf(1) behavior on %b


From: Martin D Kealey
Subject: Re: RFC: changing printf(1) behavior on %b
Date: Sat, 2 Sep 2023 01:28:55 +1000

<devils_advocate>If compatibility with C is really that important,
shouldn't we be fixing %c? Its current behaviour as a synonym for %.1s
doesn't provide significant utility, and it arguably differs from C, where
%c means "take an int and output the corresponding single byte", not "take
the first byte of a string and output that".
</devils_advocate>
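
To illustrate the %c point, here is a minimal sketch, assuming a POSIX-style
shell whose printf behaves like current bash or dash. The variable names are
only for illustration, and the octal-escape line is a common workaround
rather than anything proposed in this thread.

```shell
# In printf(1), %c takes a string operand and prints its first byte,
# which is the same result as %.1s.
c_out=$(printf '%c' 'ABC')
s_out=$(printf '%.1s' 'ABC')
printf '%s %s\n' "$c_out" "$s_out"

# In C, printf("%c", 65) prints "A". The shell utility treats 65 as the
# string "65" and prints its first byte instead.
num_out=$(printf '%c' 65)
printf '%s\n' "$num_out"

# Getting the C-like result today means building an octal escape by hand.
byte_out=$(printf "\\$(printf '%03o' 65)")
printf '%s\n' "$byte_out"
```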

Whilst I wouldn't object to adding %#s (or %#b for that matter), I'm
uncomfortable about changing existing behaviour, especially when it's just
for the sake of linguistic simplicity in the standard.

Plenty of projects have functions that accept a format string and pass it
through to printf (sometimes with names like warnf, errorf, panicf); it
would be non-trivial to locate indirect format string parameters. An
estimate of "a few years" is WAY short of the timeframe needed to weed out
old usage; embedded devices typically run the same version of bash from the
time they leave the factory until they reach the scrap disassembly plant
(or landfill) a decade or more later.
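
As a sketch of why indirect format strings are hard to find: the wrapper
warnf and the variable detail_fmt below are hypothetical, chosen only to
show the shape of the problem. The %b lives in a variable that may be
defined far from the wrapper, and the only printf call in sight contains no
%b at all.

```shell
# Hypothetical wrapper: the format string arrives as the first argument
# and is passed through to printf unchanged.
warnf() {
    fmt=$1
    shift
    printf "warning: $fmt\n" "$@" >&2
}

# The %b sits in a variable, possibly set in some other file entirely.
detail_fmt='%s: %b'

# Nothing on this line, or in warnf itself, mentions %b.
warnf "$detail_fmt" 'backup' 'disk full\ttry again'
```

A search for printf calls that use %b would miss this call site, and a
search for %b alone would match the variable assignment without showing
that it ends up as a printf format.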

One of the benefits of printf over echo is that there aren't two mutually
incompatible ways of interpreting the data; this would take us back to the
bad old days of having to dynamically select the format string depending on
which version of the Shell the script is running under.
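
A rough sketch of the kind of run-time selection this would force on
portable scripts. It assumes %#s is eventually adopted as the replacement
spelling, which is only a proposal in this thread; esc_fmt is a hypothetical
variable name.

```shell
# Probe whether %b still performs escape processing in this printf.
# '\0101' is the %b spelling of octal 101, which is the letter A.
if [ "$(printf '%b' '\0101' 2>/dev/null)" = "A" ]; then
    esc_fmt='%b'
else
    # Assumption: the proposed (not standardized) replacement spelling.
    esc_fmt='%#s'
fi

# Every caller then has to go through the selected format.
printf "$esc_fmt\n" 'one\ttwo'
```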

Please no.

-Martin

On Fri, 1 Sept 2023 at 01:35, Eric Blake <eblake@redhat.com> wrote:

> In today's Austin Group call, we discussed the fact that printf(1) has
> mandated behavior for %b (escape sequence processing similar to XSI
> echo) that will eventually conflict with C2x's desire to introduce %b
> to printf(3) (to produce 0b000... binary literals).
>
> For POSIX Issue 8, we plan to mark the current semantics of %b in
> printf(1) as obsolescent (it would continue to work, because Issue 8
> targets C17 where there is no conflict with C2x), but with a Future
> Directions note that for Issue 9, we could remove %b entirely, or
> (more likely) make %b output binary literals just like C.  But that
> raises the question of whether the escape-sequence processing
> semantics of %b should still remain available under the standard,
> under some other spelling, since relying on XSI echo is still not
> portable.
>
> One of the observations made in the meeting was that currently, both
> the POSIX spec for printf(1) as seen at [1], and the POSIX and C
> standard (including the upcoming C2x standard) for printf(3) as seen
> at [3] state that both the ' and # flag modifiers are currently
> undefined when applied to %s.
>
> [1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
> "The format operand shall be used as the format string described in
> XBD File Format Notation[2] with the following exceptions:..."
>
> [2]
> https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
> "The flag characters and their meanings are: ...
> # The value shall be converted to an alternative form. For c, d, i, u,
>   and s conversion specifiers, the behavior is undefined.
> [and no mention of ']"
>
> [3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html
> "The flag characters and their meanings are:
> ' [CX] [Option Start] (The <apostrophe>.) The integer portion of the
>   result of a decimal conversion ( %i, %d, %u, %f, %F, %g, or %G )
>   shall be formatted with thousands' grouping characters. For other
>   conversions the behavior is undefined. The non-monetary grouping
>   character is used. [Option End]
> ...
> # Specifies that the value is to be converted to an alternative
>   form. For o conversion, it shall increase the precision, if and only
>   if necessary, to force the first digit of the result to be a zero
>   (if the value and precision are both 0, a single 0 is printed). For
>   x or X conversion specifiers, a non-zero result shall have 0x (or
>   0X) prefixed to it. For a, A, e, E, f, F, g, and G conversion
>   specifiers, the result shall always contain a radix character, even
>   if no digits follow the radix character. Without this flag, a radix
>   character appears in the result of these conversions only if a digit
>   follows it. For g and G conversion specifiers, trailing zeros shall
>   not be removed from the result as they normally are. For other
>   conversion specifiers, the behavior is undefined."
>
> Thus, it appears that both %#s and %'s are available for use for
> future standardization.  Typing-wise, %#s as a synonym for %b is
> probably going to be easier (less shell escaping needed).  Is there
> any interest in a patch to coreutils or bash that would add such a
> synonym, to make it easier to leave that functionality in place for
> POSIX Issue 9 even when %b is repurposed to align with C2x?
>
> --
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.
> Virtualization:  qemu.org | libguestfs.org
>
>
>

