bug-bash

RFC: changing printf(1) behavior on %b


From: Eric Blake
Subject: RFC: changing printf(1) behavior on %b
Date: Thu, 31 Aug 2023 10:35:59 -0500
User-agent: NeoMutt/20230517

In today's Austin Group call, we discussed the fact that printf(1) has
mandated behavior for %b (escape sequence processing similar to XSI
echo) that will eventually conflict with C2x's desire to introduce %b
to printf(3) (to produce 0b000... binary literals).
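
For readers less familiar with the current behavior, here is a minimal
sketch of what %b does in printf(1) today, next to %s.  The variable
names are only for illustration.

```shell
# %s prints its argument verbatim; %b expands backslash escapes
# found in the argument itself (not just in the format string).
plain=$(printf '%s' 'a\tb')
expanded=$(printf '%b' 'a\tb')

printf '%s\n' "$plain"      # backslash and "t" are printed literally
printf '%s\n' "$expanded"   # "a", a tab character, then "b"
```

It is this argument-side escape processing that would need a new
spelling if %b is ever repurposed for binary output.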

For POSIX Issue 8, we plan to mark the current semantics of %b in
printf(1) as obsolescent (it would continue to work, because Issue 8
targets C17 where there is no conflict with C2x), but with a Future
Directions note that for Issue 9, we could remove %b entirely, or
(more likely) make %b output binary literals just like C.  But that
raises the question of whether the escape-sequence processing
semantics of %b should remain available in the standard under some
other spelling, since relying on XSI echo is still not portable.

One of the observations made in the meeting was that both the POSIX
spec for printf(1), as seen at [1], and the POSIX and C standards
(including the upcoming C2x standard) for printf(3), as seen at [3],
currently leave the behavior of the ' and # flag modifiers undefined
when applied to %s.

[1] https://pubs.opengroup.org/onlinepubs/9699919799/utilities/printf.html
"The format operand shall be used as the format string described in
XBD File Format Notation[2] with the following exceptions:..."

[2] https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap05.html#tag_05
"The flag characters and their meanings are: ...
# The value shall be converted to an alternative form. For c, d, i, u,
  and s conversion specifiers, the behavior is undefined.
[and no mention of ']"

[3] https://pubs.opengroup.org/onlinepubs/9699919799/functions/printf.html
"The flag characters and their meanings are:
' [CX] [Option Start] (The <apostrophe>.) The integer portion of the
  result of a decimal conversion ( %i, %d, %u, %f, %F, %g, or %G )
  shall be formatted with thousands' grouping characters. For other
  conversions the behavior is undefined. The non-monetary grouping
  character is used. [Option End]
...
# Specifies that the value is to be converted to an alternative
  form. For o conversion, it shall increase the precision, if and only
  if necessary, to force the first digit of the result to be a zero
  (if the value and precision are both 0, a single 0 is printed). For
  x or X conversion specifiers, a non-zero result shall have 0x (or
  0X) prefixed to it. For a, A, e, E, f, F, g, and G conversion
  specifiers, the result shall always contain a radix character, even
  if no digits follow the radix character. Without this flag, a radix
  character appears in the result of these conversions only if a digit
  follows it. For g and G conversion specifiers, trailing zeros shall
  not be removed from the result as they normally are. For other
  conversion specifiers, the behavior is undefined."
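
As a small illustration of the # flag on conversions where the quoted
text does define it, the following sketch uses the printf(1) utility
on the o and x conversions.  It assumes a printf implementation that
passes the flag through, as bash and coreutils do; the variable names
are illustrative.

```shell
# The # flag selects the "alternative form" for numeric conversions.
hex=$(printf '%#x' 255)        # non-zero result gets a 0x prefix
octal=$(printf '%#o' 8)        # precision raised to force a leading 0
plain_hex=$(printf '%x' 255)   # same value without the flag, for contrast

printf '%s %s %s\n' "$hex" "$octal" "$plain_hex"
```

Nothing comparable is specified for %#s, which is why that
combination is still free.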

Thus, it appears that both %#s and %'s are available for use for
future standardization.  Typing-wise, %#s as a synonym for %b is
probably going to be easier (less shell escaping needed).  Is there
any interest in a patch to coreutils or bash that would add such a
synonym, to make it easier to leave that functionality in place for
POSIX Issue 9 even when %b is repurposed to align with C2x?
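
Until any new spelling is settled, one way a script could limit its
exposure is to confine its use of %b to a single function.  This is
only a sketch: print_escaped is a hypothetical helper name, not part
of any standard or of bash or coreutils, and it relies on today's %b
semantics.

```shell
# Hypothetical helper: the only place in the script that depends on %b.
# If %b is later repurposed, only this body would need to change
# (for example to '%#s', should that spelling be standardized).
print_escaped() {
    printf '%b' "$1"
}

out=$(print_escaped 'one\ntwo')
printf '%s\n' "$out"   # "one" and "two" on separate lines
```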

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org



