Re: printf incompatibilities with POSIX, ksh93
From: Paul Eggert
Subject: Re: printf incompatibilities with POSIX, ksh93
Date: Tue, 30 Sep 2003 11:21:55 -0700
User-agent: Gnus/5.1002 (Gnus v5.10.2) Emacs/21.3 (usg-unix-v)
At Tue, 30 Sep 2003 11:33:50 -0400, Chet Ramey <chet@nike.ins.cwru.edu> writes:
> I don't know which version of ksh93 you tested this against, but the
> latest available version (ksh.2003-07-24) doesn't behave like you describe:
I tested ksh93 Version M-12/28/93d, which is installed as 'dtksh' on Solaris 9.
I've just been in email contact with the ksh maintainers David Korn
and Glenn Fowler. They confirm that the ksh93 behavior was changed
some time after version d. So I withdraw that part of the patch.
Sorry about the confusion.
Here is a revised patch:
* At most three octal digits are allowed in printf string octal escapes,
to conform to POSIX. Previously, Bash allowed four digits
if the first one was '0'.
* New escape sequences \" and \? are now recognized in printf strings,
for compatibility with the C standard and with ksh93.
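To illustrate the digit-count rule in the first item, here is a minimal standalone sketch. It is not the code from the patch: the helper name octal_escape and its is_b_arg flag are hypothetical stand-ins for tescape and its SAWC argument, and only the octal case is modeled.

```c
#include <assert.h>

/* Hypothetical helper, not Bash's tescape: decode one octal escape.
   S points just past the backslash, at the first octal digit.
   IS_B_ARG is nonzero when S comes from a %b argument and zero when
   it comes from the format string.  Store the resulting byte in *OUT
   and return the number of characters consumed from S.  */
static int
octal_escape (const char *s, int is_b_arg, unsigned char *out)
{
  const char *p = s;
  int value, extra;

  assert (*s >= '0' && *s <= '7');
  value = *p++ - '0';

  /* Two more digits are allowed in general; a third extra digit is
     allowed only for the \0ooo form inside a %b argument.  */
  extra = 2 + (value == 0 && is_b_arg);

  while (*p >= '0' && *p <= '7' && extra-- > 0)
    value = value * 8 + (*p++ - '0');

  *out = value & 0xFF;
  return (int) (p - s);
}
```

With the input "0101", this sketch consumes three characters for a format string (value 010 octal, leaving a literal 1) and four characters for a %b argument (value 0101 octal, the letter A in ASCII). The input "101" yields the letter A in both cases.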
===================================================================
RCS file: builtins/printf.def,v
retrieving revision 2.5.2.0
retrieving revision 2.5.2.2
diff -pu -r2.5.2.0 -r2.5.2.2
--- builtins/printf.def 2002/05/13 18:36:04 2.5.2.0
+++ builtins/printf.def 2003/09/30 17:35:06 2.5.2.2
@@ -30,7 +30,9 @@ characters, which are simply copied to s
sequences which are converted and copied to the standard output, and
format specifications, each of which causes printing of the next successive
argument. In addition to the standard printf(1) formats, %b means to
-expand backslash escape sequences in the corresponding argument, and %q
+expand backslash escape sequences in the corresponding argument (except
+that \c terminates output, backslashes in \', \", and \? are not removed,
+and octal escapes that start with \0 can have up to four digits), and %q
means to quote the argument in a way that can be reused as shell input.
$END
@@ -105,7 +107,7 @@ extern int errno;
static void printf_erange __P((char *));
static void printstr __P((char *, char *, int, int, int));
-static int tescape __P((char *, int, char *, int *));
+static int tescape __P((char *, char *, int *));
static char *bexpand __P((char *, int, int *, int *));
static char *mklong __P((char *, char *, size_t));
static int getchr __P((void));
@@ -186,9 +188,9 @@ printf_builtin (list)
if (*fmt == '\\')
{
fmt++;
- /* A NULL fourth argument to tescape means to not do special
- processing for \c. */
- fmt += tescape (fmt, 1, &nextch, (int *)NULL);
+ /* A NULL third argument to tescape causes it to bypass
+ the special processing for %b arguments. */
+ fmt += tescape (fmt, &nextch, (int *)NULL);
putchar (nextch);
fmt--; /* for loop will increment it for us again */
continue;
@@ -531,6 +533,7 @@ printstr (fmt, string, len, fieldwidth,
/* Convert STRING by expanding the escape sequences specified by the
POSIX standard for printf's `%b' format string. If SAWC is non-null,
+ do the processing appropriate for %b arguments. In particular,
recognize `\c' and use that as a string terminator. If we see \c, set
*SAWC to 1 before returning. LEN is the length of STRING. */
@@ -540,11 +543,11 @@ printstr (fmt, string, len, fieldwidth,
value. *SAWC is set to 1 if the escape sequence was \c, since that means
to short-circuit the rest of the processing. If SAWC is null, we don't
do the \c short-circuiting, and \c is treated as an unrecognized escape
- sequence. */
+ sequence; also we bypass the other processing that is needed only for
+ %b arguments. */
static int
-tescape (estart, trans_squote, cp, sawc)
+tescape (estart, cp, sawc)
char *estart;
- int trans_squote;
char *cp;
int *sawc;
{
@@ -576,19 +579,18 @@ tescape (estart, trans_squote, cp, sawc)
case 'v': *cp = '\v'; break;
- /* %b octal constants are `\0' followed by one, two, or three
- octal digits... */
- case '0':
- /* but, as an extension, the other echo-like octal escape
- sequences are supported as well. */
- case '1': case '2': case '3': case '4':
- case '5': case '6': case '7':
- for (temp = 2+(c=='0'), evalue = c - '0'; ISOCTAL (*p) && temp--; p++)
+ /* The octal escapes are \0 followed by up to 3 octal digits (if SAWC)
+ or \ followed by up to 3 octal digits (if !SAWC). As an extension,
+ we allow the latter form even if SAWC. */
+ case '0': case '1': case '2': case '3':
+ case '4': case '5': case '6': case '7':
+ evalue = OCTVALUE (c);
+ for (temp = 2 + (!evalue && !!sawc); ISOCTAL (*p) && temp--; p++)
evalue = (evalue * 8) + OCTVALUE (*p);
*cp = evalue & 0xFF;
break;
- /* And, as another extension, we allow \xNNN, where each N is a
+ /* And, as another extension, we allow \xHH, where each H is a
hex digit. */
case 'x':
for (temp = 2, evalue = 0; ISXDIGIT ((unsigned char)*p) && temp--; p++)
@@ -606,8 +608,9 @@ tescape (estart, trans_squote, cp, sawc)
*cp = c;
break;
- case '\'': /* TRANS_SQUOTE != 0 means \' -> ' */
- if (trans_squote)
+ /* !SAWC means \' -> ', and similarly for \" and \?. */
+ case '\'': case '"': case '?':
+ if (!sawc)
*cp = c;
else
{
@@ -657,7 +660,7 @@ bexpand (string, len, sawc, lenp)
continue;
}
temp = 0;
- s += tescape (s, 0, &c, &temp);
+ s += tescape (s, &c, &temp);
if (temp)
{
if (sawc)
===================================================================
RCS file: doc/bash.1,v
retrieving revision 2.5.2.0
retrieving revision 2.5.2.1
diff -pu -r2.5.2.0 -r2.5.2.1
--- doc/bash.1 2002/07/15 19:21:03 2.5.2.0
+++ doc/bash.1 2003/09/26 20:24:50 2.5.2.1
@@ -6939,7 +6939,10 @@ format specifications, each of which cau
\fIargument\fP.
In addition to the standard \fIprintf\fP(1) formats, \fB%b\fP causes
\fBprintf\fP to expand backslash escape sequences in the corresponding
-\fIargument\fP, and \fB%q\fP causes \fBprintf\fP to output the corresponding
+\fIargument\fP (except that \fB\ec\fP terminates output, backslashes
+in \fB\e'\fP, \fB\e"\fP, and \fB\e?\fP are not removed, and octal
+escapes that start with \fB\e0\fP can have up to four digits),
+and \fB%q\fP causes \fBprintf\fP to output the corresponding
\fIargument\fP in a format that can be reused as shell input.
.sp 1
The \fIformat\fP is reused as necessary to consume all of the \fIarguments\fP.
===================================================================
RCS file: doc/bashref.texi,v
retrieving revision 2.5.2.0
retrieving revision 2.5.2.1
diff -pu -r2.5.2.0 -r2.5.2.1
--- doc/bashref.texi 2002/07/15 19:21:24 2.5.2.0
+++ doc/bashref.texi 2003/09/26 20:24:50 2.5.2.1
@@ -3254,7 +3254,10 @@ format specifications, each of which cau
@var{argument}.
In addition to the standard @code{printf(1)} formats, @samp{%b} causes
@code{printf} to expand backslash escape sequences in the corresponding
-@var{argument}, and @samp{%q} causes @code{printf} to output the
+@var{argument} (except that @samp{\c} terminates output, backslashes
+in @samp{\'}, @samp{\"}, and @samp{\?} are not removed, and octal
+escapes that start with @samp{\0} can have up to four digits),
+and @samp{%q} causes @code{printf} to output the
corresponding @var{argument} in a format that can be reused as shell input.
The @var{format} is reused as necessary to consume all of the @var{arguments}.
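The different treatment of \', \", and \? in format strings and in %b arguments, as documented in the patch above, can be sketched roughly as follows. This is an illustration only: the function expand_quote_escapes and its is_b_arg flag are hypothetical, and the sketch handles these three escapes and nothing else.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper, not Bash's code: copy IN to OUT, handling only
   the \', \" and \? escapes.  IS_B_ARG is nonzero when IN is a %b
   argument and zero when it is the format string.  OUT must have room
   for strlen (IN) + 1 bytes.  */
static void
expand_quote_escapes (const char *in, int is_b_arg, char *out)
{
  assert (in != NULL && out != NULL);

  while (*in)
    {
      /* In a format string the backslash is dropped and the quoted
         character is kept.  In a %b argument the backslash is not
         removed, so the text is copied unchanged.  */
      if (!is_b_arg && in[0] == '\\' && in[1] != '\0'
          && strchr ("'\"?", in[1]) != NULL)
        in++;
      *out++ = *in++;
    }
  *out = '\0';
}
```

Under these assumptions, the input a\"b becomes a"b when treated as a format string and stays a\"b when treated as a %b argument.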