bug-gnu-utils
awk portability notes


From: Ralf Wildenhues
Subject: awk portability notes
Date: Sun, 03 Dec 2006 10:04:51 +0100

The Autoconf change to use portable awk in config.status made us aware
of a number of portability issues that could be documented in gawk.texi;
I have not found them there yet.
1) Solaris awk does not support the syntax `if (index in array)', but
only `for (index in array)'.  SVID describes this feature; does that
imply that SVR3.1 (or SVR4) awk had it? How can I find out for sure?
2) Solaris awk does not support regexps as the value of `FS', which is
documented in the V7/SVR3.1 node.  However, it may be useful to know
that this awk accepts a string value for `FS', of which only the first
character is used.
3) `$0' is not assignable in Solaris awk, and `$ 0' is not the same as
`$0' for it; see this message for more information:
http://lists.gnu.org/archive/html/autoconf-patches/2006-11/msg00048.html
4) `next' is defined by POSIX; the color coding in the awkcard seems to
imply otherwise.
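
For items 1 and 2, a minimal sketch of possible workarounds follows.  It
assumes that `$AWK' names the awk under test (as in the Autoconf examples
quoted below); the helper names `has_index' and `third_field' are
hypothetical and exist only for illustration.  The membership test relies
solely on `for (index in array)', and the field example relies solely on
a one-character `FS'.  I have not been able to run this on anything older
than Solaris awk, so treat it as a sketch rather than a verified recipe.

```shell
AWK=${AWK-awk}   # assumption: the awk under test; defaults to `awk'

# has_index KEY WORDS (hypothetical helper): print "yes" if KEY is an
# index of the array built from the space-separated WORDS, else "no".
# Only `for (k in array)' is used, never `if (k in array)'.
has_index() {
  printf '%s\n%s\n' "$1" "$2" | $AWK '
    NR == 1 { key = $0; next }
    {
      for (i = 1; i <= NF; i++)
        seen[$i] = 1
      found = "no"
      for (k in seen)
        if (k == key)
          found = "yes"
      print found
    }'
}

# third_field LINE (hypothetical helper): print the third colon-separated
# field of LINE, using a single-character field separator only.
third_field() {
  printf '%s\n' "$1" | $AWK -F: '{ print $3 }'
}

has_index b 'a b c'        # prints "yes"
third_field 'root:x:0:0'   # prints "0"
```

The loop costs a full scan of the array per lookup, so it is only
reasonable for small arrays.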

Further, the Autoconf manual currently lists a number of issues to be
expected with some awk implementations, not all of which are listed in
the gawk manual.  For reference, here are the items for which I did not
find equivalent information in gawk.texi.  Would you be interested in
listing them as well, and if so, where?
|   Some Awk implementations, such as HP-UX 11.0's native one,
|   mishandle anchors:
|
|        $ echo xfoo | $AWK '/foo|^bar/ { print }'
|        $ echo bar | $AWK '/foo|^bar/ { print }'
|        bar
|        $ echo xfoo | $AWK '/^bar|foo/ { print }'
|        xfoo
|        $ echo bar | $AWK '/^bar|foo/ { print }'
|        bar
|
|   Either do not depend on such patterns (i.e., use `/^(.*foo|bar)/'),
|   or use a simple test to reject such implementations.
|
|   AIX version 5.2 has an arbitrary limit of 399 on the length of
|   regular expressions and literal strings in an Awk program.
|
|   Traditional Awk has a limit of 99 fields in a record.  You may be
|   able to circumvent this problem by using `split'.
|
|   Traditional Awk has a limit of at most 99 bytes in a number
|   formatted by `OFMT'; for example, `OFMT="%.300e"; print 0.1;'
|   typically dumps core.
|
|   The original version of Awk had a limit of at most 99 bytes per
|   `split' field, 99 bytes per `substr' substring, and 99 bytes per
|   run of non-special characters in a `printf' format, but these bugs
|   have been fixed on all practical hosts that we know of.
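
The "simple test" and the `split' workaround mentioned in the quoted text
could be sketched as follows.  The function names `awk_anchors_ok' and
`last_field' are hypothetical, and `$AWK' is assumed to name the awk under
test.  Whether `split' really avoids the 99-field limit on a given
traditional awk is something I have not verified.

```shell
AWK=${AWK-awk}   # assumption: the awk under test; defaults to `awk'

# awk_anchors_ok (hypothetical helper): succeed only if $AWK matches
# `xfoo' against /foo|^bar/, the case the quoted HP-UX example shows
# producing no output.
awk_anchors_ok() {
  test "$(echo xfoo | $AWK '/foo|^bar/ { print }')" = xfoo
}

# last_field LINE (hypothetical helper): print the last
# whitespace-separated word of LINE by splitting $0 into an array
# instead of referring to a numbered field.
last_field() {
  printf '%s\n' "$1" | $AWK '{ n = split($0, f); print f[n] }'
}

if awk_anchors_ok; then
  echo "anchors ok"
else
  echo "anchors mishandled; reject this awk"
fi
last_field 'a b c'   # prints "c"
```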

Here's a suggested patch for the list above.  I don't have access to old
awks other than the Solaris one, so corrections are very welcome.

Cheers,
Ralf

doc/ChangeLog:
2006-12-03  Ralf Wildenhues  <address@hidden>

        * awkcard.in: next is POSIX.
        * gawk.texi: V7/SVR3.1: Mention assignable `$0', `index in array'
        as expression.  Specify `FS' limitation.

Index: doc/awkcard.in
===================================================================
RCS file: /sources/gawk/gawk-stable/doc/awkcard.in,v
retrieving revision 1.1.1.1
diff -u -r1.1.1.1 awkcard.in
--- doc/awkcard.in      11 Aug 2006 12:05:48 -0000      1.1.1.1
+++ doc/awkcard.in      3 Dec 2006 08:52:54 -0000
@@ -1168,7 +1168,7 @@
.br
co-process pipe into \*(FCgetline\*(FR; set \*(FIv\*(FR.
.ti -.2i
-\*(FCnext\fP
+\*(CD\*(FCnext\fP
.br
stop processing the current input
record. Read next input record and
Index: doc/gawk.texi
===================================================================
RCS file: /sources/gawk/gawk-stable/doc/gawk.texi,v
retrieving revision 1.5
diff -u -r1.5 gawk.texi
--- doc/gawk.texi       15 Sep 2006 13:49:28 -0000      1.5
+++ doc/gawk.texi       3 Dec 2006 08:53:18 -0000
@@ -22776,10 +22776,17 @@
and @code{SUBSEP} built-in variables (@pxref{Built-in Variables}).
@item
+Assignable @code{$0}.
+
+@item
The conditional expression using the ternary operator @samp{?:}
(@pxref{Conditional Exp}).
@item
+The expression @samp{@var{index} in @var{array}} outside of @samp{for}
+statements (@pxref{Reference to Elements}).
+
+@item
The exponentiation operator @samp{^}
(@pxref{Arithmetic Ops}) and its assignment operator
form @samp{^=} (@pxref{Assignment Ops}).
@@ -22792,7 +22799,8 @@
Regexps as the value of @code{FS}
(@pxref{Field Separators}) and as the
third argument to the @code{split} function
-(@pxref{String Functions}).
+(@pxref{String Functions}), rather than using only the first character
+of @code{FS}.
@item
Dynamic regexps as operands of the @samp{~} and @samp{!~} operators



