Re: bug in latest awk release
From: Paul Eggert
Subject: Re: bug in latest awk release
Date: Thu, 15 Sep 2005 12:06:59 -0700
User-agent: Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux)
Stepan Kasal <address@hidden> writes:
> So in your example:
> FS="\t" {print $1}
> the "condition" is the assignment to FS. It is always true, so all
> lines are printed. But the FS is changed only when the condition is
> evaluated. This means that the first line is split according to
> the default FS.
Unfortunately that is a minor POSIX-conformance bug in gawk.
The POSIX spec for awk
<http://www.opengroup.org/onlinepubs/009695399/utilities/awk.html>
under EXTENDED DESCRIPTION states:
    Before the first reference to a field in the record is evaluated,
    the record shall be split into fields, according to the rules in
    Regular Expressions, using the value of FS that was current at the
    time the record was read. Each pattern in the program then shall
    be evaluated in the order of occurrence,...
Therefore, if the pattern changes FS, the input record must still be
split according to the previous FS. Since Gawk doesn't do this, it
doesn't conform to POSIX here.
I doubt whether this departure from Unix practice and from POSIX was
intended, so it's just a minor gawk bug.
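
For anyone who wants to try this on their own awk, here is a minimal reproduction sketch. The helper name repro, the AWK variable, and the two-line sample input are my own assumptions, not part of the original report. Reading the POSIX text quoted above, the first record ("a b<TAB>c") was read while FS still had its default value, so a conforming awk should split it on blanks and print "a" for it; an awk that applies the new FS to the current record would print "a b" instead. The second record is read after the assignment, so it should be split on the tab either way.

```shell
#!/bin/sh
# Hypothetical helper: feed two tab-separated records to the awk under test.
# AWK defaults to "awk"; set it to try another implementation (assumption:
# the named program is installed and accepts an assignment as a pattern).
repro() {
    printf 'a b\tc\nd e\tf\n' | "${AWK:-awk}" 'FS = "\t" { print $1 }'
}

# Print $1 of each record; compare the first output line across awks.
repro
```

To compare implementations, run it as, for example, AWK=gawk sh repro.sh and then again with another awk, and look at the first output line.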
Incidentally, I tested this example with Solaris 10 /bin/awk, Solaris
10 /bin/nawk, Solaris 10 /usr/xpg4/bin/awk, and Gawk 3.1.5. Only
/bin/nawk conformed to POSIX. So, even though the script conforms to
POSIX, I wouldn't use constructs like that in awk scripts that are
intended to be portable.
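
One portable alternative, sketched here under my own assumptions (the helper name first_field and the sample input are illustrative only), is to assign FS in a BEGIN action. BEGIN runs before the first record is read, so the question of which FS applies to the current record never comes up.

```shell
#!/bin/sh
# Hypothetical helper: print the first tab-separated field of each record.
# FS is set in BEGIN, before any record is read, so every record
# (including the first) is split on tabs.
first_field() {
    awk 'BEGIN { FS = "\t" } { print $1 }'
}

# Sample input: two records, each with a blank inside the first field.
printf 'a b\tc\nd e\tf\n' | first_field
```

With this form the first field of both sample records keeps its embedded blank, because neither record is ever split with the default FS.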