bug-gnu-utils

Re: gawk 3.1.2 bug


From: Stepan Kasal
Subject: Re: gawk 3.1.2 bug
Date: Wed, 23 Apr 2003 11:47:16 +0200
User-agent: Mutt/1.2.5.1i

Hello,

thank you for your bug report.

However, I'd say that neither of the two problems you describe
is a bug in gawk.  Let me explain:

On Fri, Apr 18, 2003 at 10:46:46AM -0400, Kast, Scott wrote:
>   istxt = sub("\.t.t$", "", rr[7])  # rip ".t?t" off
> gawk-3.1.2: cmd. line:11: warning: escape sequence `\.' treated as plain `.'

The first argument of sub() is a regular expression.  Thus it's better
to write it as /\.t.t$/ .

When you specify a string instead of a regexp, the string is first
processed as a string (escape sequences included) and only then converted
to a regexp.  Thus "\\.t.t$" is a _six_ character string equivalent to the
above regexp.  "\.t.t$" is the same string as ".t.t$" and does not yield
the desired regexp.

It's best not to use strings as regexp arguments if the regexp is constant,
i.e. if it's known at the time you write the script.
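
To illustrate the difference, here is a minimal sketch.  The function
name strip_demo and the file names are made up for the example; the
last call shows what the pattern without a backslash does, since that
is what "\.t.t$" ends up as in gawk.

```shell
# Hypothetical demo: three ways of handing the pattern to sub().
strip_demo() {
    awk 'BEGIN {
        a = "report.txt"; b = "report.txt"
        c = "report_txt"; d = "report_txt"
        sub(/\.t.t$/, "", a)     # regexp constant: matches a literal dot
        sub("\\.t.t$", "", b)    # string: "\\" turns into one backslash
        sub(/\.t.t$/, "", c)     # no literal dot in c, nothing is removed
        sub(".t.t$", "", d)      # unescaped dot matches any character
        print a, b, c, d
    }'
}
strip_demo
```

The first two calls behave the same, while the last one also strips
"_txt", which is probably not what was intended.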

So in this case, gawk-3.1.2 helped you to fix code which had different
semantics than the one you had in mind.

The other problem is more subtle:
>   "head " $0 " | grep \"<META REPORT-INFO=\"" | getline title
> sh: -c: line 1: syntax error near unexpected token `|'
> sh: -c: line 1: ` | grep "<META REPORT-INFO="'

First, the workaround: it's not necessary to use a variable: it's
enough to use parentheses:
     ("head " $0 " | grep \"<META REPORT-INFO=\"") | getline title

You meant it this way and the good old version understood your intention.
On the other hand, the command

     ret = 1 + "ls" | getline f

is probably meant as

     ret = 1 + ("ls" | getline f)

This is weird when you consider that normally string concatenation has
lower precedence than addition.

So it's not easy to distinguish the cases by means of a formal grammar.

That's why POSIX says that the result of evaluating the expression above is
unspecified and you should always use parentheses [1], as I have shown.
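
Both parenthesized forms can be tried with a small sketch like the one
below.  The function name getline_demo and the echo/tr commands are
made up for the example; they merely stand in for your head/grep
pipeline.

```shell
# Hypothetical demo: parenthesize everything to the left of "| getline".
getline_demo() {
    awk 'BEGIN {
        name = "world"
        # The whole concatenated string is the command that gets run.
        ("echo hello " name " | tr a-z A-Z") | getline line
        print line
        # Here the parentheses keep the addition outside the pipe;
        # getline returns 1 when it has read a line.
        ret = 1 + ("echo x" | getline f)
        print ret, f
    }'
}
getline_demo
```

With the parentheses in place, the parser has no choice to make, so the
code should behave the same in other awks as well.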

Backward compatibility:
OK, so gawk-3.1.2 is correct in the sense of the strict definition of the
awk language.  But why was any change made at all, breaking backward
compatibility?

These changes were part of an overall grammar cleanup which made the
grammar more comprehensive and fixed some subtle bugs (unusual cases
when gawk was not functioning properly).  And I think it's not easy
to get the above case right without breaking something else.

So we have to change our habits.  An added benefit is that the code
will be more portable to other awks.

Thank you again for your bug reports.  If you encounter other
problems, please let us know.

Footnote:
[1] POSIX definition of awk says:
"The getline operator can form ambiguous constructs when
there are unparenthesized operators (including concatenate) to
the left of the '|' (to the beginning of the expression
containing getline). [...] The result of evaluating [...] is
unspecified, and conforming applications shall parenthesize
properly all such usages."
See http://www.opengroup.org/onlinepubs/007904975/utilities/awk.html

Have a nice day,
        Stepan Kasal



