[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: CVS autoconf testsuite failure on Tru64 unix
From: |
Noah Misch |
Subject: |
Re: CVS autoconf testsuite failure on Tru64 unix |
Date: |
Wed, 15 Dec 2004 02:19:48 -0800 |
User-agent: |
Mutt/1.5.6i |
On Wed, Dec 15, 2004 at 12:00:56AM -0800, Paul Eggert wrote:
> Noah Misch <address@hidden> writes:
> Thanks for looking into this, but I have to admit that I'm wayyy lost.
> I don't have a clue as to what's going on here. I vaguely understand
> that you're trying to predict how a shell command will behave before
> you actually execute it, but I don't know why this is important.
I'll try to elaborate. My apologies for the impending overdesign splurge.
When I discovered that `./testsuite -x' caused tests to fail spuriously (19 of
the ~190 tests that existed at the time), I wanted to fix that. The problem was
test command sequences with individual commands spanning multiple lines:
$ echo 'foo
bar'
=> stdout
foo
bar
=> stderr
+ foo
bar
Most tests examine their stderr, if only to confirm that it is empty. Under
`-x', the test suite filters ^+ from the stderr before comparing with the stderr
the command expects. This technique fails with commands like the one above
since only the first line of the trace has a leading +. (Of course, a test
command that looks for ^+ in its stderr will also fail under `-x'; that bothers
me less because (a) it's easy to guess what's happening and (b) no test in the
Autoconf test suite does this.)
The task at hand was therefore to determine which commands may produce traces
spanning multiple lines and which definitely will not do so. It was the goal of
my patch from October that touched this region to disable shell tracing for the
former class of commands.
Parameters that expand to add a newline make the shell trace span multiple lines
as readily as their literal counterparts, as do quoted command substitutions.
More about command substitutions below. I could have disabled shell tracing for
all tests containing parameter expansions, but that would have affected about
10% of the Autoconf test suite. I thought I could do better by expanding those
parameter expansions and looking for newlines in the expanded commands text.
Now wait! Only two Autoconf tests use constructs that look like `${...}', and
only one of those actually expands the parameter; the other is single-quoted and
yields the syntax error that began this discussion. It would not be harmful to
disable shell tracing for all such tests. Yes, that will be _so_ much cleaner.
I'll post a patch separately. Thanks for not applying this one :-)
I already wrote the rest of this message, and it may be interesting, so I will
not delete it. If you just like the just-described plan, feel free to exec
&>/dev/null and move on.
> That being said, is it really true that you can predict whether a
> shell expansion will contain a newline, without possibly introducing
> harmful side effects? For example, how would you know what the
> following will expand to, without executing the "rm"?
>
> y=${x-`rm foo; echo bar`}
No, one cannot predict in the presence of command substitutions. As such, the
code summarily disables shell tracing for commands that contain them.
Even so, the code does expand the AS_ESCAPE_FOR_EXPAND of that command; there
is strictly no need to do so, but it simplified the code slightly. Since
AS_ESCAPE_FOR_EXPAND escapes the backticks, this is the relevant line:
at_cmd_expanded="y=${x-\`rm foo; echo bar\`}"
> > +# Escape all `"', ``', and `\' in STRING. Possibly escape some `$', but
> > do not
> > +# escape any parameter expansion that the shell could expand when it
> > interprets
> > +# STRING. `foo="AS_ESCAPE_FOR_EXPAND(STRING)"' shall be a valid POSIX shell
> > +# command. If STRING is not a valid shell command, the result is
> > unspecified.
>
> Can this comment be worded as follows?
>
> If STRING is a valid shell command, then
> `foo="AS_ESCAPE_FOR_EXPAND(STRING)"' is a valid shell command that has
> no side effect other than assigning to "foo".
A simple AS_ESCAPE(STRING) meets that contract. AS_ESCAPE_FOR_EXPAND(STRING)
must not escape any parameter expansion that the shell might expand while
interpreting STRING, lest the expansion of one of them contain a newline.
That is not to say that this comment could not be clearer.
> > +# This implementation leaves valid parameter expansions active
> > +# regardless of whether backslashes or single-quotes will keep the
> > +# shell from ever expanding them. It escapes all invalid parameter
> > +# expansions; if the shell were to ever expand those, STRING would
> > +# also be invalid.
>
> But this makes it sound like STRING can be any string, and that STRING
> need not be a valid shell command. So I guess I'm still lost.
The `clean' way to implement the specification of this macro is to parse the
commands like the shell does, tracking the quotation state and escaping all
escaped and single-quoted parameter expansions and all other characters active
within double-quotes. I could not implement that in M4 in less than 150 lines,
plus comments. I implemented it in shell in 50 source lines. If we add that
code to every AT_CHECK invocation, that will add over 50000 lines to the
Autoconf test suite! If we call it at `make testsuite' time through m4_esyscmd,
the time to generate `testsuite' grows by 50%. I did not want to penalize the
non-`testsuite -x'-user majority that dramatically.
For simplicity, this implementation cheats. It escapes all characters that are
not literal within double-quotes, including all `$'. Then, it unescapes all
character sequences that form valid parameter expansions. This ``incorrectly''
unescapes parameters in some cases:
AS_ESCAPE_FOR_EXPAND('$x') => $x
AS_ESCAPE_FOR_EXPAND(\$x) => \\$x
(Note that the existing code escaped the second example ``correctly'', but not
the first example.)
Fortunately, the incorrectness does not matter for the purpose of testing
whether any parameter expansions yields multiple lines. In the above examples,
the excess parameter expansion will yield the empty string or some stale value.
Algorithmically, this implementation does not care if STRING is valid shell
code, but requiring that allows the macro to guarantee that it will not escape a
parameter expansion the shell will interpret. It assumes that invalid parameter
expansions are somehow escaped or quoted, and therefore it escapes them; if that
assumption is wrong, STRING must be invalid shell code and the result is
undefined anyway.
Your comments do lead me to notice of a bug; even though we will not generally
use them, this macro should escape `$(...)' command substitutions. That's a
one-character fix.
> > +# AS_RE_PARAMETER
> > +# ---------------
> > +# The definition of AS_RE_PARAMETER is a regular expression that only
> > matches
> > +# every valid POSIX shell parameter.
> > +m4_define([AS_RE_PARAMETER],
> > +[\([A-Za-z_][A-Za-z0-9_]*\|[0-9]+\|address@hidden)])
>
> "that only matches every valid POSIX shell parameters"?
> Is that equivalent to "that matches valid POSIX shell parameters"?
That sounds like it could match things that are not valid parameters.
Nonetheless, maybe simple clarity is more important than rigor for comments
preceding undocumented macros.
> That "?-*" looks suspicious. * precedes ? in the ASCII encoding.
> Presumably you meant address@hidden Also (to be pedantic)
Oh, yes.
> A-Z and a-z aren't portable to EBCDIC; you have to spell
> them out, e.g., with m4_cr_letters.
I saw that and wondered the reason. Thanks.
> > +# The definition of AS_RE_PARAMETER_EXP is a regular expression
> > +# that matches many valid POSIX shell parameter expansions and does
> > +# not match any invalid POSIX shell parameter expansions. This
> > +# implementation matches only `${...}' expansions, because only
> > +# they can form grammatically invalid shell.
>
> Is it accurate to say that it matches any shell parameter
> expansion that begins with "${"? If so, that would be a shorter
> way to say it.
I was trying to leave open the possibility that a future implementation would
match all parameter expansions. That's hogwash, though.
Thanks for scrutinizing this change.