Re: Possible solution for special characters in makefile paths


From: Paul Smith
Subject: Re: Possible solution for special characters in makefile paths
Date: Fri, 21 Feb 2014 21:32:32 -0500

Thanks for your thoughts Frank!

On Sat, 2014-02-22 at 01:23 +0100, Frank Heckenbach wrote:
> Some random comments:
> 
> > The advantages to this are (a) there is no change to the length of the
> > string so the encoding can be performed in-place, and computing the size
> > of an output buffer is trivial (it's the same size), and (b) there is no
> > change needed to any existing tokenization in make, which is scanning
> > for whitespace, parenthesis, braces, nul bytes, etc.: it will all
> > continue to work with no changes.
> 
> An alternative, of course, would be a multi-character encoding, with
> the advantage that all characters can be encoded uniquely (same
> principle as UTF-8).

I'm not sure I follow.  The reason I suggested those characters for the
mapping characters is that most variable-length locale encodings
preserve the ASCII values for the first 128 characters (0-127), and
there are values in that range which are not used, or hardly ever used
(for example, not part of the Portable Character Set defined by POSIX),
in a text file such as a makefile or in pathnames.

Once you get beyond the ASCII range it seems to me that the
similarities between encodings drop off precipitously.

If we decide to translate SPC (for example) to some other value, even a
multi-character value, then we have to either find a value which has no
valid meaning in any encoding, or else determine what locale we are
using and choose values that are invalid in that locale in particular.

Neither of these sounds straightforward to me.

But maybe I misunderstood your suggestion?
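
For concreteness, here is a minimal sketch of the single-byte mapping
idea.  It is in Python rather than make's C, and the choice of \001 for
SPC, along with every name in it, is an assumption for illustration
only; nothing in this thread fixes those details.

```python
# Hypothetical mapping table: each special character maps to exactly one
# otherwise-unused control character, so the string length never changes.
ENCODE_MAP = {ord(" "): "\x01"}  # assumed choice, not decided anywhere
DECODE_MAP = {ord(v): chr(k) for k, v in ENCODE_MAP.items()}
MAPPING_CHARS = frozenset(ENCODE_MAP.values())


def check_no_mapping_chars(text):
    """Raise if the input already uses a character reserved for mapping."""
    for ch in text:
        if ch in MAPPING_CHARS:
            raise ValueError("input contains reserved mapping character %r" % ch)


def encode_path(path):
    """Replace special characters one-for-one; output length == input length."""
    check_no_mapping_chars(path)
    return path.translate(ENCODE_MAP)


def decode_path(encoded):
    """Restore the original characters, e.g. just before running a recipe."""
    return encoded.translate(DECODE_MAP)
```

Because the encoded name contains no whitespace, a tokenizer that splits
on whitespace sees it as one word, and decoding restores the original
name.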

> > We could make an attempt to fix them by modifying
> > the built-in rules to use something like:
> >
> > %.o : %.c
> >     $(COMPILE.c) '$<' -o '$@'
> >
> > Of course this fails if a target or prerequisite contains single quotes.
> 
> I should mention, as you're surely aware, that this problem already
> exists WRT some other (non-whitespace) special characters, e.g.:

Unquestionably true.  I expect that many people writing more complex
makefiles define their own rules, perhaps partly for this reason.

> > I have no 100% solution to these problems, other than the hope that
> > paths containing single-quotes are far less abundant than paths
> > containing whitespace (for example).
> 
> That's most likely the case. However, the problem with this kind of
> argument is that as soon as the common case (whitespace) is handled,
> some will, consciously or not, simply assume "special characters
> work now" and get sloppy and not properly test such cases and sooner
> or later produce some actually dangerous code ...

Perhaps, but seriously... I have no solution for this problem :-).

No matter what we choose to do to help users it all goes straight out
the door the instant they write their own recipe, or change SHELL.

The only thing I can think of is trying to introduce some sort of
equivalent of Perl's exec() function or similar, but this would be so
limited compared to what people typically want to do in a recipe I'm not
sure it's worthwhile.
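
To illustrate what such a facility would buy: the list form of Perl's
exec()/system() hands the arguments straight to the program without a
shell, so spaces and quotes in a filename need no escaping.  A minimal
sketch of the same idea in Python follows; it is not something make
provides, and run_direct is a hypothetical name.

```python
import subprocess
import sys


def run_direct(argv):
    """Run argv[0] with the given arguments, bypassing the shell entirely."""
    # Passing a list and leaving shell=False (the default) means no shell
    # ever parses the arguments, so no quoting is required.
    return subprocess.run(argv, capture_output=True, text=True, check=True)


if __name__ == "__main__":
    # A "filename" containing a space and a single quote survives intact.
    tricky = "it's a file.c"
    result = run_direct([sys.executable, "-c",
                         "import sys; print(sys.argv[1])", tricky])
    print(result.stdout, end="")
```

The limitation is visible here too: with no shell there are no pipes,
redirections, or command sequences, which is most of what people put in
recipes.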

Or people could start writing their recipes in Guile :-).

> > 1. Any makefile that uses one of the chosen mapping characters will
> >    fail.  We can detect this during makefile parsing and throw an
> >    error, so this will not be a silent problem.
> 
> For my progress messages (cf.
> http://lists.gnu.org/archive/html/bug-make/2013-04/msg00060.html), I
> do indeed use \001 as an indicator (for basically the same reasons
> you suggest to use it for encoding). It wouldn't be a major problem
> for me to switch to some other indicator, so that's just to say that
> based on my sample of size 1, such things do occur, portable or not.

Well, with a userbase the size of GNU make's, no change has zero impact.
However, it is a bit troubling that we've already found a counter-case.

> This indicates that environment variables containing the mapping
> characters need to cause failure as well in order to maintain strict
> backward-compatibility.

Yes, excellent point.  When we import variables from the environment
we'd need to check them as well.
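
A rough sketch of that import-time check, in Python for brevity.  The
reserved character \001 and the function name are assumptions made up
for this example; make itself would do this in C while importing the
environment.

```python
# Assumed set of reserved mapping characters (illustrative only).
MAPPING_CHARS = frozenset("\x01")


def find_offending_env_vars(environ):
    """Return the sorted names of variables whose name or value contains
    a reserved mapping character."""
    return sorted(
        name
        for name, value in environ.items()
        if any(ch in MAPPING_CHARS for ch in name + value)
    )
```

A caller could pass dict(os.environ) and fail with an error naming the
offending variables if the returned list is non-empty.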

> > 2. Any makefile using the chosen "quoting" token will break; i.e.
> >    if some makefile today has "[ = foo" then uses "$[" later, and
> >    we choose $[...] for quoting, this will fail.  It would have to
> >    be changed to use "$([)" instead.  Same for ` if we choose
> >    $`...` etc.
> 
> FWIW, I grepped my Makefiles for "$[" and the only occurrences I
> found were bash arithmetic expansions like this:

What about $` ?

> SHELL=/bin/bash
> foo:
>       n=3; echo $$[$$n + 1]
> 
> I suppose that's harmless, since the "$" here is escaped and
> wouldn't be parsed as part of "$[". Still, it would be worthwhile to
> add a test of this kind if you implement this ...

Yes, this is harmless.  I'm not sure what you mean by "add a test of
this kind"...?  You mean, add something to the regression test suite?
Or add a test while reading makefiles?

The way we'd test this would be to watch variable assignments and check
if the variable name was the new quoting token.

Whether we do this or not depends, to me, on whether we want to allow
users to use that variable name with the caveat that they must use
parentheses or braces around any references.  For example, this could be
considered OK because it's not ambiguous:

    [ = foo
    X := ${[}

While this would not be correct (it would probably give an error about
a malformed quoting sequence or something):

    [ = foo
    X := $[

If we want to allow this form, then I wouldn't want to warn on the "[ =
foo" line, at least by default.  Maybe if a warnings flag were enabled.
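
To make the distinction concrete, here is a small sketch (Python, for
brevity) of telling a bare "$[" apart from the unambiguous and escaped
forms.  It assumes "[" is the chosen quoting token, which has not been
decided, and the function name is hypothetical.

```python
QUOTE_OPEN = "["  # assumed quoting token; "$`" is the other candidate


def find_bare_quote_refs(text):
    """Return the index of each '$' that directly starts '$[' and is not
    itself escaped as '$$'."""
    hits = []
    i = 0
    while i < len(text):
        if text[i] == "$" and i + 1 < len(text):
            if text[i + 1] == "$":
                i += 2  # '$$' is an escaped dollar sign; skip both
                continue
            if text[i + 1] == QUOTE_OPEN:
                hits.append(i)
        i += 1
    return hits
```

With this, "X := ${[}" and the recipe line "echo $$[$$n + 1]" produce no
hits, while "X := $[" is flagged.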



