make-alpha

From: Frank Heckenbach
Subject: Re: Output quoting (was: Re: Possible solution for special characters in makefile paths)
Date: Thu, 10 Apr 2014 12:56:19 +0200

Paul Smith wrote:

> I must once again apologize for the lengthy silence.  I realize it's
> difficult to jump back into these threads after a long time away and
> reacquire the knowledge that one lost in the meantime.  All I can say is
> that this spring I have had even less free time than expected.

I'm also quite busy at the moment, so I'll just reply to a few points.

> When sending the results to the operating system (fopen etc.) we don't
> need the quoting step: we just decode.  What's not clear is how to treat
> strings that contain multiple words separated by un-encoded whitespace:
> is the entire string considered a single word because it's used in a
> "single word context", and distinctions between encoded and un-encoded
> whitespace ignored?  Or does make throw an error here?  For example if
> you have this (using backslash escaping in this example):
> 
>   FILENAME = foo\ bar biz\ baz
> 
>   $(file >$(FILENAME),hi)
> 
> Does this create a file 'foo bar biz baz'?  Or give an error because two
> words were provided where one was expected?  Or something else?

I'd say an error. We should be clear about what a string means: if
a given string (like the above, or however else it will be written)
is meant to contain multiple words and is used in a single-word
context, that's simply wrong and we should just tell the user so.
Trying to interpret things differently depending on context is
partly what got make into this predicament (WRT backslash handling),
and it can be quite confusing, e.g.:

  $(FILENAME): ; $(file >$(FILENAME),hi)

Should this define rules for two targets ("foo bar" and "biz baz")
which actually create a totally different file ("foo bar biz baz")?
No, better to just error out.
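
As a rough illustration of the "just error out" behaviour, here is a
minimal POSIX sh sketch. It assumes backslash escaping as in the
example above; the helper names count_words and require_single_word
are made up for illustration and are not part of make.

```shell
# Hypothetical helpers, not part of make: treat "\ " as an encoded space
# and refuse multi-word strings in a single-word context.

# Print the number of words in $1, where a backslash-escaped space does
# not separate words. Escaped spaces are first replaced by a placeholder
# so the shell's ordinary field splitting sees only the real separators.
count_words() {
    set -f                                  # no globbing while splitting
    set -- $(printf '%s\n' "$1" | sed 's/\\ /_/g')
    set +f
    echo "$#"
}

# Succeed only if $1 is exactly one word; otherwise complain on stderr.
require_single_word() {
    n=$(count_words "$1")
    if [ "$n" -ne 1 ]; then
        echo "error: expected one word, got $n: $1" >&2
        return 1
    fi
}

require_single_word 'foo\ bar' && echo "ok: one word"
require_single_word 'foo\ bar biz\ baz' || echo "rejected: two words"
```

The point of the sketch is only that the check happens once, on the
string itself, independent of the context it is later used in.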

> Second, even for the POSIX sh interpreter which we could quote for, we
> don't know the context that the value is being used in.  It could
> already be quoted, and our added quoting will actually break things; for
> example if the makefile contains the rule:
> 
>   %.x: %.y
>           process '$<' -o '$@'
> 
> which works fine today, then any quoting we try to apply to $< and $@
> will simply break the makefile.

It works fine today, as long as the file name doesn't contain
spaces, or single quotes ...

My point is that it's not possible(*) to get such a rule to work
correctly for all possible file names. Therefore I suggested a way
to quote things so they'll work in an unquoted sh context. This
would break the above rule for file names containing e.g. double
quotes or some other special characters, but since it already fails
with spaces and single quotes, I'd consider it broken anyway. It
will continue to work for "trivial" file names where no quoting is
needed anyway.

(*) without very strange contortions, such as "quoting" ' to '\''
    where the first ' ends the (assumed) outer ' quoting, the \'
    produces the actually quoted ', and the final ' starts a new
    quoting to match the outer terminating '. But that's quite
    backwards, terminating a quoting that someone else started. You
    don't normally do such kinds of things except in SQL injection
    attacks ... ;)
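
To make the idea of quoting for an unquoted sh context concrete, here
is a small sketch. The function name shell_quote is hypothetical, and
this is not necessarily how the SHELL_QUOTE proposal would do it; it
simply wraps the name in single quotes and rewrites each embedded ' as
'\''. Trailing newlines in a name are lost to command substitution, so
treat it as an illustration rather than a complete solution.

```shell
# Hypothetical helper: quote $1 so it survives as one word when pasted
# into an *unquoted* position of a POSIX sh command line.
shell_quote() {
    # Rewrite each ' as '\'' and wrap the result in single quotes.
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

name="it's a \"strange\" \$name"
quoted=$(shell_quote "$name")

# Unquoted context: the shell reads back exactly one word, the original.
eval "set -- $quoted"
[ "$#" -eq 1 ] && [ "$1" = "$name" ] && echo "round trip ok"

# Already-quoted context, as in: process '$<'. The quotes written in the
# rule and the ones added by shell_quote cancel out, so a name with a
# space is split again.
eval "set -- '$(shell_quote 'foo bar')'"
echo "words inside existing quotes: $#"
```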

> And finally even if SHELL is a POSIX sh we don't know that the value is
> being interpreted by the shell.  _Make_ might run the shell but the
> shell script may be invoking sed or awk or Perl or Python or Ruby or
> whatever, which may have their own quoting rules.  For example:
> 
>   %.x : %.y
>           perl -e "open(X, '$<'); ..."
> 
> Here again, any quoting we try to add based on _shell_ quoting rules
> will cause this to fail, since it's actually used in a Perl script.

I think it's not possible to make this work for all possible file
names, so it has to be rewritten anyway. I don't do much Perl, but
in awk, e.g., you can pass values as variables in separate
arguments, lifting them out of the inner quoting level:

  awk "{ getline < \"foo\" }"

can be changed to

  awk -v filename='foo' "{ getline < filename }"

so you're back at the simple shell quoting level. Likewise, if for
some reason you want to call sh explicitly, instead of

  sh -c "echo 'foo'"

you can write:

  sh -c 'echo "$1"' -- 'foo'

So I think the burden of avoiding additional quoting levels must
fall on the user, if they want support for unrestricted character
sets. All make can provide is a way to reliably get a name into a
top-level command line; from there on it's up to the user to
preserve it.
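
A small sketch of the same idea, keeping the name in a separate
argument (or the environment) so that no inner quoting level is
needed. The variable names name and TARGET_NAME are made up for
illustration. One caveat for the awk case: POSIX awk processes
backslash escape sequences in -v assignments, so the sketch reads the
value from ENVIRON instead, which passes it through unchanged.

```shell
# A deliberately awkward name: quotes, a dollar sign and a backslash.
name='it'\''s "odd" $x a\tb'

# The name travels as a separate argument; the inner script only ever
# refers to "$1" ("--" fills $0), so no second quoting level is involved.
sh -c 'printf "%s\n" "$1"' -- "$name"

# awk: hand the value over via the environment rather than splicing it
# into the program text.
TARGET_NAME=$name awk 'BEGIN { print ENVIRON["TARGET_NAME"] }'
```

Both commands should print the name exactly as stored in $name.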

> I know there's a strong desire to have make "just do the right thing"
> when it comes to quoting for the different interpreters, or at least for
> the default POSIX sh interpreter.  But I really feel like the results
> will be very difficult to understand and use in all but the most
> straightforward situations, and will introduce a lot of backward
> compatibility issues.

Of course, for interpreters other than sh, as you said, it's in
general not even possible for make to detect when they're used, so
all it can do is provide a way for the user to explicitly select
different kinds of quoting (cf. my SHELL_QUOTE proposal). Any
attempt to automatically detect it in some (but necessarily not all)
situations would indeed be very difficult to understand IMO.

Eli Zaretskii wrote:

> > From: Paul Smith <address@hidden>
> > Date: Wed, 09 Apr 2014 09:41:21 -0400
> > 
> > The interpreters are where things get hairy.  The problems are many.
> > First, there's the fact that we don't know how to quote properly for any
> > random interpreter: they all have different rules.  They even have
> > different characters which are considered "special" and in need of
> > quoting!
> 
> We've been there during past discussions.  My opinion is: decode the
> string, and let the user quote it if needed.  This way, rules like this:
> 
> >   %.x: %.y
> >           process '$<' -o '$@'
> 
> continue to work (or not ;-) as they did before.

If we just want to have everything that works to continue working
and everything that doesn't to continue not working, we don't need
to change anything. ;)

As I detailed before, it will be possible for the user to quote
everything manually (if make's internal functions support
multi-word strings), but it will be quite cumbersome, which in
practice means users won't do it. (I mean, compared to what would be
needed in make, quoting variables in shell scripts is easy, just add
"" around your variables, yet I still see so many shell scripts
failing to do so.)
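
For reference, the shell-side habit mentioned here looks like this,
assuming the default IFS; count_args is just a throwaway helper for the
demonstration.

```shell
# Throwaway helper: print how many arguments were received.
count_args() { echo "$#"; }

name='foo bar'

count_args $name      # unquoted: split into "foo" and "bar", prints 2
count_args "$name"    # quoted: stays one word, prints 1
```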

So I say if it's left to the user, make should at least provide a
way to do this automatically and *by default*. Again, see my
SHELL_QUOTE proposal (which would be automatically applied). If the
user doesn't want quoting (or needs a different kind of quoting for
another interpreter), they can unset or change SHELL_QUOTE,
respectively, but for the normal case (sh-compatible interpreter),
they shouldn't need to do anything.


