make-alpha



From: Paul Smith
Subject: Output quoting (was: Re: Possible solution for special characters in makefile paths)
Date: Wed, 09 Apr 2014 09:41:21 -0400

On Sun, 2014-03-02 at 11:38 -0500, Paul Smith wrote:
> On Thu, 2014-02-20 at 03:22 -0500, Paul Smith wrote:
> > Hi all.

I must once again apologize for the lengthy silence.  I realize it's
difficult to jump back into these threads after a long time away and
reacquire the knowledge that one lost in the meantime.  All I can say is
that this spring I have had even less free time than expected.

I will be responding to the other two threads on the alternatives for
encoding, but in this sub-thread I wanted to discuss output quoting
options.

By "output quoting" I mean taking make's internal encoded representation
of a string and performing two operations on it: first "decoding" and
then "quoting".  The result is that when it's handed to an entity
external to make, that entity interprets it as the user expects.  Either
of those two operations (decode or quote) might be no-ops depending on
the encoded format and the output format expected.
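To make the two-step idea concrete, here is a minimal sketch in Python (not make's actual implementation).  It assumes backslash escaping as the internal encoding, which is only one of the alternatives under discussion, and it uses Python's shlex.quote as a stand-in for "quote for POSIX sh".  The function names are hypothetical.

```python
import shlex

def decode(encoded):
    """Undo the assumed backslash encoding: backslash + x becomes x."""
    out = []
    chars = iter(encoded)
    for ch in chars:
        if ch == "\\":
            # Take the escaped character literally; a trailing lone
            # backslash is kept as-is.
            out.append(next(chars, "\\"))
        else:
            out.append(ch)
    return "".join(out)

def quote_for_posix_sh(decoded):
    """Quote a decoded value so a POSIX shell sees it as one word."""
    return shlex.quote(decoded)

def output_quote(encoded, quote=quote_for_posix_sh):
    """Decode first, then quote for the receiving entity."""
    return quote(decode(encoded))

# Handing the value to a POSIX shell: both steps do real work.
for_shell = output_quote(r"foo\ bar")
# Handing the value to fopen(): the quoting step is a no-op.
for_os = output_quote(r"foo\ bar", quote=lambda s: s)
```

Passing the quoting step in as a parameter mirrors the point that either step may be a no-op depending on who receives the string.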

What "the user expects" may vary depending on the context in which the
value is used, and it's quite possible that make cannot always determine
that context on its own (for example, make cannot parse the recipe).

The "external entity" could be the operating system (a filename given to
the $(file ...) function for example), the Guile and C APIs, or the
POSIX shell or any other interpreter that can be set with the SHELL make
variable.  Note that we could use any or all of these _in the same
instance of make_ (SHELL could be set differently per-target with
target-specific variables for example).

My goals are that all possible use scenarios are covered, somehow, and
that the rules for how make will perform output quoting in any situation
are as clear and simple and understandable as possible.


When sending the results to the operating system (fopen etc.) we don't
need the quoting step: we just decode.  What's not clear is how to treat
strings that contain multiple words separated by un-encoded whitespace:
is the entire string considered a single word because it's used in a
"single word context", with any distinction between encoded and un-encoded
whitespace ignored?  Or does make throw an error here?  For example, if
you have this (using backslash escaping in this example):

  FILENAME = foo\ bar biz\ baz

  $(file >$(FILENAME),hi)

Does this create a file 'foo bar biz baz'?  Or give an error because two
words were provided where one was expected?  Or something else?  Today
what you'll get is the file named 'foo\ bar biz\ baz', including
backslashes, unfortunately (at least on UNIX).  I'm not sure we can
avoid breaking backward-compatibility here.  That doesn't make me sad in
this case.
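The first two candidate behaviors can be sketched side by side.  This is Python for illustration only, again assuming backslash escaping as the encoding; filename_lenient and filename_strict are hypothetical names, not anything make provides.

```python
import re

def split_words(encoded):
    """Split on un-encoded whitespace, decoding each word as we go."""
    words, current, i = [], [], 0
    while i < len(encoded):
        ch = encoded[i]
        if ch == "\\" and i + 1 < len(encoded):
            current.append(encoded[i + 1])  # escaped char is literal
            i += 2
            continue
        if ch.isspace():
            if current:  # un-encoded whitespace ends the current word
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
        i += 1
    if current:
        words.append("".join(current))
    return words

def filename_lenient(encoded):
    """Option 1: single-word context, so decode the whole string."""
    return re.sub(r"\\(.)", r"\1", encoded)

def filename_strict(encoded):
    """Option 2: error out unless exactly one word was given."""
    words = split_words(encoded)
    if len(words) != 1:
        raise ValueError("expected 1 word, got %d" % len(words))
    return words[0]

value = r"foo\ bar biz\ baz"
words = split_words(value)        # two words: 'foo bar' and 'biz baz'
lenient = filename_lenient(value)  # one name: 'foo bar biz baz'
```

With the strict option, the FILENAME above is rejected, while a value such as foo\ bar alone is accepted and decoded.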


For the APIs (Guile and C), we will need to provide a method in the API
that will decode without quoting, and return the resulting string.  We
will need to ensure that some method of splitting is available since
decoding a string will lose information (by decoding whitespace, for
example, without quoting it again).
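A small illustration of that information loss, as a Python sketch under the same backslash-encoding assumption.  The names decode and decode_words are hypothetical and are not the Guile or C API.

```python
import re

def decode(encoded):
    """Decode the whole string; word boundaries are no longer visible."""
    return re.sub(r"\\(.)", r"\1", encoded)

def decode_words(encoded):
    """Split on un-encoded whitespace first, then decode each word."""
    encoded_words = re.findall(r"(?:\\.|\S)+", encoded)
    return [decode(word) for word in encoded_words]

value = r"foo\ bar biz\ baz"

flat = decode(value)          # 'foo bar biz baz'
naive = flat.split()          # four words: the boundaries are gone
correct = decode_words(value)  # two words, as the makefile intended
```

Once the caller holds only the decoded string, there is no way to recover the two original words, which is why a splitting method has to live on make's side of the API.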


The interpreters are where things get hairy.  The problems are many.
First, there's the fact that we don't know how to quote properly for any
random interpreter: they all have different rules.  They even have
different characters which are considered "special" and in need of
quoting!

Second, even for the POSIX sh interpreter which we could quote for, we
don't know the context that the value is being used in.  It could
already be quoted, and our added quoting will actually break things; for
example if the makefile contains the rule:

  %.x: %.y
          process '$<' -o '$@'

which works fine today, then any quoting we try to apply to $< and $@
will simply break the makefile.
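The breakage is easy to reproduce outside of make.  The Python sketch below approximates the shell's word splitting with shlex.split and make's automatic quoting with shlex.quote; both are stand-ins, and expand is a hypothetical helper, not how make expands recipes.

```python
import shlex

def expand(recipe, target, prereq, quote):
    """Substitute $< and $@ after passing each value through quote()."""
    return recipe.replace("$<", quote(prereq)).replace("$@", quote(target))

recipe = "process '$<' -o '$@'"
prereq, target = "my in.y", "my out.x"

# Today: values are pasted in verbatim and the makefile's own quotes work.
today = expand(recipe, target, prereq, quote=lambda s: s)
today_argv = shlex.split(today)

# With automatic sh quoting: the added quotes cancel the existing ones.
auto = expand(recipe, target, prereq, quote=shlex.quote)
auto_argv = shlex.split(auto)
```

Here today_argv keeps 'my in.y' and 'my out.x' as single arguments, while auto_argv splits each of them in two, because the expanded line becomes process ''my in.y'' -o ''my out.x''.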

And finally, even if SHELL is a POSIX sh, we don't know that the value is
being interpreted by the shell.  _Make_ might run the shell but the
shell script may be invoking sed or awk or Perl or Python or Ruby or
whatever, which may have their own quoting rules.  For example:

  %.x : %.y
          perl -e "open(X, '$<'); ..."

Here again, any quoting we try to add based on _shell_ quoting rules
will cause this to fail, since it's actually used in a Perl script.
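To show how far apart the rules are, the Python sketch below quotes the same value once for POSIX sh (via shlex.quote) and once as a Perl single-quoted string literal, where only backslash and the single quote need a backslash escape.  perl_single_quoted is a hypothetical helper written for this illustration.

```python
import shlex

def perl_single_quoted(value):
    """Render value as a Perl single-quoted string literal."""
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"

name = "it's.y"

# sh closes the quote, adds a double-quoted ', and reopens the quote.
for_sh = shlex.quote(name)
# Perl escapes the ' with a backslash inside the same literal.
for_perl = perl_single_quoted(name)
```

The two results are not interchangeable: the sh form is not a valid single Perl literal, and the Perl form is not one shell word, so make would have to know which language will finally read the value.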


I know there's a strong desire to have make "just do the right thing"
when it comes to quoting for the different interpreters, or at least for
the default POSIX sh interpreter.  But I really feel like the results
will be very difficult to understand and use in all but the most
straightforward situations, and will introduce a lot of backward
compatibility issues.

I'm interested to hear other opinions.



