From: Paul Smith
Subject: Re: New escape method proposal (was: Re: Possible solution for special characters in makefile paths)
Date: Tue, 22 Apr 2014 15:00:27 -0400

Hi all.  I hope to have some time to spend on implementation for this in
the near future, so if you have any last points or suggestions to make,
please make them! :-)

On Sun, 2014-03-09 at 20:10 -0400, Paul Smith wrote:
> On Sun, 2014-03-02 at 11:38 -0500, Paul Smith wrote:
> > On Thu, 2014-02-20 at 03:22 -0500, Paul Smith wrote:
> > > Hi all.
> > 
> > Thanks for participating so far.

> This thread is for discussion of the original encoding proposal I made,
> which was to introduce a new quoting capability, for example $[...] or
> $'...'; however a function like $(quote ...) would also work of course.

This message will describe the full design for a proposed "new quoting
syntax" in GNU make.  This is based on the original quoting proposal I
made in February.

This is the last major email by me on this topic.  The rest will be a
final decision (this model or the backslash model) and discussion of
specific, smaller issues of interest.

Below when we speak of "special characters" we mean characters special
to make, not characters special to the current SHELL.  That includes at
least: SPACE, TAB, $, :, =, \, and , (comma).

For the purposes of this email I will use $[...] as the quoting
sequence, however see the discussion below for alternatives.  As always,
"$" introducing the quoting sequence can be escaped, so "$$[...]" is not
quoting.


The quoting sequence works like a function.  When a string containing it
is expanded, first its contents are expanded.  This allows values like
"$[$(DIR)]" to behave as expected.  It does mean that users still need
to be careful with "$" inside quoting; they'll need to double it as
usual.  The $(value ...) function may also prove useful in this context.

After expansion, all special characters in the resulting string will be
replaced with encoded values so that those characters no longer match
the "normal" versions of those characters: e.g., the quoted space will
not match SPACE (ASCII 32).

The resulting encoded string will be compatible with any character
encoding which uses ASCII-compatible meanings for character codes
0-127.
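
To make the encode/decode round trip concrete, here is a minimal Python
sketch. The mapping of special characters to the control codes 0x01-0x07
is purely an assumption for illustration: this message does not specify
the encoded values. The names SPECIALS, encode, and decode are likewise
hypothetical.

```python
# Hypothetical mapping: each make-special character gets a distinct ASCII
# control code (1..7). The real encoded values are NOT specified in this
# proposal; these are chosen only so the result stays within codes 0-127.
SPECIALS = " \t$:=\\,"
ENCODE = {ch: chr(i + 1) for i, ch in enumerate(SPECIALS)}
DECODE = {code: ch for ch, code in ENCODE.items()}


def encode(text):
    """Replace every make-special character with its encoded stand-in."""
    return "".join(ENCODE.get(ch, ch) for ch in text)


def decode(text):
    """Convert encoded stand-ins back to the native characters."""
    return "".join(DECODE.get(ch, ch) for ch in text)


# Roughly what expanding $[my file.c] might produce under this assumption.
quoted = encode("my file.c")
```

Under this assumed mapping the quoted space no longer compares equal to
SPACE (ASCII 32), so code that tokenizes on whitespace sees one word. A
name that already contains one of the chosen control codes would not
round-trip; a real implementation would have to address that.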

One way to state the desired outcome is that if someone uses a variable
containing an encoded string in an argument to $(eval ...), nothing in
that string will be interpreted by make as "special" regardless of where
the variable appears in the evaluated string.


In addition to makefile contents, all results of the $(wildcard ...),
$(abspath ...), and $(realpath ...) functions are directly encoded, as
is the value of the CURDIR predefined variable, and any goals provided
on the make command line.  It's possible we'll encode other variables
obtained from the environment (PATH?) as well; we'll have to see.


When make constructs a value to be provided to an external interface it
would follow these rules:

Environment: For variables special to make, such as MAKEFLAGS, we will
decode the values back into $[...] format.  For other variables we will
decode the special characters to their native character values.

OS interface (fopen, etc.): first the string is split into words, then
the words are decoded (converting special characters back into native
characters).

APIs (C & Guile): a special function is provided to convert an encoded
string into a decoded string.  We won't need any special function to
split into words: users can use any functions that tokenize on
whitespace.  If it seems useful we can provide our own "split" function.

"Fast path" recipe invocation: first the string is split into words,
then each word is decoded, then the result is sent to fork/exec.

POSIX-y shell invocation: once the string is fully expanded, then the
entire string will be decoded and the result sent to system().  If we
decide to try to automatically add quoting, it would be done BEFORE the
decode step so we can tell the difference between (for example) spaces
that are part of words and ones that are not.

non-POSIX shell invocation: behaves the same way as above: first the
string is expanded, then it's decoded and the result is sent to the
interpreter.


There are suggestions for adding quoting to the results for the value of
SHELL, but this is a separate step which won't be discussed here.


ALTERNATIVE:

My original idea for this is that $[...] would be interpreted really
early, perhaps even in the readline() function in read.c, so before make
even begins to parse it.  That is cleaner in some ways, but restrictive
in other ways; for example it means that something like this:

  F = $[$(DIR)/$(FILE)]

will not do what you might, at first glance, expect; it's actually not
possible to apply quoting to anything but a static string.  If you
wanted to do something like the above you'd have to use $(eval ...), so
that the result goes back through the makefile parser:

  $(eval F = $$[$(DIR)/$(FILE)])

I decided that this would be more confusing than the above method, and
that the behavior of "functions" in make is already well-enough
understood that the function-like $[...] would not be too confusing.
However I'm open to alternative points of view.


BACKWARD COMPATIBILITY ISSUES:

The only anticipated backward-compatibility issue is that whatever
syntax we use for quoting will no longer be available for makefiles.
For the options below it would mean that a single-character make
variable which is currently legal would become illegal.  Make can
detect this (by looking for assignments to a variable with that name)
and fail, to make porting straightforward.
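
The detection step could look something like the following Python
sketch, which scans makefile text for assignments to a single-character
variable such as "[". This is a simplification under stated assumptions:
it ignores line continuations, "define" blocks, and target-specific
variables, and it treats every TAB-prefixed line as a recipe line. The
function name is hypothetical.

```python
import re


def find_conflicting_assignments(makefile_text, name="["):
    """Return 1-based line numbers that assign to the variable `name`."""
    pattern = re.compile(
        r"^ *(?:(?:override|export|private) +)*"  # optional modifiers
        + re.escape(name)
        + r"\s*(?::{1,3}|[?+!])?="                # =, :=, ::=, :::=, ?=, +=, !=
    )
    hits = []
    for lineno, line in enumerate(makefile_text.splitlines(), start=1):
        if line.startswith("\t"):
            continue  # assumed to be a recipe line, not an assignment
        if pattern.match(line):
            hits.append(lineno)
    return hits


sample = "[ = old\nCC := gcc\nall:\n\techo [ = x\n[ += more\n"
conflicts = find_conflicting_assignments(sample)  # [1, 5]
```

A tool built on this idea would report the offending lines and stop, so
the makefile author can rename the variable before the new syntax takes
effect.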


QUOTING SYNTAX OPTIONS:

Due to the free-form syntax of makefiles it's actually difficult to add
a quoting sequence.  The only true reserved namespace for make is the
"$" and the characters it introduces.  Therefore I think the quoting
syntax must be started with "$".

Any alphanumeric value plus "_" following the "$" is required by POSIX
to be interpreted as a variable.  The special characters "(", "{", "@",
"<", "?", "*", "^", "+", "%", and "|" are already used.  I think that
the single-character variable names "-", ".", "~", and "/" are not that
uncommon in makefiles so I don't want to make them illegal.

The options available appear to be (please let me know if I missed a
better one):

  $[...]
  $`...`
  $'...'
  $"..."
  $\...\
  $#...#
  $:...:
  $!...!
  $&...&

One thing to keep in mind is that whatever sequence we choose we're
making the "close character" more complicated to add to a quoted name: I
don't want to try adding some kind of "meta-escape" character that is
valid inside the quoted string.

For example, if we choose $"..." and we want to quote a name that
contains a " character, say 'fo o"ba r', it must be written:

   F = $"fo o""$"ba r"

Or else use a variable to "hide" the delimiter, like:

   Q = "
   F = $"fo o$(Q)ba r"

This makes me a bit leery of the quote characters like ' and ", and also
backslash and colon, as it seems like these might be more common in
filenames (backslash for Windows paths?).
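
The "close the quote, emit the delimiter, reopen the quote" pattern from
the first example can be generated mechanically. The Python sketch below
is only an illustration; quote_name is a hypothetical helper, and it
assumes the bare delimiter written between the quoted pieces is not
itself special to make (true for " but not for, say, ":" or "\").

```python
def quote_name(name, opener='$"', closer='"'):
    """Write `name` in quoted form, splitting around the close delimiter."""
    # "$" is still expanded inside quoting, so it must be doubled.
    pieces = name.replace("$", "$$").split(closer)
    # Quote each non-empty piece; the delimiter itself sits between them,
    # outside any quoting sequence.
    quoted = [opener + piece + closer if piece else "" for piece in pieces]
    return closer.join(quoted)


example = quote_name('fo o"ba r')              # $"fo o""$"ba r"
bracket = quote_name("a]b", "$[", "]")         # $[a]]$[b]
```

The first result matches the hand-written form shown above, which gives
a feel for how awkward a commonly occurring delimiter would be in
practice.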

A different idea, that would introduce NO backward compatibility issues
at the expense of some verbosity, would be to make the quoting
capability a function instead, so something like:

  $(quote ...)

or $(q ...) or $(' ...) or whatever.  A downside of this is that it
obeys the various parsing rules of functions; so for example the closing
delimiter is now "special" and must be escaped somehow during parsing.



