make-alpha

From: Paul Smith
Subject: Re: Backslash quoting (was: Re: Possible solution for special characters in makefile paths)
Date: Sun, 13 Apr 2014 00:53:10 -0400

On Mon, 2014-03-10 at 00:42 -0400, Paul Smith wrote:
> On Sun, 2014-03-02 at 11:38 -0500, Paul Smith wrote:
> > On Thu, 2014-02-20 at 03:22 -0500, Paul Smith wrote:
> > > Hi all.
> > 
> > Thanks for participating so far.
> 
> This thread is for discussing the alternative quoting proposal, put
> forth by Eli.  This proposal would take the current handling of
> backslashes to quote special characters (see my previous post on this
> subject) and extend it.

I'd like to follow this proposal through from beginning to end with a
complete implementation and finally decide if it's really viable.  I'll
put forth a proposed design; if I've missed anything or anyone has an
alternative design (given the issues discussed already) please reply.
Also for the time being I'll not discuss any automated interpreter
quoting capability (i.e., automatically applied SHELL_QUOTE), because
I'm not sure we've really understood how that might work yet (see my
last message to Frank).

TL;DR: I feel that the overloading of the backslash character, which is
already special to the interpreter (SHELL), to also be special to make
in this way, leads to a lot of difficult choices and confusing
behaviors.

 
Below when we speak of "special characters" we mean characters special
to make, not characters special to the current SHELL.  That includes at
least: SPACE, TAB, \, :, =, and , (comma).  I'm not sure about "$": we
already have a way to quote that ($$).  Is it better to be more
consistent, but redundant, and also allow \$?  Or to avoid the potential
confusion of two escape methods and always use $$?

> Each special character that we'd like to quote would be prefixed with
> a backslash in the makefile.  Backslashes would quote themselves.

During makefile parsing, make does nothing different than today.
Strings containing backslashes would be stored internally to make
without modification.

Values obtained from the operating system where we know they should be
single words (e.g., goals on the command line and files returned by
$(wildcard ...)) would be modified to add backslashes before any special
characters.
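
To make that step concrete, here is a minimal Python sketch of the
escaping.  The name escape_word and the SPECIAL set are mine, purely for
illustration; "$" is left out because its treatment is still open.

```python
# Characters special to make, per the list earlier in this message.
# "$" is deliberately omitted: how to quote it is still undecided.
SPECIAL = " \t\\:=,"

def escape_word(word):
    """Prefix every make-special character in `word` with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in word)

print(escape_word("my file.c"))  # my\ file.c
print(escape_word("a,b=c"))      # a\,b\=c
```

A name with nothing special in it comes back unchanged.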

All make functionality that needs to parse strings would be modified so
that a special character preceded by a backslash is no longer treated as
special.  So, for example, "\ " would not be considered a
word-delimiting space.  However, "\\ " would be a backslash followed by
a normal, word-delimiting space.
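
A rough Python sketch of the word-splitting rule just described, only to
pin down the intended behavior (this is not make's code; split_words and
SPECIAL are hypothetical names).  Backslashes stay in the words, since
they are stored unmodified:

```python
SPECIAL = " \t\\:=,"  # assumed set of make-special characters

def split_words(text):
    """Split on unescaped SPACE/TAB, leaving backslashes in the words."""
    words, cur, i = [], [], 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            cur.append(text[i:i + 2])  # keep the escape pair together
            i += 2
        elif ch in " \t":
            if cur:                    # an unescaped blank ends the word
                words.append("".join(cur))
                cur = []
            i += 1
        else:
            cur.append(ch)
            i += 1
    if cur:
        words.append("".join(cur))
    return words

print(split_words(r"foo\ bar baz"))  # ['foo\\ bar', 'baz']
print(split_words("foo\\\\ bar"))    # ['foo\\\\', 'bar']
```

The second call is the "\\ " case: the two backslashes pair up, so the
space after them still delimits words.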

        ALTERNATIVE: During makefile parsing if a backslash followed by
        a special character was found, it would be internalized as an
        alternative binary value.  So, "\ " would be encoded as some
        value that was not recognized as a SPACE (ASCII 32).  Values
        from the operating system would be treated similarly.  This
        means no internal make functionality that parses strings into
        words would need to change, and it would probably perform
        better.
        
        However, before make can provide a string back to an external
        interface the encoded values must be converted back into
        backslash-escaped versions.  This is because make cannot know
        whether the backslash in the variable REALLY intended to escape
        a special character, or if it was merely part of a recipe to be
        passed to the shell.
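
A sketch of this alternative in Python, under a loud assumption: I use
the Unicode private-use area as the "alternative binary value", which is
just a convenient stand-in and says nothing about what make would really
use.  The names internalize and externalize are also made up.

```python
SPECIAL = " \t\\:=,"  # assumed set of make-special characters
BASE = 0xE000         # hypothetical: private-use code points as the encoding

def internalize(text):
    """Replace each backslash-escaped special char with an encoded value."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            out.append(chr(BASE + ord(text[i + 1])))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

def externalize(text):
    """Convert encoded values back into their backslash-escaped form."""
    return "".join(
        "\\" + chr(ord(ch) - BASE) if BASE <= ord(ch) < BASE + 128 else ch
        for ch in text
    )

stored = internalize(r"foo\ bar baz")
print(len(stored.split()))          # 2: ordinary splitting just works
print(externalize(stored))          # foo\ bar baz
```

The point is that plain whitespace splitting needs no changes on the
encoded form, and the round trip restores the original text (as long as
the input contains none of the encoded values to begin with).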

When make constructs a value to be provided to an external interface it
would follow these rules:

Environment: For variables special to make, such as MAKEFLAGS, we must
keep the backslashes.  For other variable values we have a problem.
Usually we'll want to remove backslashes (for example, setting PATH or
LD_LIBRARY_PATH).  However it could be that someone wants to export a
shell script as an environment variable value; in that case removing the
backslashes will break things.  I don't know how to handle this one;
we'll just have to pick one way and users will have to work around it.

OS interface (fopen, etc.): first the string is split into words, then
backslashes quoting special characters are removed from the individual
words.

APIs (C & Guile): special functions are provided to split strings into
words while ignoring the backslash-escaped special characters, and to
remove backslashes from strings.

"Fast path" recipe invocation: first the string is split into words,
then backslashes quoting special characters are removed from the
individual words, then it's sent to fork/exec.  I'd need to examine the
fast-path algorithm to see whether backslashes appearing in the string
have an impact on the decision to use the fast-path or not.
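
The split-then-unescape sequence used for both the OS interface and the
fast path might look roughly like this in Python.  This is only a sketch:
to_argv and the regular expressions are illustrative, and the
special-character set is assumed.

```python
import re

# A word is a run of escape pairs (backslash + special) or non-blanks.
_WORD = re.compile(r"(?:\\[ \t\\:=,]|[^ \t])+")
# A backslash that quotes a special character; group 1 is that character.
_UNESC = re.compile(r"\\([ \t\\:=,])")

def to_argv(line):
    """Split on unescaped blanks, then drop the quoting backslashes."""
    words = _WORD.findall(line)
    return [_UNESC.sub(r"\1", w) for w in words]

print(to_argv(r"cp my\ file.c dest\:dir"))
# ['cp', 'my file.c', 'dest:dir']
```

The resulting list is what would be handed to fork/exec, or one element
at a time to something like fopen.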

POSIX-y shell invocation: the string is left as-is, and passed to
system() or the equivalent.  Because we know that make's backslash
quoting rules are a strict subset of the POSIX shell rules, it's OK to
pass along the string without doing anything about the backslashes.  If
we figure out how to do automated quoting (SHELL_QUOTE) we'd do it
here... or not.

Non-POSIX shell invocation: this is a major issue.  Because we cannot
really know which backslashes are present in the string solely to escape
characters that are special to make versus which are meaningful to the
interpreter, and because the recipe writer needs some way of telling the
difference between escaped spaces (for example) and non-escaped, we
cannot remove any backslashes.  This means that non-POSIX shell users
have a difficult situation.  They first need to remove any backslashes
they don't want from variable values in the make recipe, while
preserving those they do want, and preserving the distinction between
different words, then they can add their own quoting.

        ALTERNATIVE: We could try to handle values we know result in
        single words specially, for example the results of $@ and $<,
        and $(word ...), the iterator value in $(foreach ...), etc., by
        removing backslashes from those.  However, I think that may be
        more confusing than just doing them all the same way.  Or, maybe
        we could introduce a new make function similar to $(foreach ...)
        but using un-escaped versions of each word in the list as the
        iterator values.


BACKWARD COMPATIBILITY ISSUES:  (I'm not suggesting that these are
deal-breakers, but it's important to be clear about everything).

Today backslashes are ignored by make functions.  So today the
following:

   all: ; @echo '$(addsuffix .x,foo\ bar)'

will print "foo\.x bar.x".  After the above change, it would print
"foo\ bar.x".

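The difference can be modeled in a few lines of Python.  This is a
simulation of the two splitting behaviors, not make itself; both
function names are hypothetical.

```python
import re

# Assumed splitting rule for the proposal: escape pairs stay inside a word.
_WORD = re.compile(r"(?:\\[ \t\\:=,]|[^ \t])+")

def addsuffix_today(suffix, text):
    # Backslashes are ignored, so every blank separates words.
    return " ".join(w + suffix for w in text.split())

def addsuffix_proposed(suffix, text):
    # A backslash-escaped blank no longer separates words.
    return " ".join(w + suffix for w in _WORD.findall(text))

print(addsuffix_today(".x", r"foo\ bar"))     # foo\.x bar.x
print(addsuffix_proposed(".x", r"foo\ bar"))  # foo\ bar.x
```
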

The current behavior of backslashes quoting special characters (as per
http://lists.gnu.org/archive/html/make-alpha/2014-03/msg00000.html )
would change; today the backslashes are removed in automatic variables,
as in:

  foo\:bar: ; @echo '$@'

gives "foo:bar"; after this change it would give "foo\:bar"... unless we
implemented the ALTERNATIVE described above for values known to be
single words, and treated $@ and $< specially.  Even then, the values of
$^ and $? would change.

There may be others; I got tired :-).



