From: Paul Smith
Subject: Re: Backslash quoting (was: Re: Possible solution for special characters in makefile paths)
Date: Sun, 13 Apr 2014 00:53:10 -0400
On Mon, 2014-03-10 at 00:42 -0400, Paul Smith wrote:
> On Sun, 2014-03-02 at 11:38 -0500, Paul Smith wrote:
> > On Thu, 2014-02-20 at 03:22 -0500, Paul Smith wrote:
> > > Hi all.
> >
> > Thanks for participating so far.
>
> This thread is for discussing the alternative quoting proposal, put
> forth by Eli. This proposal would take the current handling of
> backslashes to quote special characters (see my previous post on this
> subject) and extend it.
I'd like to follow this proposal through from beginning to end with a
complete implementation and finally decide if it's really viable. I'll
put forth a proposed design; if I've missed anything or anyone has an
alternative design (given the issues discussed already) please reply.
Also for the time being I'll not discuss any automated interpreter
quoting capability (i.e., automatically applied SHELL_QUOTE), because
I'm not sure we've really understood how that might work yet (see my
last message to Frank).
TL;DR: I feel that overloading the backslash character, which is
already special to the interpreter (SHELL), so that it is also special
to make in this way leads to a lot of difficult choices and confusing
behaviors.
Below when we speak of "special characters" we mean characters special
to make, not characters special to the current SHELL. That includes at
least: SPACE, TAB, \, :, =, and , (comma). I'm not sure about "$": we
already have a way to quote that ($$). Is it better to be more
consistent, but redundant, and also allow \$? Or to avoid the potential
confusion of two escape methods and always use $$?
> Each special character that we'd like to quote would be prefixed with
> a backslash in the makefile. Backslashes would quote themselves.
During makefile parsing, make does nothing different than today.
Strings containing backslashes would be stored internally to make
without modification.
Values obtained from the operating system where we knew they should be
single words (e.g., goals on the command line and files returned by
$(wildcard ...)) would be modified to add backslashes before any special
characters.
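As a rough illustration of that quoting step (not make's actual code, which is C), here is a minimal Python sketch. The names SPECIAL and quote_word are hypothetical, and the set of special characters is simply the list given above with "$" left out, since its handling is still an open question.

```python
# Characters special to make under this proposal, taken from the list above.
# "$" is deliberately omitted: whether "\$" should be allowed is undecided.
SPECIAL = set(" \t\\:=,")


def quote_word(word: str) -> str:
    """Return `word` with a backslash added before each make-special character.

    Hypothetical helper: models what make might do to a single word obtained
    from the operating system (a command-line goal, a $(wildcard ...) result).
    """
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in word)


if __name__ == "__main__":
    print(quote_word("my file.c"))   # my\ file.c
    print(quote_word("a:b"))         # a\:b
```

So a file named "my file.c" returned by $(wildcard ...) would be stored internally as "my\ file.c", a single word.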
All make functionality that needs to parse strings would be modified so
that a special character preceded by a backslash is not treated as
special. So, for example, "\ " would not be considered a word-delimiting
space. However, "\\ " would be a backslash followed by a normal,
word-delimiting space.
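The word-splitting rule just described might look something like the following Python sketch. It is only a model under my own assumptions: split_words and SPECIAL are hypothetical names, only SPACE and TAB are treated as delimiters, and the backslashes are left in the resulting words, as the design above requires.

```python
# Characters special to make under this proposal ("$" left out; see above).
SPECIAL = set(" \t\\:=,")


def split_words(text: str) -> list[str]:
    """Split `text` on unescaped spaces/tabs, keeping the backslashes.

    A backslash followed by a special character is copied through as a
    two-character unit, so the character after it never acts as a delimiter.
    """
    words, current, i = [], "", 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            current += text[i:i + 2]      # escaped pair: stays inside the word
            i += 2
            continue
        if ch in " \t":                   # unescaped delimiter ends the word
            if current:
                words.append(current)
                current = ""
        else:
            current += ch
        i += 1
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    print(split_words(r"foo\ bar baz"))   # ['foo\\ bar', 'baz']
    print(split_words(r"a\\ b"))          # ['a\\\\', 'b']
```

In the second call the first backslash quotes the second one, so the space that follows is an ordinary delimiter and we get two words.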
ALTERNATIVE: During makefile parsing if a backslash followed by
a special character was found, it would be internalized as an
alternative binary value. So, "\ " would be encoded as some
value that was not recognized as a SPACE (ASCII 32). Values
from the operating system would be treated similarly. This
means no internal make functionality regarding parsing strings
into words would need to be changed, and would probably be more
performant.
However, before make can provide a string back to an external
interface the encoded values must be converted back into
backslash-escaped versions. This is because make cannot know
whether a backslash in the variable was REALLY intended to escape
a special character, or if it was merely part of a recipe to be
passed to the shell.
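To make the ALTERNATIVE concrete, here is a small Python sketch of the encode/decode round trip. Everything in it is an assumption for illustration: the names encode and decode, and in particular the choice of Unicode private-use code points (BASE) as the "alternative binary value". make itself works on bytes, and no encoding has actually been chosen.

```python
# Characters special to make under this proposal ("$" left out; see above).
SPECIAL = " \t\\:=,"
# Hypothetical: map an escaped character X to a private-use code point.
BASE = 0xE000


def encode(text: str) -> str:
    """Replace each backslash+special pair with a single stand-in character."""
    out, i = [], 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            out.append(chr(BASE + ord(text[i + 1])))
            i += 2
        else:
            out.append(text[i])
            i += 1
    return "".join(out)


def decode(text: str) -> str:
    """Turn stand-in characters back into backslash-escaped pairs."""
    out = []
    for ch in text:
        original = chr(ord(ch) - BASE) if ord(ch) >= BASE else ""
        out.append("\\" + original if original and original in SPECIAL else ch)
    return "".join(out)


if __name__ == "__main__":
    internal = encode(r"foo\ bar baz")
    # Ordinary whitespace splitting now works unchanged on the internal form.
    words = internal.split()
    print([decode(w) for w in words])     # ['foo\\ bar', 'baz']
```

The point of the sketch is that the existing splitting code (here, plain str.split) needs no changes; the cost moves to the decode step at every external boundary.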
When make constructs a value to be provided to an external interface it
would follow these rules:
Environment: For variables special to make, such as MAKEFLAGS, we must
keep the backslashes. For other variable values we have a problem.
Usually we'll want to remove backslashes (for example, setting PATH or
LD_LIBRARY_PATH). However it could be that someone wants to export a
shell script as an environment variable value; in that case removing the
backslashes will break things. I don't know how to handle this one;
we'll just have to pick one way and users will have to work around it.
OS interface (fopen, etc.): first the string is split into words, then
backslashes quoting special characters are removed from the individual
words.
APIs (C & Guile): special functions are provided to split strings into
words while ignoring the backslash-escaped special characters, and to
remove backslashes from strings.
"Fast path" recipe invocation: first the string is split into words,
then backslashes quoting special characters are removed from the
individual words, then it's sent to fork/exec. I'd need to examine the
fast-path algorithm to see whether backslashes appearing in the string
have an impact on the decision to use the fast-path or not.
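The split-then-unquote order used for the OS interface and the fast path matters: unquoting first would turn escaped spaces back into delimiters. A minimal Python sketch, with hypothetical names (split_words, unquote_word, to_argv) and the same assumed set of special characters as above:

```python
# Characters special to make under this proposal ("$" left out; see above).
SPECIAL = set(" \t\\:=,")


def split_words(text: str) -> list[str]:
    """Split on unescaped spaces/tabs; backslashes are kept in the words."""
    words, current, i = [], "", 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            current += text[i:i + 2]
            i += 2
            continue
        if text[i] in " \t":
            if current:
                words.append(current)
                current = ""
        else:
            current += text[i]
        i += 1
    if current:
        words.append(current)
    return words


def unquote_word(word: str) -> str:
    """Remove backslashes that quote special characters from one word."""
    out, i = [], 0
    while i < len(word):
        if word[i] == "\\" and i + 1 < len(word) and word[i + 1] in SPECIAL:
            out.append(word[i + 1])       # keep the character, drop the quote
            i += 2
        else:
            out.append(word[i])
            i += 1
    return "".join(out)


def to_argv(recipe_line: str) -> list[str]:
    """Model of the fast path: split first, then unquote each word."""
    return [unquote_word(w) for w in split_words(recipe_line)]


if __name__ == "__main__":
    print(to_argv(r"cp my\ file.c dest\:dir"))   # ['cp', 'my file.c', 'dest:dir']
```

The resulting list is what would be handed to fork/exec; reversing the two steps would yield four words instead of three.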
POSIX-y shell invocation: the string is left as-is, and passed to
system() or the equivalent. Because we know that make's backslash
quoting rules are a strict subset of the POSIX shell rules, it's OK to
pass along the string without doing anything about the backslashes. If
we figure out how to do automated quoting (SHELL_QUOTE) we'd do it
here... or not.
non-POSIX shell invocation: this is a major issue. Because we cannot
really know which backslashes are present in the string solely to escape
characters that are special to make versus which are meaningful to the
interpreter, and because the recipe writer needs some way of telling the
difference between escaped spaces (for example) and non-escaped, we
cannot remove any backslashes. This means that non-POSIX shell users
have a difficult situation. They first need to remove any backslashes
they don't want from variable values in the make recipe, while
preserving those they do want, and preserving the distinction between
different words, then they can add their own quoting.
ALTERNATIVE: We could try to handle values we know result in
single words specially, for example the results of $@ and $<,
and $(word ...), the iterator value in $(foreach ...), etc., by
removing backslashes from those. However, I think that may be
more confusing than just doing them all the same way. Or, maybe
we could introduce a new make function similar to $(foreach ...)
but using un-escaped versions of each word in the list as the
iterator values.
BACKWARD COMPATIBILITY ISSUES: (I'm not suggesting that these are
deal-breakers, but it's important to be clear about everything).
Today backslashes are ignored by make functions. So today the
following:

    all: ; @echo '$(addsuffix .x,foo\ bar)'

will print "foo\.x bar.x". After the above change, it would print
"foo\ bar.x".
The current behavior of backslashes quoting special characters (as per
http://lists.gnu.org/archive/html/make-alpha/2014-03/msg00000.html )
would change; today the backslashes are removed in automatic variables,
as in:

    foo\:bar: ; @echo '$@'

which gives "foo:bar"; after this change it would give "foo\:bar"... unless we
implemented the special ALTERNATIVE for output formats above, and
treated $@ and $< specially. Even then, the values of $^ and $? would
change.
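The $(addsuffix ...) difference described above can be reproduced with a small Python simulation. This is a simplified model, not make's implementation: addsuffix_today, addsuffix_proposed and split_words are hypothetical names, and today's behavior is approximated by plain whitespace splitting.

```python
# Characters special to make under this proposal ("$" left out; see above).
SPECIAL = set(" \t\\:=,")


def split_words(text: str) -> list[str]:
    """Proposed splitting: a backslash-escaped special char is not a delimiter."""
    words, current, i = [], "", 0
    while i < len(text):
        if text[i] == "\\" and i + 1 < len(text) and text[i + 1] in SPECIAL:
            current += text[i:i + 2]
            i += 2
            continue
        if text[i] in " \t":
            if current:
                words.append(current)
                current = ""
        else:
            current += text[i]
        i += 1
    if current:
        words.append(current)
    return words


def addsuffix_today(suffix: str, text: str) -> str:
    """Current behavior: backslashes are ignored, every space splits."""
    return " ".join(word + suffix for word in text.split())


def addsuffix_proposed(suffix: str, text: str) -> str:
    """Behavior after the change: escaped spaces stay inside one word."""
    return " ".join(word + suffix for word in split_words(text))


if __name__ == "__main__":
    print(addsuffix_today(".x", r"foo\ bar"))      # foo\.x bar.x
    print(addsuffix_proposed(".x", r"foo\ bar"))   # foo\ bar.x
```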
There may be others; I got tired :-).