Re: M4 syntax $11 vs. ${11}

m4-patches

From:	Eric Blake
Subject:	Re: M4 syntax $11 vs. ${11}
Date:	Fri, 02 Mar 2007 07:07:47 -0700
User-agent:	Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.0.10) Gecko/20070221 Thunderbird/1.5.0.10 Mnenhy/0.7.4.666

[re-adding m4-discuss; this affects the long-term direction that GNU m4
2.0 is moving towards]

Hi Gary,

According to Gary V. Vaughan on 3/2/2007 12:28 AM:
> Hi Eric,
> 
> Wow, I can't believe how much you've done in the last few months! :)  It's
> gonna take me a week or two to get back up to speed on everything...

No problem; I don't see any risk of m4 2.0 coming out until a lot of
features have settled, so I can afford to wait for a good review; and even
if we do release 1.9b to alpha.gnu.org, it is with explicit documentation
that experimental things may be withdrawn prior to 2.0.

> 
> Okay, give me a few days (it's Octavia's birthday in an hour, and mine
> the day after so I won't be online too much until Sunday (-:).

Enjoy your birthday celebrations!

> One thing that worries me about where M4 2.0 is headed is that the default
> build seems to me like it will be too different from 1.4.x and suffer from
> a lot of the problems and bad feeling that surrounded the transition from
> Autoconf 2.13 to 2.5x :-(  We can avoid this if we take care... why
> shouldn't people who are stuck in 2.13 land be able to upgrade to M4 2.0
> without pain?

I've been trying to ensure that all along.  Which is why I had settled on
the changeextarg idea (although if you can get changesyntax to understand
multi-byte sequences, it would be a use of changesyntax instead of adding
changeextarg).  The idea is that in autoconf, the first thing it does is
disable ${1} so that it behaves the same as in 1.4.x, but enables ${{1}}
as the way to become an early adopter of m4 2.0 extended arguments (very
few, if any, autoconf scripts currently contain "${{").  I'm still
polishing a short patch to autoconf that does just this (my last
submission didn't really have any conceptual problems, just technical ones
in how autom4te was invoking too many processes:
http://lists.gnu.org/archive/html/autoconf-patches/2007-02/msg00013.html).
It also seems that very few autoconf scripts use $10 to mean the tenth
argument; most people just don't write macros that take that many
arguments.  POSIX requires $10 to mean the first argument concatenated
with 0, and I would rather minimize the places in code that depend on
whether POSIXLY_CORRECT is set.
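
The two readings of $10 can be modeled outside m4. The following is a minimal sketch in Python, not m4 code; the function name and the `posix` flag are hypothetical, and $0 handling is deliberately left out.

```python
import re

def expand_dollar_digits(body, args, posix=True):
    """Expand $N references in a macro body (illustrative model only).

    posix=True:  one digit is consumed, so $10 is $1 followed by '0'.
    posix=False: all digits are consumed, so $10 is the tenth argument,
                 which is the 1.4.x reading described above.
    args[0] is the first argument; missing arguments expand to ''.
    """
    pattern = r"\$([0-9])" if posix else r"\$([0-9]+)"

    def lookup(match):
        index = int(match.group(1))
        # $0 (the macro name) is not modeled here and expands to nothing
        return args[index - 1] if 1 <= index <= len(args) else ""

    return re.sub(pattern, lookup, body)

ten_args = list("abcdefghij")  # ten arguments: a through j
print(expand_dollar_digits("$10", ten_args, posix=True))   # a0
print(expand_dollar_digits("$10", ten_args, posix=False))  # j
```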

Maybe there is still a way we can let the user dynamically select whether
$10 means the POSIX interpretation or the tenth argument?  My thoughts
here are that in default 2.0 mode, { and } are the two strings for
extended argument sequences, $10 has the POSIX meaning, and we are
compliant with all POSIX requirements whether or not POSIXLY_CORRECT is set.
And by setting the strings to empty, $10 looks like an extended
argument, restoring 1.4.x behavior without violating POSIX (because the
only way to change the default syntax table is to use an extension to
POSIX, and once you have used an extension, POSIX rules are no longer
binding).  Of course, with empty extended argument delimiters, we have to
be careful of ambiguity (for example, ${1-default} will work when the
delimiters are {}, but $1-default must be parsed as the first argument
concatenated with -default rather than the full string treated as an
extended argument when the delimiters are empty).
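
As a rough illustration of how the delimiter setting would change parsing, here is a minimal Python model. It is a sketch under stated assumptions, not the m4 implementation: `expand_extended` and its `open_delim`/`close_delim` parameters are hypothetical names, and defaults containing the closing delimiter are not handled.

```python
import re

def expand_extended(body, args, open_delim="{", close_delim="}"):
    """Model argument expansion for a given pair of extended-arg delimiters.

    Non-empty delimiters: $N takes a single digit (POSIX), while ${N} and
    ${N-default} are extended references.
    Empty delimiters: $NN takes every digit (1.4.x style), there is no
    default syntax, and ${ is left as literal text.
    """
    def arg(index):
        return args[index - 1] if 1 <= index <= len(args) else None

    if not open_delim:
        return re.sub(r"\$([0-9]+)",
                      lambda m: arg(int(m.group(1))) or "", body)

    # One combined pattern, so expanded text is never scanned a second time.
    pattern = re.compile(
        r"\$(?:" + re.escape(open_delim) + r"([0-9]+)(?:-(.*?))?"
        + re.escape(close_delim) + r"|([0-9]))"
    )

    def replace(m):
        if m.group(3) is not None:       # plain $N, single digit
            return arg(int(m.group(3))) or ""
        value = arg(int(m.group(1)))     # ${N} or ${N-default}
        if value is None:                # argument omitted: use the default
            return m.group(2) or ""
        return value

    return pattern.sub(replace, body)

ten_args = list("abcdefghij")
print(expand_extended("${1-default}", []))        # default
print(expand_extended("$10", ten_args))           # a0
print(expand_extended("$10", ten_args, "", ""))   # j
print(expand_extended("$1-default", ["x"], "", ""))  # x-default
```

In this model the ambiguity disappears because the empty-delimiter branch never looks for a `-default` suffix at all; it only consumes digits.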

> Maybe some of them will find all the cool new functionality we already have
> in 2.0 and start using it then before giving up in disgust and downgrading
> to 1.4.x when a bunch of their configure.ins stop building...

That's been my worry all along.  I have (hopefully) been very careful to
ensure that running autoconf with 1.9b sees very little impact.

> 
> I'm loath to add the changeextarg builtin, when we should be able to
> incorporate the functionality into changesyntax.  I haven't thought too
> hard about what arguments changesyntax should take to provide changeextarg
> (and changecom & changequote) functionality, but I (hope I still) have a
> half baked patch to push the multicharacter syntax implementation into the
> big m4 syntax table, which in turn removes a lot of special case code and
> opens up some new possibilities for other multicharacter syntaxes at the
> cost of slowing the parser down some.  I'll either dust it off or rewrite
> it and post for review when I've shepherded my libtool patch queue backlog
> into CVS.

Fair enough - improving changesyntax is a good goal in itself, especially
if it lets us avoid needing changeextarg.

> 
>> 3) I would like to implement ideas from sh, such as ${1-default}
>> expanding to the first argument if supplied, or `default' if omitted.
> 
> I do like this idea a lot.  But I'd be sad to see it pollute 100% 1.4.x
> backward compatibility :-(

I'm aware of that too.  That's why I have been floating the idea of
changeextarg - either you have non-empty extarg delimiters, and can get
powerful argument expansions, or you disable extarg, and get 1.4.x
behavior of literal output when ${ is encountered.  And you can always use
nested quoting to avoid an extarg even when enabled, just like what is
already used by autoconf for literal shell argument outputs: $`'{1} vs. $`'1.

> 
> I also have some parser extension patches that implemented named
> arguments with define:
> 
>   define(`foo(bar, baz)', `
>     ${bar}, ${baz}!
>   ')
>   foo(`hello', `world')
>   =>hello, world!

Looks similar to m5.  And almost works great with extended arguments -
POSIX specifically reserves all uses of ${ for the implementation, and
leaves the interpretation of '(' in a macro name up in the air (since most
traditional implementations reject it; without indir, there would be no
way to invoke such a macro).  And I have specifically not done anything
with non-numeric extended arguments yet, because I had the goal of
supporting named arguments like that.  The only problem is that autoconf
currently defines macros named "foo(bar)" with the parentheses as part of
the name for purposes of data storage (knowing that the macro won't be
directly invoked), so we would need a way to select whether () is part of
the macro name or delimits the named arguments to a shorter macro name.

Maybe the syntax should be:
define(`foo{bar}', `${bar}')
foo(`hello')
=>hello

with the same delimiters for naming arguments as what is used in accessing
those extended arguments?  A quick grep of autoconf doesn't turn up any
macros with {} in their name.

> 
> And argument defaulting:
> 
>   define(`foo(bar, baz=`cruel world')', `
>     ${bar}, ${baz}!
>   ')
>   foo(`Goodbye')
>   =>Goodbye, cruel world!

My patches so far sort of crippled the = assign syntax class (there
weren't enough classes with only 16 bits; maybe we shift the syntax table
to use 32 bits for classifications?).  But that is still possible with
this syntax:

define(`foo{bar, baz}', `${bar}, ${baz-`cruel world'}!')
foo(`Goodbye')
=>Goodbye, cruel world!
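
The name{params} idea can be modeled in a few lines of Python. This is a sketch of the proposed semantics only, not m4 code: `define_named` is a hypothetical helper, and it assumes a default is always written with m4-style `' quotes inside the reference.

```python
import re

IDENT = r"[A-Za-z_][A-Za-z0-9_]*"

def define_named(signature, body):
    """Split a signature such as 'foo{bar, baz}' into a macro name and
    parameter names, and return (name, expander) for the given body."""
    m = re.fullmatch(r"(" + IDENT + r")\{(.*)\}", signature)
    if m is None:
        raise ValueError("not a named-argument signature: " + signature)
    name = m.group(1)
    params = [p.strip() for p in m.group(2).split(",") if p.strip()]

    # Matches ${name} and ${name-`default'}; other forms stay literal.
    ref = re.compile(r"\$\{(" + IDENT + r")(?:-`(.*?)')?\}")

    def expand(*args):
        bound = dict(zip(params, args))  # omitted trailing args stay unbound

        def replace(match):
            key, default = match.group(1), match.group(2)
            if key in bound:
                return bound[key]
            return default or ""         # omitted: default, else empty

        return ref.sub(replace, body)

    return name, expand

name, foo = define_named("foo{bar, baz}", "${bar}, ${baz-`cruel world'}!")
print(foo("Goodbye"))          # Goodbye, cruel world!
print(foo("hello", "world"))   # hello, world!
```

Because the braces are split off the signature before the macro is stored, a name such as "foo(bar)" would still be kept whole in this model, which is the property autoconf relies on.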

> 
> At the time, I think Akim wanted a more perl-like syntax, where I had
> followed a more lisp-like path.  Since then M5 was pointed out to me,
> so I guess this work can support M5 syntax and be the first steps to
> implementing the M5 `language'.

I just barely found and read the m5 language documentation
(http://techreports.lib.berkeley.edu/accessPages/CSD-91-621.html).  It had
lots of ideas for extended arguments (IIRC, stuff like ${,2,*} meaning
output a leading comma if anything else is output, then output from $2 to
$n a comma-separated list of only non-empty arguments, except that it used
$() instead of ${}).  It also had the idea of $() vs. $'' for whether the
argument expansion was quoted, so that instead of providing $* vs. $@, it
provided $(*) vs. $'*'.  Hmm, maybe that means we need two sets of
extended argument delimiters?  By default, only ${} is POSIX compliant,
but a user could also enable the second one to get $'' that quotes the
expansions as in M5.

Other features of m5 that I liked: the notion that a single $ starts argument
parsing, but $$ expands to $, $$$ expands to $$, and so on, so that it is
easier to write macros that define macros.  I also liked the idea of pools
(or namespaces), such that it is easy to add or hide a pool of macros.
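
The dollar-run rule can be sketched in Python as follows. This models only the rule as stated above; the function name is hypothetical, and treating a digit after a run of two or more dollars as literal text is an assumption, not something taken from the m5 report.

```python
import re

def expand_m5_dollars(body, args):
    """A single $ before a digit is an argument reference; a run of n >= 2
    dollars expands to n-1 dollars and defers expansion by one level."""
    def replace(m):
        dollars, digit = m.group(1), m.group(2)
        if len(dollars) >= 2:
            # drop one dollar; a following digit stays literal (assumption)
            return dollars[1:] + (digit or "")
        if digit is None:
            return dollars               # lone '$' with no digit: unchanged
        index = int(digit)
        return args[index - 1] if 1 <= index <= len(args) else ""

    return re.sub(r"(\$+)([0-9])?", replace, body)

# An outer macro whose body defines an inner macro: $$1 survives the outer
# expansion as $1, ready to be expanded when the inner macro is called.
outer_body = "define(`inner', `$$1') $1"
print(expand_m5_dollars(outer_body, ["x"]))  # define(`inner', `$1') x
```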

> 
>> I think that 2) is the only thing that should be completed before I feel
>> comfortable baselining m4-1.9b for wider test exposure on alpha.gnu.org.
> 
> I'm more than happy to see a 1.9b alpha release as soon as you feel ready,

It's not ready until I can run the autoconf testsuite with no regressions
with rudimentary extended argument support enabled.

> but a showstopper for 2.0 proper is that on a crummy machine that worked
> well with 1.4.x, I think 2.0 must have a default preloaded module build,
> that is 100% backwards compatible.  Any extensions can either be runtime
> loaded on a decent machine, or at worst have additional code configured in
> at build time on a static only architecture.
> 
> My dream is still to modularise the core enough that m4-2.0 will support
> different 'languages' (we need a better name for this -- I mean GNU M4
> being a 1.4.x compatible, M42 being the new stuff we're putting in that
> will break 1.4.x compatibility, M5 etc).

I'm still not sure that I'm breaking much 1.4.x compatibility.  And what
is breaking is justified by a move closer to POSIX, and the manual will
document how to easily restore the former behavior.  I agree that
we could go so far as to build a module 'm4-14x' vs the existing 'gnu'
module, so that from the command-line, 'm4 --gnu' loads all 2.0 features,
'm4 --14x' loads only features in 1.4.x, and 'm4' loads whichever of the
two modules was selected to be the default at ./configure time.  I just
think that the new features are backwards-compatible enough that we won't
have to distinguish between the two, or that anywhere semantics change
between the two, we also provide aids such as --warn-macro-sequence that
the user can enable to quickly find their problematic uses of the
ambiguous syntax, along with good documentation of how to get the desired
behavior, regardless of whether 1.4.x or 2.0 is parsing that code.

--
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden