m4-patches

Re: M4 syntax $11 vs. ${11}


From: Gary V . Vaughan
Subject: Re: M4 syntax $11 vs. ${11}
Date: Tue, 13 Mar 2007 05:47:57 +0000

Hi Eric,

On 1 Mar 2007, at 04:18, Eric Blake wrote:
Here (finally) is a patch for head, that both implements ${1}, as well as
forward ports --warn-macro-sequence from the branch.  The features are
intertwined enough that I didn't see any good reason to separate this into multiple patches (other than the earlier patches I already committed today).

Here (finally) is the review of the patch for HEAD :-) It's been a crazy
couple of weeks, apologies for the delay.

For ease of review and future CVS archaeology, it would be better to keep
patches as small and self-contained as possible, though: this is at least
3 patches (--warn-macro-sequence, POSIX $ syntax, regex function movement).
Unless the split falls naturally out of reworking this patch, don't worry
too much in this particular case.

However, I would like a review; so it is not applied yet. In particular, in macro.c, this patch does not do a deprecation period for $10, as I originally
thought above, but flat out went with POSIX syntax.  I did this on the
principle that relatively few uses of $10 in the wild have been discovered, and
that --warn-macro-sequence can be used to detect even those uses.

I think this is okay for the dev branch, so long as it also contains a TODO to make sure that the default build of released M4 will maintain 100% bugwards
compatibility with the gnu syntax of 1.4.x.

Still to be written:

1) I want to implement a new builtin m4macroseq([regex], [resyntax]), which behaves like the command-line --warn-macro-sequence (and that also means adding a command line --m4macroseq for symmetry). With no arguments, it enables the default warning sequence, with one empty argument, it disables warnings, and since it uses regular expressions, it takes an optional resyntax argument to
override the current changeresyntax.

I think this functionality belongs either in a module (I've been wanting to create a way for modules to add and parse command line options for some time), or maybe a separate helper script for upgraders. Either way, there is no need
to complicate the core with any of the above.

2) I have promised to implement changeextarg(start,[stop]), which allows multi-character extended arguments, so that autoconf can reserve ${1} for shell output and ${{1}} for the day that autoconf 3.0 depends on m4 2.0. I will model it on changequote, including how it interacts with single-character quote syntax in changesyntax, except that an argument always must be supplied (to go with the policy that macros not beginning with "__" or "m4" must be blind, to
avoid risk of inadvertent expansion).

I don't think this is necessary. I explained in an earlier post that I would like to make changesyntax support multicharacter elements, and remove as many
of the changexxxx macros as possible, rather than introducing more.

3) I would like to implement ideas from sh, such as ${1-default} expanding to
the first argument if supplied, or `default' if omitted.

Nice :-)
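
To make the sh-style idea in 3) concrete, here is a minimal sketch in C of how the text between `${' and `}' could be resolved. Everything in it is an assumption for illustration: the function name resolve_extended_arg, the argv layout (argv[0] is the macro name, as in m4), and the choice of sh's "unset only" rule, where a supplied but empty argument does not trigger the default. None of it is taken from the patch.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not from the patch.  INNER is the text between
   "${" and "}", e.g. "1" or "1-default".  ARGV[0] is the macro name and
   ARGV[1..ARGC-1] are the supplied arguments.  Writes the selected text
   to OUT and returns its length, or -1 on a syntax error or if OUT
   (capacity OUTSIZE) is too small.  */
int
resolve_extended_arg (const char *inner, int argc, const char *const *argv,
                      char *out, size_t outsize)
{
  const char *p = inner;
  const char *fallback = "";    /* used when the argument was omitted */
  const char *src;
  long index = 0;
  size_t n;

  assert (inner != NULL && out != NULL);
  if (!isdigit ((unsigned char) *p))
    return -1;
  while (isdigit ((unsigned char) *p))
    {
      if (index <= 100000)      /* stop growing; avoids overflow */
        index = index * 10 + (*p - '0');
      p++;
    }
  if (*p == '-')
    fallback = p + 1;           /* everything after '-' is the default */
  else if (*p != '\0')
    return -1;

  /* Like sh's ${1-default}: only an omitted argument selects the
     default; an argument supplied as the empty string is kept.  */
  src = index < argc ? argv[index] : fallback;
  n = strlen (src);
  if (n >= outsize)
    return -1;
  memcpy (out, src, n + 1);
  return (int) n;
}
```

With argv = {"foo", "x"}, the inner text "1-default" selects "x", while "2-default" selects "default" because no second argument was supplied.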

I think that 2) is the only thing that should be completed before I feel comfortable baselining m4-1.9b for wider test exposure on alpha.gnu.org.

Or rather, the POSIX syntax should be made a build-time option until we can modularise it enough that run-time switching between POSIX and GNU syntax is possible.

+*** The GNU 1.4.x extension of recognizing the sequence `$10' in macro
+    definitions as the tenth positional parameter is withdrawn, as it is
+    incompatible with POSIX. The sequence `$10' now correctly refers to
+    the first positional parameter concatenated with 0. To directly access
+    the tenth parameter, you must now use extended arguments (you can also
+    portably access the tenth argument indirectly using the `shift'
+    builtin). To detect places in existing scripts that might be affected
+    by this change in behavior, you can use the `--warn-macro-sequence'
+    command-line option.

Please add a FIXME: before 1.9b gnu syntax should be the default.
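
The difference between the two readings of `$10' can be sketched in C as follows. This is an illustrative model only, not code from the patch: the name expand_body, the gnu_compat flag and the fixed-size output buffer are all assumptions, and the sketch ignores quoting, `$#', `$*', `$@' and rescanning of the result.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append N bytes of S to OUT, keeping OUT NUL-terminated.
   Returns 0 on success, -1 if OUT (capacity OUTSIZE) is too small.  */
static int
append (char *out, size_t outsize, size_t *len, const char *s, size_t n)
{
  if (n >= outsize - *len)
    return -1;
  memcpy (out + *len, s, n);
  *len += n;
  out[*len] = '\0';
  return 0;
}

/* Hypothetical model of parameter substitution.  ARGV[0] is the macro
   name, ARGV[1..ARGC-1] the arguments.  With GNU_COMPAT nonzero, "$"
   consumes every following digit (1.4.x rule); otherwise exactly one
   digit (POSIX rule).  "${N}" always takes all digits.  */
int
expand_body (const char *body, int argc, const char *const *argv,
             int gnu_compat, char *out, size_t outsize)
{
  size_t len = 0;

  assert (outsize > 0);
  out[0] = '\0';
  while (*body != '\0')
    {
      const char *text = body;          /* default: copy one literal byte */
      size_t text_len = 1;
      const char *next = body + 1;

      if (*body == '$')
        {
          const char *d = body + 1;
          int braced = (*d == '{');
          if (braced)
            d++;
          if (isdigit ((unsigned char) *d))
            {
              long index = 0;
              do
                {
                  if (index <= 100000)  /* stop growing; avoids overflow */
                    index = index * 10 + (*d - '0');
                  d++;
                }
              while ((braced || gnu_compat) && isdigit ((unsigned char) *d));
              if (!braced || *d == '}')
                {
                  /* Omitted arguments expand to the empty string.  */
                  text = index < argc ? argv[index] : "";
                  text_len = strlen (text);
                  next = braced ? d + 1 : d;
                }
            }
        }
      if (append (out, outsize, &len, text, text_len) != 0)
        return -1;
      body = next;
    }
  return 0;
}
```

Given twelve arguments `a' through `l', the body `$11' yields `a1' under the POSIX rule and `k' under the 1.4.x rule, while `${11}' yields `k' in either mode.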

+*** POSIX allows implementations to assign arbitrary behavior to the sequence
+    `${' in macro definitions. All earlier versions of GNU M4 just treated
+    it as literal output, but this version introduces extended arguments.
+    By default, the sequence `${<digits>}' now represents the extended
+    argument referring to a positional parameter, so that it is still
+    possible to directly refer to more than nine arguments. If the older
+    1.4.x behavior of literal output is desired, the new `changesyntax' or
+    `changeextarg' builtins can be used to cripple extended arguments. To
+    detect places in existing scripts that might be affected by this change
+    in behavior, you can use the `--warn-macro-sequence' command-line
+    option.

Let's not mention changeextarg, unless we reach a point where we find there
is no cleaner way to implement it.

address@hidden address@hidden@address@hidden
address@hidden TODO: add m4macroseq builtin, and alias --m4macroseq
+Issue a warning if the regular expression @var{REGEXP} has a non-empty
+match in any macro definition (either by @code{define} or
address@hidden).  Empty matches are ignored; therefore, supplying the
+empty string as @var{REGEXP} disables any warning.  Otherwise,
address@hidden is compiled according to the current regular expression
+syntax.  If the optional @var{REGEXP} is not supplied, then a default
+regular expression is used, equivalent to
address@hidden(@address@hidden@}\|[0-9][0-9]+\)} in the @code{GNU_M4} regular
+expression flavor (a literal @samp{$} followed by multiple digits or by
+an open brace).  The default expression is chosen to detect the
+sequences that changed semantics in the default operation of
address@hidden M4 2.0 compared to earlier versions of GNU M4
+(@pxref{Extended Arguments}). Providing an alternate regular expression
+can provide a useful reverse lookup feature of finding where a macro is
+defined to have a given definition, or accomodate uses of
address@hidden that intentionally alter extended argument syntax.

Again, let's move this into a module, or an upgraders' helper script.

address@hidden requires that if multiple digits appear after @samp{$},
+the first digit is used to select the parameter, and the remaining
+digits are concatenated as literal text.  Earlier versions of
address@hidden M4 had an incompatible extension that would use all of
+the digits to reference beyond the ninth argument, but this was changed
+in M4 2.0. @xref{Extended Arguments}, for more details on this change.
+
address@hidden
+define(`foo', `$11')
address@hidden
+define(`a1', `hello')
address@hidden
+foo(`a', `b', `c', `d', `e', `f', `g', `h', `i', `j', `k', `l')
address@hidden
address@hidden example

Add this to a POSIX compliance module subsection instead, with other
adjustments as necessary to accommodate this idea. I won't point out the
other places in the patch that need to take this into account.

Index: m4/m4module.h
===================================================================
+/* The default sequence detects multi-digit parameters (obsolete after
+   1.4.x), and any use of extended arguments with the default ${}
+   syntax (new in 2.0).  */
+#define M4_DEFAULT_MACRO_SEQUENCE "\\$\\({[^}]*}\\|[0-9][0-9]+\\)"
+
+extern void    m4_macro_expand_input   (m4 *);
+extern void    m4_macro_call           (m4 *, m4_symbol_value *,
+                                         m4_obstack *, int,
+                                         m4_symbol_value **);
+extern void    m4_set_macro_sequence   (m4 *, const char *, int,
+                                         const char *);
+extern void    m4_free_macro_sequence  (m4 *);
+extern m4_symbol_value *m4_macro_define (m4 *, const char *, const char *,
+                                         bool);
+extern void m4_check_macro_sequence (m4 *, const char *, const char *,
+                                         const char *);

Put the new functions in a loadable module instead.
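
For anyone who wants to try the default sequence outside of m4 (for example in an upgraders' helper), the check can be approximated with the POSIX <regex.h> interface. The sketch below is an assumption-laden translation, not the patch's implementation: the pattern is rewritten from the GNU_M4 flavor into a POSIX extended regular expression (plain parentheses and `|' for grouping and alternation, and bracket expressions for the literal braces), and the function name definition_needs_warning is hypothetical.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Assumed POSIX ERE equivalent of M4_DEFAULT_MACRO_SEQUENCE: a literal
   '$' followed either by "{...}" or by two or more digits.  */
#define MACRO_SEQUENCE_ERE "\\$([{][^}]*[}]|[0-9][0-9]+)"

/* Hypothetical helper: return 1 if DEFINITION contains a sequence whose
   meaning changed, 0 if it does not, and -1 if the pattern fails to
   compile.  */
int
definition_needs_warning (const char *definition)
{
  regex_t re;
  int rc;

  assert (definition != NULL);
  if (regcomp (&re, MACRO_SEQUENCE_ERE, REG_EXTENDED | REG_NOSUB) != 0)
    return -1;
  rc = regexec (&re, definition, 0, NULL, 0);
  regfree (&re);
  return rc == 0 ? 1 : 0;
}
```

A definition such as `$11' or `${1}' is flagged, whereas `$1, $2' is not. Compiling the pattern on every call keeps the sketch short; a real checker would compile once and reuse the result.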

+/* The regs_allocated field in an re_pattern_buffer refers to the
+   state of the re_registers struct used in successive matches with
+   the same compiled pattern.  */
+
+typedef struct {
+  struct re_pattern_buffer pat;        /* compiled regular expression */
+  struct re_registers regs;    /* match registers */
+} m4_pattern_buffer;
+
 extern const char *    m4_regexp_syntax_decode (int);
 extern int             m4_regexp_syntax_encode (const char *);
+extern m4_pattern_buffer *m4_regexp_compile    (m4 *, const char *,
+                                                 const char *, int,
+                                                 bool, m4_pattern_buffer *);
+extern void            m4_regexp_free          (m4_pattern_buffer *);

Please don't do that!  There is module entry point export/import code in
the m4 module API... by moving the macro_sequence stuff into a module,
we can keep the regex code out of the core (important for people who
would like to build a tiny non-gnu m4). Worst case, the regex code might
need to go in a module of its own so that either the gnu module or my
proposed new macro_sequence module can each require it independently.

@@ -463,19 +501,18 @@
          break;

        default:
-         if (m4_get_posixly_correct_opt (context)
-             || !VALUE_ARG_SIGNATURE (value))
-           {
-             obstack_1grow (obs, ch);
-           }
-         else
+         if (VALUE_ARG_SIGNATURE (value))
            {
+             /* TODO - VALUE_ARG_SIGNATURE is not fully implemented.
+                Is it worth killing this as dead code, and figuring
+                out how to use extended arguments to do what was
+                originally envisioned by VALUE_ARG_SIGNATURE?  */

Yes, possibly -- or integrating the two. For the record, VALUE_ARG_SIGNATURE
accesses a hash table of parameter names to values, built when the macro is
defined and referenced when ${argname} is expanded.  That is, however, a
different patch.
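
As a rough illustration of the lookup described above, here is a sketch in C. It is not the VALUE_ARG_SIGNATURE code: the types arg_slot and arg_signature and the function lookup_named_arg are invented for this example, and a small array with a linear search stands in for the hash table.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a hypothetical signature: a parameter name and the
   1-based position it is bound to.  */
typedef struct {
  const char *name;
  int index;
} arg_slot;

/* The table built once, when the macro is defined.  */
typedef struct {
  const arg_slot *slots;
  size_t count;
} arg_signature;

/* Resolve ${NAME} at expansion time.  Returns the bound argument, ""
   if NAME is in the signature but the argument was omitted, or NULL
   if NAME is not part of the signature at all.  ARGV[0] is the macro
   name, ARGV[1..ARGC-1] the arguments.  */
const char *
lookup_named_arg (const arg_signature *sig, const char *name,
                  int argc, const char *const *argv)
{
  size_t i;

  assert (sig != NULL && name != NULL);
  for (i = 0; i < sig->count; i++)
    if (strcmp (sig->slots[i].name, name) == 0)
      {
        int index = sig->slots[i].index;
        return (index > 0 && index < argc) ? argv[index] : "";
      }
  return NULL;
}
```

Returning NULL for an unknown name leaves the caller free to decide whether `${unknown}' should be copied through as literal text or diagnosed.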

We need to be sure that defn correctly passes the contents of the macro
signature around too.

Another thing this was leading towards is maintaining enough details about
the macro arguments here that m4_define'd macros would also be able to
take advantage of the automatic checking for insufficient or excess arguments
that builtins currently have.
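
The kind of check meant here could look like the following sketch. The names arg_check and check_arg_count are hypothetical, as is the convention that a negative maximum means "no upper limit"; the point is only that once a define'd macro records how many parameters it expects, the same test applied to builtins can be reused.

```c
#include <assert.h>

typedef enum {
  ARGS_OK,
  ARGS_TOO_FEW,
  ARGS_TOO_MANY
} arg_check;

/* Hypothetical check.  ARGC counts the macro name itself, as in m4, so
   the number of supplied arguments is ARGC - 1.  MAX_ARGS < 0 means
   there is no upper limit.  */
arg_check
check_arg_count (int argc, int min_args, int max_args)
{
  int supplied = argc - 1;

  assert (argc >= 1 && min_args >= 0);
  if (supplied < min_args)
    return ARGS_TOO_FEW;
  if (max_args >= 0 && supplied > max_args)
    return ARGS_TOO_MANY;
  return ARGS_OK;
}
```

A macro whose signature names two parameters would be checked with min_args and max_args both set to 2, so calling it with one or three arguments could produce a warning.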

The implementation looks fine (location aside (; ).  Please add some
thorough tests to strictly define how the feature is supposed to work,
especially in the corner cases you mentioned.

Cheers,
        Gary
--
  ())_.              Email me: address@hidden
  ( '/           Read my blog: http://blog.azazil.net
  / )=         ...and my book: http://sources.redhat.com/autobook
`(_~)_ Join my AGLOCO Network: http://www.agloco.com/r/BBBS7912



