m4-patches

argv_ref patch 22: allow concatenation of builtin tokens


From: Eric Blake
Subject: argv_ref patch 22: allow concatenation of builtin tokens
Date: Mon, 05 May 2008 21:55:17 -0600
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.14) Gecko/20080421 Thunderbird/2.0.0.14 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Next round of porting.  The last patch made it possible to use $@ to
concatenate a builtin token (the output of the defn macro) with ordinary
text; this patch extends the notion to allow concatenation in more
contexts.  There are still places that silently ignore builtin tokens, but
this undoes the need for a lot of the warnings added on 2007-11-14 for
actions that were previously impossible in the framework.  There is a
slight memory and speed penalty, since builtin tokens now last longer, but
nothing worth worrying about.  It also fixes a bug in the master branch
left over from an incomplete port of stage 21, detected when the xfail for
one of the defn tests was removed.
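
As a rough sketch of what concatenation means here (this is not the code
in the patch), a token can be modeled as a chain of links, each holding
either text or a reference to a builtin.  Every name below (link, token,
token_append, token_is_lone_builtin, token_render, token_free) is
hypothetical and chosen only for illustration; the patch itself works on
m4_symbol_value and m4__symbol_chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical link: text when FUNC is NULL, otherwise a builtin name.  */
typedef struct link
{
  struct link *next;
  const char *text;
  const char *func;
} link;

/* Hypothetical token: a chain of links, in order of appearance.  */
typedef struct
{
  link *head;
  link *tail;
} token;

/* Append one link holding either TEXT or the builtin named FUNC.  */
void
token_append (token *t, const char *text, const char *func)
{
  link *l = calloc (1, sizeof *l);
  if (!l)
    abort ();
  l->text = text;
  l->func = func;
  if (t->tail)
    t->tail->next = l;
  else
    t->head = l;
  t->tail = l;
}

/* True if the chain is exactly one builtin link, in which case it could
   be treated as a plain builtin token again.  */
bool
token_is_lone_builtin (const token *t)
{
  return t->head != NULL && t->head == t->tail && t->head->func != NULL;
}

/* Render the chain into BUF, showing builtins as <name>.  Output is
   truncated if BUF is too small.  */
void
token_render (const token *t, char *buf, size_t size)
{
  size_t used = 0;
  assert (size > 0);
  buf[0] = '\0';
  for (const link *l = t->head; l && used < size - 1; l = l->next)
    {
      if (l->func)
        snprintf (buf + used, size - used, "<%s>", l->func);
      else
        snprintf (buf + used, size - used, "%s", l->text);
      used += strlen (buf + used);
    }
}

/* Release every link in the chain.  */
void
token_free (token *t)
{
  while (t->head)
    {
      link *next = t->head->next;
      free (t->head);
      t->head = next;
    }
  t->tail = NULL;
}
```

Appending the text A, then divnum as a builtin, then A again renders as
A<divnum>A, while a chain holding only the divnum link is reported as a
lone builtin.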

It is still not possible to concatenate builtin tokens when calling
define, primarily because I have no idea how to gracefully handle the
expansion of such a macro in GNU's framework.  Logically, it
seems like allowing a builtin token in the middle of a definition would
allow the user to avoid warnings:

define(`mydefn', `ifdef(`$1','defn(`defn')`)')

but that means that the builtin token for defn, when encountered in the
macro expansion of mydefn, must remember its $1 argument but not have any
result unless the ifdef chooses the branch containing the builtin.  And
it's not too much of a hardship, since the user can always do the same
thing without resorting to a builtin token:

define(`mydefn', `ifdef(`$1', `defn(`$1')')')

Solaris m4 allows simple uses like define(`foo', `a'defn(`divnum')), but I
haven't checked how it handles builtins that require arguments to be
useful.  There's also the question of what to do if two builtins with
clashing argument specifications are concatenated, as in defn(`divnum',`len').

So, this patch just leaves defining builtin concatenations as an
impossible task with room for future change, and issues a warning before
flattening.
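
The "warn, then flatten" behavior can be sketched roughly as follows.
This is an illustration under stated assumptions, not the patch's code:
the piece type and both function names are hypothetical, the case of a
lone builtin (which define still accepts) is not modeled, and only the
warning text is taken from the documentation hunk in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical piece of a would-be definition: plain text, or a
   builtin reference when TEXT is NULL.  */
typedef struct
{
  const char *text;
  const char *builtin_name;
} piece;

/* Copy only the text pieces into BUF, dropping builtins.  Return true
   if at least one builtin was dropped, so the caller can warn.  */
bool
flatten_pieces (const piece *pieces, size_t count, char *buf, size_t size)
{
  bool flattened = false;
  size_t used = 0;
  assert (size > 0);
  buf[0] = '\0';
  for (size_t i = 0; i < count; i++)
    {
      if (!pieces[i].text)
        {
          flattened = true;
          continue;
        }
      size_t len = strlen (pieces[i].text);
      if (len > size - 1 - used)
        len = size - 1 - used;  /* Truncate rather than overflow.  */
      memcpy (buf + used, pieces[i].text, len);
      used += len;
      buf[used] = '\0';
    }
  return flattened;
}

/* Build the definition text in BUF, warning on stderr if any builtin
   had to be ignored.  Return true if a warning was issued.  */
bool
define_from_pieces (const piece *pieces, size_t count, char *buf,
                    size_t size)
{
  bool warned = flatten_pieces (pieces, count, buf, size);
  if (warned)
    fprintf (stderr, "Warning: define: cannot concatenate builtins\n");
  return warned;
}
```

Returning a flag from the flattening step, rather than warning inside it,
leaves the decision about the message to the caller.  That appears to be
the same division of labor as m4_symbol_value_copy now returning bool
while define and pushdef issue the warning.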

2008-05-05  Eric Blake  <address@hidden>

        Stage 22: allow builtin token concatenation outside address@hidden
        Adjust the input and argument parsing engines to append builtins
        alongside text.  Make define warn when builtins must be
        flattened.
        Memory impact: slight penalty, with fewer builtins flattened.
        Speed impact: slight penalty, from more bookkeeping.
        * src/m4.h (arg_text): Add parameter.
        (ARG): Adjust callers.
        * src/input.c (init_macro_token): Add parameter.
        (next_token): Support concatenating builtins.
        * src/macro.c (warn_builtin_concat): Delete warning.
        (expand_argument, arg_adjust_refcount): Handle builtin tokens.
        (arg_text): Add parameter.
        (arg_print): Adjust caller.
        * src/builtin.c (define_macro): Flatten builtins, rather than
        doing nothing.
        (defn): Warn on undefined macro name.
        * src/m4.c (main): Avoid atoi.
        * src/output.c: Whitespace fixes.
        * doc/m4.texinfo (Defn): Document the new semantics.
        (Ifelse, Debug Levels, M4wrap): Enhance tests.
        * NEWS: Document this change.

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkgf1qUACgkQ84KuGfSFAYBC+wCgqlB6lw2DiE7d6f6CRpmZy/hL
k4oAoIKLeDDdq0EhoA4E8GhfEbQfk3a5
=LAk3
-----END PGP SIGNATURE-----
From c2a2811a8b81dac7b090dcd6f584742fed6dd085 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Sat, 3 May 2008 15:22:23 -0600
Subject: [PATCH] Stage 22: allow builtin token concatenation outside address@hidden

* m4/m4module.h (m4_is_arg_composite): New prototype.
(m4_symbol_value_copy): Change return type.
(m4_arg_text): Add parameter.
(M4ARG): Adjust callers.
* m4/m4private.h: Adjust comments.
* m4/symtab.c (m4_symbol_value_copy): Detect when builtins are
flattened.
* m4/input.c (init_builtin_token): Add parameter, and allow
concatenating builtins.
(m4__next_token): Adjust caller.
* m4/macro.c (m4_is_arg_composite): New function.
(expand_argument): Allow builtin concatenation.
(m4_arg_text): Add parameter.
(m4__arg_adjust_refcount, m4__arg_print): Adjust callers.
(m4_arg_equal): Fix comparison of builtin tokens.
* modules/m4.c (define, pushdef): Warn when flattening builtins.
* doc/m4.texinfo (Define): Remove dead comment.
(Defn): Update to reflect code changes.
* tests/builtins.at (defn): Remove xfail.
* NEWS: Document this change.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog         |   29 ++++++++++++
 NEWS              |   14 ++++++
 doc/m4.texinfo    |   93 +++++++++++++++++++++++---------------
 m4/input.c        |  129 ++++++++++++++++++-----------------------------------
 m4/m4module.h     |    7 ++-
 m4/m4private.h    |    2 +-
 m4/macro.c        |   69 +++++++++++++++++++---------
 m4/symtab.c       |   13 +++++-
 modules/m4.c      |    6 ++-
 tests/builtins.at |    5 --
 10 files changed, 211 insertions(+), 156 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index cd1f927..3470af1 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,32 @@
+2008-05-05  Eric Blake  <address@hidden>
+
+       Stage 22: allow builtin token concatenation outside address@hidden
+       Adjust the input and argument parsing engines to append builtins
+       alongside text.  Make define warn when builtins must be
+       flattened.
+       Memory impact: slight penalty, with fewer builtins flattened.
+       Speed impact: slight penalty, from more bookkeeping.
+       * m4/m4module.h (m4_is_arg_composite): New prototype.
+       (m4_symbol_value_copy): Change return type.
+       (m4_arg_text): Add parameter.
+       (M4ARG): Adjust callers.
+       * m4/m4private.h: Adjust comments.
+       * m4/symtab.c (m4_symbol_value_copy): Detect when builtins are
+       flattened.
+       * m4/input.c (init_builtin_token): Add parameter, and allow
+       concatenating builtins.
+       (m4__next_token): Adjust caller.
+       * m4/macro.c (m4_is_arg_composite): New function.
+       (expand_argument): Allow builtin concatenation.
+       (m4_arg_text): Add parameter.
+       (m4__arg_adjust_refcount, m4__arg_print): Adjust callers.
+       (m4_arg_equal): Fix comparison of builtin tokens.
+       * modules/m4.c (define, pushdef): Warn when flattening builtins.
+       * doc/m4.texinfo (Define): Remove dead comment.
+       (Defn): Update to reflect code changes.
+       * tests/builtins.at (defn): Remove xfail.
+       * NEWS: Document this change.
+
 2008-05-03  Eric Blake  <address@hidden>
 
        Document define_blind.
diff --git a/NEWS b/NEWS
index 205c651..35440ee 100644
--- a/NEWS
+++ b/NEWS
@@ -216,6 +216,10 @@ promoted to 2.0.
    using `builtin' or `indir' to perform nested `shift' calls triggered an
    assertion failure.
 
+** Fix regression introduced in 1.4.10b (but not present in 1.4.11) where
+   the command-line option -dV, as well as the builtin `debugmode(V)',
+   failed to enable `t' and `c' debug options.
+
 ** Fix the `m4wrap' builtin to accumulate wrapped text in FIFO order, as
    required by POSIX.  The manual mentions a way to restore the LIFO order
    present in earlier GNU M4 versions.  NOTE: this change exposes a bug
@@ -236,9 +240,19 @@ promoted to 2.0.
    then apply this patch:
      http://git.sv.gnu.org/gitweb/?p=autoconf.git;a=commitdiff;h=56d42fa71
 
+** The `defn' builtin now warns when operating on an undefined macro name.
+   To simulate 1.4.x behavior, use:
+     pushdef(`defn', `ifdef(`$1', `builtin(`defn', `$1')')')
+
 ** Enhance the `ifdef', `ifelse', and `shift' builtins, as well as all
    user macros, to transparently handle builtin tokens generated by `defn'.
 
+** Allow the concatenation of builtin macros with arbitrary text in
+   several contexts, via the `defn' builtin or argument expansion, rather
+   than warning and converting the builtin token to an empty string.
+   However, it is still not possible to use a concatenated builtin when
+   defining a macro.
+
 ** Enhance the `defn', `dumpdef', `ifdef', `popdef', `traceon', `traceoff',
    and `undefine' macros to warn when encountering a builtin token in the
    context of a macro name, rather than acting on the empty string.  This
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 75d7fc8..e446765 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -1798,6 +1798,23 @@ Defines @var{name} to expand to @var{expansion}.  If
 The expansion of @code{define} is void.
 The macro @code{define} is recognized only with parameters.
 @end deffn
address@hidden Other implementations, such as Solaris, can define a macro
address@hidden with a builtin token attached to text:
address@hidden  define(foo, a`'defn(`divnum')b)
address@hidden  defn(`foo') => ab
address@hidden  dumpdef(`foo') => foo: a<divnum>b
address@hidden  len(defn(`foo')) => 3
address@hidden  index(defn(`foo'), defn(`divnum')) => 1
address@hidden  foo => a0b
address@hidden It may be worth making some changes to support this behavior,
address@hidden or something similar to it.
address@hidden
address@hidden But be sure it has sane semantics, with potentially deferred
address@hidden expansion of builtins.  For example, this should not warn
address@hidden about trying to access the definition of an undefined macro:
address@hidden  define(`foo', `ifdef(`$1', 'defn(`defn')`)')foo(`oops')
address@hidden Also, think how to handle conflicting argument counts:
address@hidden  define(`bar', defn(`dnl', `len'))
 
 The following example defines the macro @var{foo} to expand to the text
 @samp{Hello World.}.
@@ -1834,13 +1851,6 @@ definition of a macro if it has several definitions from @code{pushdef}
 (@pxref{Pushdef}).  Some other implementations of @code{m4} replace all
 definitions of a macro with @code{define}.  @xref{Incompatibilities},
 for more details.
address@hidden FIXME - See Austin group XCU ERN 118; this is considered
address@hidden ambiguous in the current version of POSIX.  The best thing to
address@hidden do here would probably be keep GNU semantics of popdef/pushdef
address@hidden in the m4 module unconditionally, then have a shadow builtin in
address@hidden the traditional module that does the undefine/pushdef
address@hidden semantics, rather than our current keying off of
address@hidden POSIXLY_CORRECT within the m4 module.
 
 As a @acronym{GNU} extension, the first argument to @code{define} does
 not have to be a simple word.
@@ -2175,19 +2185,9 @@ empty and triggers a warning.
 If @var{name} is a user-defined macro, the quoted definition is simply
 the quoted expansion text.  If, instead, @var{name} is a builtin, the
 expansion is a special token, which points to the builtin's internal
-definition.  This token is only meaningful as the second argument to
+definition.  This token is meaningful primarily as the second argument to
 @code{define} (and @code{pushdef}), and is silently converted to an
-empty string in most other contexts.
address@hidden FIXME - Other implementations, such as Solaris, can pass a
address@hidden builtin token around to other macros, flattening it only on output:
address@hidden  define(foo, a`'defn(`divnum')b)
address@hidden  defn(`foo') => ab
address@hidden  dumpdef(`foo') => foo: a<divnum>b
address@hidden  len(defn(`foo')) => 3
address@hidden  index(defn(`foo'), defn(`divnum')) => 1
address@hidden  foo => a0b
address@hidden It may be worth making some changes to support this behavior,
address@hidden or something similar to it.
+empty string in many other contexts.
 
 The macro @code{defn} is recognized only with parameters.
 @end deffn
@@ -2349,28 +2349,49 @@ bar
 @result{}0
 @end example
 
-A warning is issued if @var{name} is undefined.  Also, at present,
-concatenating a builtin token with anything else is not supported as a
-macro definition, and a warning is issued.
address@hidden FIXME - handle defining macros with mixed text and builtins.
+A warning is issued if @var{name} is undefined.  Also note that as of M4
+1.6, @code{defn} with multiple arguments can join text with builtin
+tokens.  However, when defining a macro via @code{define} or
address@hidden, a warning is issued and the builtin token ignored if the
+builtin token does not occur in isolation.  A future version of
address@hidden M4 may lift this restriction.
 
address@hidden xfail
 @example
+$ @kbd{m4 -d}
 defn(`foo')
 @error{}m4:stdin:1: Warning: defn: undefined macro `foo'
 @result{}
-define(`echo', `$@@')
+define(`a', `A')define(`AA', `b')
 @result{}
-define(`foo', `a')
+traceon(`defn', `define')
 @result{}
-define(`bar', defn(`foo', `divnum'))
+defn(`a', `divnum', `a')
address@hidden: -1- defn(`a', `divnum', `a') -> ``A'<divnum>`A''
address@hidden
+define(`mydivnum', defn(`divnum', `divnum'))mydivnum
address@hidden: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
address@hidden:stdin:5: Warning: define: cannot concatenate builtins
address@hidden: -1- define(`mydivnum', `<divnum><divnum>') -> `'
 @result{}
-define(`blah', echo(defn(`divnum', `foo')))
+traceoff(`defn', `define')dumpdef(`mydivnum')
address@hidden:@tabchar{}`'
 @result{}
-bar
address@hidden
-blah
address@hidden
+define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
address@hidden:stdin:7: Warning: define: cannot concatenate builtins
address@hidden
+define(`mydivnum', defn(`divnum')`a')mydivnum
address@hidden:stdin:8: Warning: define: cannot concatenate builtins
address@hidden
+define(`mydivnum', `a'defn(`divnum'))mydivnum
address@hidden:stdin:9: Warning: define: cannot concatenate builtins
address@hidden
+define(`q', ``$@@'')
address@hidden
+define(`foo', q(`a', defn(`divnum')))foo
address@hidden:stdin:11: Warning: define: cannot concatenate builtins
address@hidden,
+ifdef(`foo', `yes', `no')
address@hidden
 @end example
 
 @node Pushdef
diff --git a/m4/input.c b/m4/input.c
index 0f48768..1f89916 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -114,7 +114,8 @@ static      int     eof_read                (m4_input_block *, m4 *, bool, bool,
                                         bool);
 static void    eof_unget               (m4_input_block *, int);
 
-static void    init_builtin_token      (m4 *, m4_symbol_value *);
+static void    init_builtin_token      (m4 *, m4_obstack *,
+                                        m4_symbol_value *);
 static void    append_quote_token      (m4 *, m4_obstack *,
                                         m4_symbol_value *);
 static bool    match_input             (m4 *, const char *, bool);
@@ -449,18 +450,18 @@ m4_push_string_init (m4 *context)
    rather than copying everything consecutively onto the input stack.
    Must be called between push_string_init and push_string_finish.
 
-   If VALUE contains text, then convert the current input block into a
-   chain if it is not one already, and add the contents of VALUE as a
-   new link in the chain.  LEVEL describes the current expansion
-   level, or SIZE_MAX if VALUE is composite, its contents reside
-   entirely on the current_input stack, and VALUE lives in temporary
-   storage.  If VALUE is a simple string, then it belongs to the
-   current macro expansion.  If VALUE is composite, then each text
-   link has a level of SIZE_MAX if it belongs to the current macro
-   expansion, otherwise it is a back-reference where level tracks
-   which stack it came from.  The resulting input block chain contains
-   links with a level of SIZE_MAX if the text belongs to the input
-   stack, otherwise the level where the back-reference comes from.
+   Convert the current input block into a chain if it is not one
+   already, and add the contents of VALUE as a new link in the chain.
+   LEVEL describes the current expansion level, or SIZE_MAX if VALUE
+   is composite, its contents reside entirely on the current_input
+   stack, and VALUE lives in temporary storage.  If VALUE is a simple
+   string, then it belongs to the current macro expansion.  If VALUE
+   is composite, then each text link has a level of SIZE_MAX if it
+   belongs to the current macro expansion, otherwise it is a
+   back-reference where level tracks which stack it came from.  The
+   resulting input block chain contains links with a level of SIZE_MAX
+   if the text belongs to the input stack, otherwise the level where
+   the back-reference comes from.
 
    Return true only if a reference was created to the contents of
    VALUE, in which case, LEVEL is less than SIZE_MAX and the lifetime
@@ -1122,18 +1123,36 @@ m4_pop_wrapup (m4 *context)
 }
 
 /* Populate TOKEN with the builtin token at the top of the input
-   stack, then consume the input.  If TOKEN is NULL, discard the
-   builtin token instead.  */
+   stack, then consume the input.  If OBS, TOKEN will be converted to
+   a composite token using storage from OBS as necessary; otherwise,
+   if TOKEN is NULL, the builtin token is discarded.  */
 static void
-init_builtin_token (m4 *context, m4_symbol_value *token)
+init_builtin_token (m4 *context, m4_obstack *obs, m4_symbol_value *token)
 {
   m4__symbol_chain *chain;
   assert (isp->funcs == &composite_funcs);
   chain = isp->u.u_c.chain;
   assert (!chain->quote_age && chain->type == M4__CHAIN_FUNC
          && chain->u.builtin);
-  if (token)
-    m4__set_symbol_value_builtin (token, chain->u.builtin);
+  if (obs)
+    {
+      assert (token);
+      if (token->type == M4_SYMBOL_VOID)
+       {
+         token->type = M4_SYMBOL_COMP;
+         token->u.u_c.chain = token->u.u_c.end = NULL;
+         token->u.u_c.wrapper = false;
+         token->u.u_c.has_func = false;
+       }
+      assert (token->type == M4_SYMBOL_COMP);
+      m4__append_builtin (obs, chain->u.builtin, &token->u.u_c.chain,
+                         &token->u.u_c.end);
+    }
+  else if (token)
+    {
+      assert (token->type == M4_SYMBOL_VOID);
+      m4__set_symbol_value_builtin (token, chain->u.builtin);
+    }
   chain->u.builtin = NULL;
 }
 
@@ -1535,7 +1554,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
 
     if (ch == CHAR_BUILTIN)            /* BUILTIN TOKEN */
       {
-       init_builtin_token (context, token);
+       init_builtin_token (context, obs, token);
 #ifdef DEBUG_INPUT
        m4_print_token (context, "next_token", M4_TOKEN_MACDEF, token);
 #endif
@@ -1590,34 +1609,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in string"));
            if (ch == CHAR_BUILTIN)
-             {
-               /* TODO support concatenation of builtins.  */
-               if (obstack_object_size (obs_safe) == 0
-                   && token->type == M4_SYMBOL_VOID)
-                 {
-                   /* Strip quotes if they surround a lone builtin
-                      token.  */
-                   assert (quote_level == 1);
-                   init_builtin_token (context, token);
-                   ch = peek_char (context, false);
-                   if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_RQUOTE))
-                     {
-                       ch = next_char (context, false, false, false);
-#ifdef DEBUG_INPUT
-                       m4_print_token (context, "next_token", M4_TOKEN_MACDEF,
-                                       token);
-#endif
-                       return M4_TOKEN_MACDEF;
-                     }
-                   token->type = M4_SYMBOL_VOID;
-                 }
-               else
-                 init_builtin_token (context, NULL);
-               m4_warn_at_line (context, 0, file, *line, caller,
-                                _("cannot quote builtin"));
-               continue;
-             }
-           if (ch == CHAR_QUOTE)
+             init_builtin_token (context, obs, obs ? token : NULL);
+           else if (ch == CHAR_QUOTE)
              append_quote_token (context, obs, token);
            else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_RQUOTE))
              {
@@ -1649,36 +1642,8 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
              m4_error_at_line (context, EXIT_FAILURE, 0, file, *line, caller,
                                _("end of file in string"));
            if (ch == CHAR_BUILTIN)
-             {
-               /* TODO support concatenation of builtins.  */
-               if (obstack_object_size (obs_safe) == 0
-                   && token->type == M4_SYMBOL_VOID)
-                 {
-                   /* Strip quotes if they surround a lone builtin
-                      token.  */
-                   assert (quote_level == 1);
-                   init_builtin_token (context, token);
-                   ch = peek_char (context, false);
-                   if (MATCH (context, ch, context->syntax->quote.str2,
-                              false))
-                     {
-                       ch = next_char (context, false, false, false);
-                       MATCH (context, ch, context->syntax->quote.str2, true);
-#ifdef DEBUG_INPUT
-                       m4_print_token (context, "next_token", M4_TOKEN_MACDEF,
-                                       token);
-#endif
-                       return M4_TOKEN_MACDEF;
-                     }
-                   token->type = M4_SYMBOL_VOID;
-                 }
-               else
-                 init_builtin_token (context, NULL);
-               m4_warn_at_line (context, 0, file, *line, caller,
-                                _("cannot quote builtin"));
-               continue;
-             }
-           if (MATCH (context, ch, context->syntax->quote.str2, true))
+             init_builtin_token (context, obs, obs ? token : NULL);
+           else if (MATCH (context, ch, context->syntax->quote.str2, true))
              {
                if (--quote_level == 0)
                  break;
@@ -1708,10 +1673,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                                _("end of file in comment"));
            if (ch == CHAR_BUILTIN)
              {
-               /* TODO support concatenation of builtins.  */
-               m4_warn_at_line (context, 0, file, *line, caller,
-                                _("cannot comment builtin"));
-               init_builtin_token (context, NULL);
+               init_builtin_token (context, NULL, NULL);
                continue;
              }
            if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_ECOMM))
@@ -1740,10 +1702,7 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                                _("end of file in comment"));
            if (ch == CHAR_BUILTIN)
              {
-               /* TODO support concatenation of builtins.  */
-               m4_warn_at_line (context, 0, file, *line, caller,
-                                _("cannot comment builtin"));
-               init_builtin_token (context, NULL);
+               init_builtin_token (context, NULL, NULL);
                continue;
              }
            if (MATCH (context, ch, context->syntax->comm.str2, true))
diff --git a/m4/m4module.h b/m4/m4module.h
index 5b5e01b..21a8339 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -153,7 +153,7 @@ struct m4_string_pair
 /* Grab the text contents of argument I, or abort if the argument is
    not text.  Assumes that `m4 *context' and `m4_macro_args *argv' are
    in scope.  */
-#define M4ARG(i) m4_arg_text (context, argv, i)
+#define M4ARG(i) m4_arg_text (context, argv, i, false)
 
 /* Grab the length of the text contents of argument I, or abort if the
    argument is not text.  Assumes that `m4 *context' and
@@ -312,7 +312,7 @@ extern bool m4_symbol_value_flatten_args (m4_symbol_value *);
 
 extern m4_symbol_value *m4_symbol_value_create   (void);
 extern void            m4_symbol_value_delete    (m4_symbol_value *);
-extern void            m4_symbol_value_copy      (m4 *, m4_symbol_value *,
+extern bool            m4_symbol_value_copy      (m4 *, m4_symbol_value *,
                                                   m4_symbol_value *);
 extern bool            m4_is_symbol_value_text   (m4_symbol_value *);
 extern bool            m4_is_symbol_value_func   (m4_symbol_value *);
@@ -352,7 +352,8 @@ extern size_t       m4_arg_argc             (m4_macro_args *);
 extern m4_symbol_value *m4_arg_symbol  (m4_macro_args *, size_t);
 extern bool    m4_is_arg_text          (m4_macro_args *, size_t);
 extern bool    m4_is_arg_func          (m4_macro_args *, size_t);
-extern const char *m4_arg_text         (m4 *, m4_macro_args *, size_t);
+extern bool    m4_is_arg_composite     (m4_macro_args *, size_t);
+extern const char *m4_arg_text         (m4 *, m4_macro_args *, size_t, bool);
 extern bool    m4_arg_equal            (m4 *, m4_macro_args *, size_t,
                                         size_t);
 extern bool    m4_arg_empty            (m4_macro_args *, size_t);
diff --git a/m4/m4private.h b/m4/m4private.h
index 48a0075..7e1f7a8 100644
--- a/m4/m4private.h
+++ b/m4/m4private.h
@@ -532,7 +532,7 @@ typedef enum {
   M4_TOKEN_COMMA,      /* Argument separator, M4_SYMBOL_TEXT.  */
   M4_TOKEN_CLOSE,      /* Argument list end, M4_SYMBOL_TEXT.  */
   M4_TOKEN_SIMPLE,     /* Single character, M4_SYMBOL_TEXT.  */
-  M4_TOKEN_MACDEF,     /* Macro's definition (see "defn"), M4_SYMBOL_FUNC.  */
+  M4_TOKEN_MACDEF,     /* Builtin token, M4_SYMBOL_FUNC or M4_SYMBOL_COMP.  */
   M4_TOKEN_ARGV                /* A series of parameters, M4_SYMBOL_COMP.  */
 } m4__token_type;
 
diff --git a/m4/macro.c b/m4/macro.c
index e58e657..f5bec18 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -327,15 +327,10 @@ expand_argument (m4 *context, m4_obstack *obs, m4_symbol_value *argp,
        case M4_TOKEN_CLOSE:
          if (paren_level == 0)
            {
-             /* FIXME - For now, we match the behavior of the branch,
-                except we don't issue warnings.  But in the future,
-                we want to allow concatenation of builtins and
-                text.  */
-             len = obstack_object_size (obs);
-             if (argp->type == M4_SYMBOL_FUNC && !len)
-               return type == M4_TOKEN_COMMA;
+             assert (argp->type != M4_SYMBOL_FUNC);
              if (argp->type != M4_SYMBOL_COMP)
                {
+                 len = obstack_object_size (obs);
                  VALUE_MODULE (argp) = NULL;
                  if (len)
                    {
@@ -347,7 +342,16 @@ expand_argument (m4 *context, m4_obstack *obs, m4_symbol_value *argp,
                    m4_set_symbol_value_text (argp, "", len, 0);
                }
              else
-               m4__make_text_link (obs, NULL, &argp->u.u_c.end);
+               {
+                 m4__make_text_link (obs, NULL, &argp->u.u_c.end);
+                 if (argp->u.u_c.chain == argp->u.u_c.end
+                     && argp->u.u_c.chain->type == M4__CHAIN_FUNC)
+                   {
+                     const m4__builtin *func = argp->u.u_c.chain->u.builtin;
+                     argp->type = M4_SYMBOL_FUNC;
+                     argp->u.builtin = func;
+                   }
+               }
              return type == M4_TOKEN_COMMA;
            }
          /* fallthru */
@@ -369,6 +373,7 @@ expand_argument (m4 *context, m4_obstack *obs, m4_symbol_value *argp,
        case M4_TOKEN_WORD:
        case M4_TOKEN_SPACE:
        case M4_TOKEN_STRING:
+       case M4_TOKEN_MACDEF:
          if (!expand_token (context, obs, type, &token, line, first))
            age = 0;
          if (token.type == M4_SYMBOL_COMP)
@@ -390,13 +395,6 @@ expand_argument (m4 *context, m4_obstack *obs, m4_symbol_value *argp,
            }
          break;
 
-       case M4_TOKEN_MACDEF:
-         if (argp->type == M4_SYMBOL_VOID && obstack_object_size (obs) == 0)
-           m4_symbol_value_copy (context, argp, &token);
-         else
-           argp->type = M4_SYMBOL_TEXT;
-         break;
-
        case M4_TOKEN_ARGV:
          assert (paren_level == 0 && argp->type == M4_SYMBOL_VOID
                  && obstack_object_size (obs) == 0
@@ -1025,6 +1023,8 @@ m4__arg_adjust_refcount (m4 *context, m4_macro_args *argv, bool increase)
                    m4__adjust_refcount (context, chain->u.u_s.level,
                                         increase);
                  break;
+               case M4__CHAIN_FUNC:
+                 break;
                case M4__CHAIN_ARGV:
                  assert (chain->u.u_a.argv->inuse);
                  m4__arg_adjust_refcount (context, chain->u.u_a.argv,
@@ -1219,7 +1219,6 @@ m4_is_arg_text (m4_macro_args *argv, size_t arg)
   return false;
 }
 
-/* TODO - add m4_is_arg_comp to distinguish concatenation of builtins.  */
 /* Given ARGV, return true if argument ARG is a single builtin
    function.  Only non-zero indices less than argc can return
    true.  */
@@ -1231,12 +1230,28 @@ m4_is_arg_func (m4_macro_args *argv, size_t arg)
   return m4_is_symbol_value_func (m4_arg_symbol (argv, arg));
 }
 
+/* Given ARGV, return true if argument ARG contains a builtin token
+   concatenated with anything else.  Only non-zero indices less than
+   argc can return true.  */
+bool
+m4_is_arg_composite (m4_macro_args *argv, size_t arg)
+{
+  m4_symbol_value *value;
+  if (arg == 0 || argv->argc <= arg || argv->flatten || !argv->has_func)
+    return false;
+  value = m4_arg_symbol (argv, arg);
+  if (value->type == M4_SYMBOL_COMP && value->u.u_c.has_func)
+    return true;
+  return false;
+}
+
 /* Given ARGV, return the text at argument ARG.  Abort if the argument
    is not text.  Arg 0 is always text, and indices beyond argc return
-   the empty string.  The result is always NUL-terminated, even if it
-   includes embedded NUL characters.  */
+   the empty string.  If FLATTEN, builtins are ignored.  The result is
+   always NUL-terminated, even if it includes embedded NUL
+   characters.  */
 const char *
-m4_arg_text (m4 *context, m4_macro_args *argv, size_t arg)
+m4_arg_text (m4 *context, m4_macro_args *argv, size_t arg, bool flatten)
 {
   m4_symbol_value *value;
   m4__symbol_chain *chain;
@@ -1246,7 +1261,7 @@ m4_arg_text (m4 *context, m4_macro_args *argv, size_t arg)
     return argv->argv0;
   if (argv->argc <= arg)
     return "";
-  value = m4_arg_symbol (argv, arg);
+  value = arg_symbol (argv, arg, NULL, flatten);
   if (m4_is_symbol_value_text (value))
     return m4_get_symbol_value_text (value);
   assert (value->type == M4_SYMBOL_COMP);
@@ -1259,12 +1274,18 @@ m4_arg_text (m4 *context, m4_macro_args *argv, size_t arg)
        case M4__CHAIN_STR:
          obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
          break;
+       case M4__CHAIN_FUNC:
+         if (flatten)
+           break;
+         assert (!"m4_arg_text");
+         abort ();
        case M4__CHAIN_ARGV:
+         assert (!chain->u.u_a.has_func || flatten || argv->flatten);
          m4__arg_print (context, obs, chain->u.u_a.argv, chain->u.u_a.index,
                         m4__quote_cache (M4SYNTAX, NULL, chain->quote_age,
                                          chain->u.u_a.quotes),
-                        argv->flatten || chain->u.u_a.flatten, NULL, NULL,
-                        NULL, false, false);
+                        flatten || argv->flatten || chain->u.u_a.flatten,
+                        NULL, NULL, NULL, false, false);
          break;
        default:
          assert (!"m4_arg_text");
@@ -1368,6 +1389,9 @@ m4_arg_equal (m4 *context, m4_macro_args *argv, size_t indexa, size_t indexb)
        {
          tmpb.next = NULL;
          tmpb.type = M4__CHAIN_STR;
+         tmpb.u.u_s.str = NULL;
+         tmpb.u.u_s.len = 0;
+         chain = &tmpb;
          m4__arg_print (context, obs, cb->u.u_a.argv, cb->u.u_a.index,
                         m4__quote_cache (M4SYNTAX, NULL, cb->quote_age,
                                          cb->u.u_a.quotes),
@@ -1526,6 +1550,7 @@ m4__arg_print (m4 *context, m4_obstack *obs, m4_macro_args *argv, size_t arg,
   size_t sep_len;
   size_t *plen = quote_each ? NULL : &len;
 
+  flatten |= argv->flatten;
   if (chainp)
     assert (!max_len && *chainp);
   if (!sep)
diff --git a/m4/symtab.c b/m4/symtab.c
index 69f2200..76ff1cb 100644
--- a/m4/symtab.c
+++ b/m4/symtab.c
@@ -405,10 +405,13 @@ arg_destroy_CB (m4_hash *hash, const void *name, void *arg, void *ignored)
   return NULL;
 }
 
-void
+/* Copy the symbol SRC into DEST.  Return true if builtin tokens were
+   flattened.  */
+bool
 m4_symbol_value_copy (m4 *context, m4_symbol_value *dest, m4_symbol_value *src)
 {
   m4_symbol_value *next;
+  bool result = false;
 
   assert (dest);
   assert (src);
@@ -455,7 +458,7 @@ m4_symbol_value_copy (m4 *context, m4_symbol_value *dest, m4_symbol_value *src)
       }
       break;
     case M4_SYMBOL_FUNC:
-      /* Nothing further to do.  */
+      m4__set_symbol_value_builtin (dest, src->u.builtin);
       break;
     case M4_SYMBOL_PLACEHOLDER:
       m4_set_symbol_value_placeholder (dest,
@@ -476,9 +479,14 @@ m4_symbol_value_copy (m4 *context, m4_symbol_value *dest, m4_symbol_value *src)
              case M4__CHAIN_STR:
                obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
                break;
+             case M4__CHAIN_FUNC:
+               result = true;
+               break;
              case M4__CHAIN_ARGV:
                quotes = m4__quote_cache (M4SYNTAX, NULL, chain->quote_age,
                                          chain->u.u_a.quotes);
+               if (chain->u.u_a.has_func && !chain->u.u_a.flatten)
+                 result = true;
                m4__arg_print (context, obs, chain->u.u_a.argv,
                               chain->u.u_a.index, quotes, true, NULL, NULL,
                               NULL, false, false);
@@ -503,6 +511,7 @@ m4_symbol_value_copy (m4 *context, m4_symbol_value *dest, m4_symbol_value *src)
   if (VALUE_ARG_SIGNATURE (src))
     VALUE_ARG_SIGNATURE (dest) = m4_hash_dup (VALUE_ARG_SIGNATURE (src),
                                              arg_copy_CB);
+  return result;
 }
 
 static void *
diff --git a/modules/m4.c b/modules/m4.c
index 0b714ef..5cb6d11 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -157,7 +157,8 @@ M4BUILTIN_HANDLER (define)
     {
       m4_symbol_value *value = m4_symbol_value_create ();
 
-      m4_symbol_value_copy (context, value, m4_arg_symbol (argv, 2));
+      if (m4_symbol_value_copy (context, value, m4_arg_symbol (argv, 2)))
+       m4_warn (context, 0, M4ARG (0), _("cannot concatenate builtins"));
       m4_symbol_define (M4SYMTAB, M4ARG (1), value);
     }
   else
@@ -179,7 +180,8 @@ M4BUILTIN_HANDLER (pushdef)
     {
       m4_symbol_value *value = m4_symbol_value_create ();
 
-      m4_symbol_value_copy (context, value, m4_arg_symbol (argv, 2));
+      if (m4_symbol_value_copy (context, value, m4_arg_symbol (argv, 2)))
+       m4_warn (context, 0, M4ARG (0), _("cannot concatenate builtins"));
       m4_symbol_pushdef (M4SYMTAB, M4ARG (1), value);
     }
   else
diff --git a/tests/builtins.at b/tests/builtins.at
index b059e7b..3f67c2c 100644
--- a/tests/builtins.at
+++ b/tests/builtins.at
@@ -230,11 +230,6 @@ AT_CLEANUP
 
 AT_SETUP([defn])
 
-dnl This test is a reminder that defn needs to be fixed to handle
-dnl concatenation of builtin tokens with text, and user macros need
-dnl to handle builtin tokens without flattening.
-AT_XFAIL_IF([:])
-
 AT_DATA([[in.m4]],
 [[define(`e', `$@')define(`q', ``$@'')define(`u', `$*')
 define(`cmp', `ifelse($1, $2, `yes', `no')')define(`d', defn(`defn'))
-- 
1.5.5.1
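
For anyone skimming the master-branch diff above, the flattening rule that
m4_symbol_value_copy now reports can be sketched in isolation.  This is only
an illustration under stated assumptions: struct link, LINK_STR, LINK_FUNC
and flatten_chain are hypothetical stand-ins, not the real m4__symbol_chain
API, and a fixed buffer replaces the obstack.  Text links are copied,
builtin links are dropped, and the return value tells the caller whether a
warning is due.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the chain links in the patch; these are
   not the real m4 types.  */
enum link_type { LINK_STR, LINK_FUNC };

struct link
{
  struct link *next;
  enum link_type type;
  const char *str;   /* Text, meaningful when type == LINK_STR.  */
  const char *name;  /* Builtin name, meaningful when type == LINK_FUNC.  */
};

/* Copy the text links of CHAIN into BUF (capacity SIZE), skipping
   builtin links.  Return true if any builtin was dropped, so that the
   caller can issue a warning, much as define and pushdef do above.  */
static bool
flatten_chain (const struct link *chain, char *buf, size_t size)
{
  bool flattened = false;
  size_t used = 0;

  assert (buf && size > 0);
  for (; chain; chain = chain->next)
    switch (chain->type)
      {
      case LINK_STR:
        {
          size_t len = strlen (chain->str);
          assert (used + len < size);
          memcpy (buf + used, chain->str, len);
          used += len;
        }
        break;
      case LINK_FUNC:
        /* A builtin cannot be represented as text; drop it.  */
        flattened = true;
        break;
      }
  buf[used] = '\0';
  return flattened;
}
```

A caller shaped like the define handler would test the result and warn
("cannot concatenate builtins") only when it is true, so a plain text
definition stays silent.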

From 50fabc46c235da6682f1bd76b1b43151e147c7bc Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Thu, 6 Dec 2007 14:47:26 -0700
Subject: [PATCH] Stage 22: allow builtin token concatenation outside 
address@hidden

* src/m4.h (arg_text): Add parameter.
(ARG): Adjust callers.
* src/input.c (init_macro_token): Add parameter.
(next_token): Support concatenating builtins.
* src/macro.c (warn_builtin_concat): Delete warning.
(expand_argument, arg_adjust_refcount): Handle builtin tokens.
(arg_text): Add parameter.
(arg_print): Adjust caller.
* src/builtin.c (define_macro): Flatten builtins, rather than
doing nothing.
(defn): Warn on undefined macro name.
* src/m4.c (main): Avoid atoi.
* src/output.c: Whitespace fixes.
* doc/m4.texinfo (Defn): Document the new semantics.
(Ifelse, Debug Levels, M4wrap): Enhance tests.
* NEWS: Document this change.

(cherry picked from commit 8a47a2029b7eb60ac61abb1b6423d4a67b371281)

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog      |   25 ++++++++++++++++
 NEWS           |   10 ++++++
 doc/m4.texinfo |   80 ++++++++++++++++++++++++++++------------------------
 src/builtin.c  |   21 ++++++-------
 src/input.c    |   84 +++++++++++++++++++++++--------------------------------
 src/m4.c       |    2 +-
 src/m4.h       |    6 ++--
 src/macro.c    |   76 ++++++++++++++++++++------------------------------
 src/output.c   |    4 +-
 9 files changed, 159 insertions(+), 149 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 6362809..91c1845 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,28 @@
+2008-05-05  Eric Blake  <address@hidden>
+
+       Stage 22: allow builtin token concatenation outside address@hidden
+       Adjust the input and argument parsing engines to append builtins
+       alongside text.  Make define warn when builtins must be
+       flattened.
+       Memory impact: slight penalty, with fewer builtins flattened.
+       Speed impact: slight penalty, from more bookkeeping.
+       * src/m4.h (arg_text): Add parameter.
+       (ARG): Adjust callers.
+       * src/input.c (init_macro_token): Add parameter.
+       (next_token): Support concatenating builtins.
+       * src/macro.c (warn_builtin_concat): Delete warning.
+       (expand_argument, arg_adjust_refcount): Handle builtin tokens.
+       (arg_text): Add parameter.
+       (arg_print): Adjust caller.
+       * src/builtin.c (define_macro): Flatten builtins, rather than
+       doing nothing.
+       (defn): Warn on undefined macro name.
+       * src/m4.c (main): Avoid atoi.
+       * src/output.c: Whitespace fixes.
+       * doc/m4.texinfo (Defn): Document the new semantics.
+       (Ifelse, Debug Levels, M4wrap): Enhance tests.
+       * NEWS: Document this change.
+
 2008-05-03  Eric Blake  <address@hidden>
 
        Document define_blind.
diff --git a/NEWS b/NEWS
index 0ca3094..052cbbc 100644
--- a/NEWS
+++ b/NEWS
@@ -33,9 +33,19 @@ Foundation, Inc.
    then apply this patch:
      http://git.sv.gnu.org/gitweb/?p=autoconf.git;a=commitdiff;h=56d42fa71
 
+** The `defn' builtin now warns when operating on an undefined macro name.
+   To simulate 1.4.x behavior, use:
+     pushdef(`defn', `ifdef(`$1', `builtin(`defn', `$1')')')
+
 ** Enhance the `ifdef', `ifelse', and `shift' builtins, as well as all
    user macros, to transparently handle builtin tokens generated by `defn'.
 
+** Allow the concatenation of builtin macros with arbitrary text in
+   several contexts, via the `defn' builtin or argument expansion, rather
+   than warning and converting the builtin token to an empty string.
+   However, it is still not possible to use a concatenated builtin when
+   defining a macro.
+
 ** Enhance the `defn', `dumpdef', `ifdef', `popdef', `traceon', `traceoff',
    and `undefine' macros to warn when encountering a builtin token in the
    context of a macro name, rather than acting on the empty string.  This
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index 4781567..fd04622 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -2112,17 +2112,14 @@ the builtin @code{defn}:
 @deffn Builtin defn (@address@hidden)
 Expands to the @emph{quoted definition} of each @var{name}.  If an
 argument is not a defined macro, the expansion for that argument is
-empty.
+empty and triggers a warning.
 
 If @var{name} is a user-defined macro, the quoted definition is simply
-the quoted expansion text.  If, instead, there is only one @var{name}
-and it is a builtin, the
+the quoted expansion text.  If, instead, @var{name} is a builtin, the
 expansion is a special token, which points to the builtin's internal
-definition.  This token is only meaningful as the second argument to
+definition.  This token is meaningful primarily as the second argument to
 @code{define} (and @code{pushdef}), and is silently converted to an
-empty string in most other contexts.  Using multiple @var{name} to
-combine a builtin with anything else is not supported; a warning is
-issued and the builtin is omitted from the final expansion.
+empty string in many other contexts.
 
 The macro @code{defn} is recognized only with parameters.
 @end deffn
@@ -2281,14 +2278,18 @@ bar
 @result{}0
 @end example
 
-Also note that as of M4 1.6, @code{defn} with multiple arguments can
-join text with builtin tokens.  However, when collecting macro
-arguments, a builtin token is preserved only when it occurs in
-isolation.  A future version of @acronym{GNU} M4 may lift this
-restriction.
+A warning is issued if @var{name} is undefined.  Also note that as of M4
+1.6, @code{defn} with multiple arguments can join text with builtin
+tokens.  However, when defining a macro via @code{define} or
+@code{pushdef}, a warning is issued and the builtin token ignored if the
+builtin token does not occur in isolation.  A future version of
+@acronym{GNU} M4 may lift this restriction.
 
 @example
 $ @kbd{m4 -d}
+defn(`foo')
address@hidden:stdin:1: Warning: defn: undefined macro `foo'
address@hidden
 define(`a', `A')define(`AA', `b')
 @result{}
 traceon(`defn', `define')
@@ -2298,29 +2299,28 @@ defn(`a', `divnum', `a')
 @result{}AA
 define(`mydivnum', defn(`divnum', `divnum'))mydivnum
 @error{}m4trace: -2- defn(`divnum', `divnum') -> `<divnum><divnum>'
address@hidden:stdin:4: Warning: define: cannot concatenate builtin `divnum'
address@hidden:stdin:4: Warning: define: cannot concatenate builtin `divnum'
address@hidden: -1- define(`mydivnum', `')
address@hidden:stdin:5: Warning: define: cannot concatenate builtins
address@hidden: -1- define(`mydivnum', `<divnum><divnum>')
 @result{}
-traceoff(`defn', `define')
+traceoff(`defn', `define')dumpdef(`mydivnum')
address@hidden:@tabchar{}`'
 @result{}
 define(`mydivnum', defn(`divnum')defn(`divnum'))mydivnum
address@hidden:stdin:6: Warning: define: cannot concatenate builtin `divnum'
address@hidden:stdin:6: Warning: define: cannot concatenate builtin `divnum'
address@hidden:stdin:7: Warning: define: cannot concatenate builtins
 @result{}
 define(`mydivnum', defn(`divnum')`a')mydivnum
address@hidden:stdin:7: Warning: define: cannot concatenate builtin `divnum'
address@hidden:stdin:8: Warning: define: cannot concatenate builtins
 @result{}A
 define(`mydivnum', `a'defn(`divnum'))mydivnum
address@hidden:stdin:8: Warning: define: cannot concatenate builtin `divnum'
address@hidden:stdin:9: Warning: define: cannot concatenate builtins
 @result{}A
 define(`q', ``$@@'')
 @result{}
 define(`foo', q(`a', defn(`divnum')))foo
address@hidden:stdin:10: Warning: define: cannot concatenate builtins
address@hidden
address@hidden:stdin:11: Warning: define: cannot concatenate builtins
address@hidden,
 ifdef(`foo', `yes', `no')
address@hidden
address@hidden
 @end example
 
 @node Pushdef
@@ -2853,6 +2853,8 @@ ifelse(defn(`defn'), defn(`divnum'), `yes', `no')
 @result{}no
 ifelse(defn(`defn'), defn(`defn'), `yes', `no')
 @result{}yes
+ifelse(defn(`defn', `divnum'), defn(`defn')defn(`divnum'), `yes', `no')
address@hidden
 define(`foo', ifelse(`', `', defn(`divnum')))
 @result{}
 foo
@@ -2917,8 +2919,6 @@ ifelse(`-01234567890123456789', `-'e(long)`-', `yes', `no')
 @result{}no
 @end example
 
address@hidden It would be nice to allow concatenation of builtins without
address@hidden using $@ handling.
 @example
 define(`e', `$@@')define(`q', ``$@@'')define(`u', `$*')
 @result{}
@@ -2928,16 +2928,12 @@ cmp(`defn(`defn')', `defn(`d')')
 @result{}yes
 cmp(`defn(`defn')', ``<defn>'')
 @result{}no
-cmp(`q(defn(`defn'))', `q(defn(`d'))')-fixme
address@hidden:stdin:5: Warning: ifelse: cannot quote builtin
address@hidden:stdin:5: Warning: ifelse: cannot quote builtin
address@hidden
-cmp(`q(defn(`defn'))', `q(`<defn>')')-fixme
address@hidden:stdin:6: Warning: ifelse: cannot quote builtin
address@hidden
-cmp(`q(defn(`defn'))', ``'')-fixme
address@hidden:stdin:7: Warning: ifelse: cannot quote builtin
address@hidden
+cmp(`q(defn(`defn'))', `q(defn(`d'))')
address@hidden
+cmp(`q(defn(`defn'))', `q(`<defn>')')
address@hidden
+cmp(`q(defn(`defn'))', ``'')
address@hidden
 cmp(`q(`1', `2', defn(`defn'))', `q(`1', `2', defn(`d'))')
 @result{}yes
 cmp(`q(`1', `2', defn(`defn'))', `q(`1', `2', `<defn>')')
@@ -3834,7 +3830,17 @@ debugmode()
 foo
 @error{}m4trace: -1- foo -> `FOO'
 @result{}FOO
+debugmode(`V')
address@hidden
+foo(`ignored')
address@hidden:stdin:6: -1- id 6: foo ...
address@hidden:stdin:6: -1- id 6: foo(`ignored') -> ???
address@hidden:stdin:6: -1- id 6: foo(...) -> `FOO'
address@hidden
 debugmode
address@hidden:stdin:7: -1- id 7: debugmode ...
address@hidden:stdin:7: -1- id 7: debugmode -> ???
address@hidden
 @result{}
 foo
 @error{}m4trace: -1- foo
@@ -3842,7 +3848,7 @@ foo
 debugmode(`+l')
 @result{}
 foo
address@hidden:8: -1- foo
address@hidden:10: -1- foo
 @result{}FOO
 @end example
 
@@ -4799,7 +4805,7 @@ than computing the builtin token up front, as is done for @code{bar}.
 m4wrap(`define(`foo', defn(`divnum'))foo
 ')
 @result{}
-m4wrap(`define(`bar', ')m4wrap(defn(`divnum'))m4wrap(`)bar
+m4wrap(`define(`bar', 'defn(`divnum')`)bar
 ')
 @result{}
 ^D
diff --git a/src/builtin.c b/src/builtin.c
index 2e963e3..8ce6cf7 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -678,13 +678,9 @@ define_macro (int argc, macro_arguments *argv, symbol_lookup mode)
     {
     case TOKEN_COMP:
       m4_warn (0, me, _("cannot concatenate builtins"));
-      /* TODO fall through instead.  */
-      break;
-
+      /* fallthru */
     case TOKEN_TEXT:
-      /* TODO flatten TOKEN_COMP value, or support concatenation of
-        builtins in definitions.  */
-      define_user_macro (ARG (1), ARG_LEN (1), ARG (2), mode);
+      define_user_macro (ARG (1), ARG_LEN (1), arg_text (argv, 2, true), mode);
       break;
 
     case TOKEN_FUNC:
@@ -1012,7 +1008,10 @@ m4_defn (struct obstack *obs, int argc, macro_arguments *argv)
        }
       s = lookup_symbol (ARG (i), SYMBOL_LOOKUP);
       if (s == NULL)
-       continue;
+       {
+         m4_warn (0, me, _("undefined macro `%s'"), ARG (i));
+         continue;
+       }
 
       switch (SYMBOL_TYPE (s))
        {
@@ -1046,10 +1045,10 @@ m4_defn (struct obstack *obs, int argc, macro_arguments *argv)
 
 /* Helper macros for readability.  */
 #if UNIX || defined WEXITSTATUS
-# define M4SYSVAL_EXITBITS(status)                       \
-   (WIFEXITED (status) ? WEXITSTATUS (status) : 0)
-# define M4SYSVAL_TERMSIGBITS(status)                    \
-   (WIFSIGNALED (status) ? WTERMSIG (status) << 8 : 0)
+# define M4SYSVAL_EXITBITS(status)                      \
+  (WIFEXITED (status) ? WEXITSTATUS (status) : 0)
+# define M4SYSVAL_TERMSIGBITS(status)                   \
+  (WIFSIGNALED (status) ? WTERMSIG (status) << 8 : 0)
 
 #else /* !UNIX && !defined WEXITSTATUS */
 /* Platforms such as mingw do not support the notion of reporting
diff --git a/src/input.c b/src/input.c
index aacab61..df5a791 100644
--- a/src/input.c
+++ b/src/input.c
@@ -340,18 +340,18 @@ push_string_init (void)
 | rather than copying everything consecutively onto the input stack.  |
 | Must be called between push_string_init and push_string_finish.     |
 |                                                                     |
-| If TOKEN contains text, then convert the current input block into   |
-| a chain if it is not one already, and add the contents of TOKEN as  |
-| a new link in the chain.  LEVEL describes the current expansion     |
-| level, or -1 if TOKEN is composite, its contents reside entirely    |
-| on the current_input stack, and TOKEN lives in temporary storage.   |
-| If TOKEN is a simple string, then it belongs to the current macro   |
-| expansion.  If TOKEN is composite, then each text link has a level  |
-| of -1 if it belongs to the current macro expansion, otherwise it    |
-| is a back-reference where level tracks which stack it came from.    |
-| The resulting input block chain contains links with a level of -1   |
-| if the text belongs to the input stack, otherwise the level where   |
-| the back-reference comes from.                                     |
+| Convert the current input block into a chain if it is not one              |
+| already, and add the contents of TOKEN as a new link in the chain.  |
+| LEVEL describes the current expansion level, or -1 if TOKEN is      |
+| composite, its contents reside entirely on the current_input       |
+| stack, and TOKEN lives in temporary storage.  If TOKEN is a simple  |
+| string, then it belongs to the current macro expansion.  If TOKEN   |
+| is composite, then each text link has a level of -1 if it belongs   |
+| to the current macro expansion, otherwise it is a back-reference    |
+| where level tracks which stack it came from.  The resulting input   |
+| block chain contains links with a level of -1 if the text belongs   |
+| to the input stack, otherwise the level where the back-reference    |
+| comes from.                                                        |
 |                                                                     |
 | Return true only if a reference was created to the contents of      |
 | TOKEN, in which case, LEVEL was non-negative and the lifetime of    |
@@ -1062,19 +1062,35 @@ skip_line (const char *name)
 | When next_token() sees a builtin token with peek_input, this     |
 | retrieves the value of the function pointer, stores it in TD, and |
 | consumes the input so the caller does not need to do next_char.   |
-| If TD is NULL, discard the token instead.                        |
+| If OBS, TD will be converted to a composite token using storage   |
+| from OBS as necessary; otherwise, if TD is NULL, the builtin is   |
+| discarded.                                                        |
 `------------------------------------------------------------------*/
 
 static void
-init_macro_token (token_data *td)
+init_macro_token (struct obstack *obs, token_data *td)
 {
   token_chain *chain;
 
   assert (isp->type == INPUT_CHAIN);
   chain = isp->u.u_c.chain;
   assert (!chain->quote_age && chain->type == CHAIN_FUNC && chain->u.func);
-  if (td)
+  if (obs)
     {
+      assert (td);
+      if (TOKEN_DATA_TYPE (td) == TOKEN_VOID)
+       {
+         TOKEN_DATA_TYPE (td) = TOKEN_COMP;
+         td->u.u_c.chain = td->u.u_c.end = NULL;
+         td->u.u_c.wrapper = false;
+         td->u.u_c.has_func = true;
+       }
+      assert (TOKEN_DATA_TYPE (td) == TOKEN_COMP);
+      append_macro (obs, chain->u.func, &td->u.u_c.chain, &td->u.u_c.end);
+    }
+  else if (td)
+    {
+      assert (TOKEN_DATA_TYPE (td) == TOKEN_VOID);
       TOKEN_DATA_TYPE (td) = TOKEN_FUNC;
       TOKEN_DATA_FUNC (td) = chain->u.func;
     }
@@ -1597,7 +1613,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
     }
   if (ch == CHAR_MACRO)
     {
-      init_macro_token (td);
+      init_macro_token (obs, td);
 #ifdef DEBUG_INPUT
       xfprintf (stderr, "next_token -> MACDEF (%s)\n",
                find_builtin_by_addr (TOKEN_DATA_FUNC (td))->name);
@@ -1633,10 +1649,7 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
                              _("end of file in comment"));
          if (ch == CHAR_MACRO)
            {
-             /* TODO support concatenation of builtins.  */
-             m4_warn_at_line (0, file, *line, caller,
-                              _("cannot comment builtin"));
-             init_macro_token (NULL);
+             init_macro_token (obs, obs ? td : NULL);
              continue;
            }
          if (MATCH (ch, curr_comm.str2, true))
@@ -1732,35 +1745,8 @@ next_token (token_data *td, int *line, struct obstack *obs, bool allow_argv,
                              _("end of file in string"));
 
          if (ch == CHAR_MACRO)
-           {
-             /* TODO support concatenation of builtins.  */
-             if (obstack_object_size (obs_td) == 0
-                 && TOKEN_DATA_TYPE (td) == TOKEN_VOID)
-               {
-                 assert (quote_level == 1);
-                 init_macro_token (td);
-                 ch = peek_input (false);
-                 if (MATCH (ch, curr_quote.str2, false))
-                   {
-#ifdef DEBUG_INPUT
-                     const builtin *bp
-                       = find_builtin_by_addr (TOKEN_DATA_FUNC (td));
-                     xfprintf (stderr, "next_token -> MACDEF (%s)\n",
-                               bp->name);
-#endif
-                     ch = next_char (false, false);
-                     MATCH (ch, curr_quote.str2, true);
-                     return TOKEN_MACDEF;
-                   }
-                 TOKEN_DATA_TYPE (td) = TOKEN_VOID;
-               }
-             else
-               init_macro_token (NULL);
-             m4_warn_at_line (0, file, *line, caller,
-                              _("cannot quote builtin"));
-             continue;
-           }
-         if (ch == CHAR_QUOTE)
+           init_macro_token (obs, obs ? td : NULL);
+         else if (ch == CHAR_QUOTE)
            append_quote_token (obs, td);
          else if (MATCH (ch, curr_quote.str2, true))
            {
diff --git a/src/m4.c b/src/m4.c
index fe8c548..2ad82c2 100644
--- a/src/m4.c
+++ b/src/m4.c
@@ -573,7 +573,7 @@ main (int argc, char *const *argv, char *const *envp)
 
       case 'l':
        {
-         int tmp = atoi (optarg);
+         long tmp = strtol (optarg, NULL, 10);
          max_debug_argument_length = tmp <= 0 ? SIZE_MAX : (size_t) tmp;
        }
        break;
diff --git a/src/m4.h b/src/m4.h
index 59d9be3..b3cb7e1 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -266,7 +266,7 @@ enum token_type
   TOKEN_COMMA, /* Active character `,', TOKEN_TEXT.  */
   TOKEN_CLOSE, /* Active character `)', TOKEN_TEXT.  */
   TOKEN_SIMPLE,        /* Any other single character, TOKEN_TEXT.  */
-  TOKEN_MACDEF,        /* A macro's definition (see "defn"), TOKEN_FUNC.  */
+  TOKEN_MACDEF,        /* A builtin macro, TOKEN_FUNC or TOKEN_COMP.  */
   TOKEN_ARGV   /* A series of parameters, TOKEN_COMP.  */
 };
 
@@ -504,7 +504,7 @@ size_t adjust_refcount (int, bool);
 bool arg_adjust_refcount (macro_arguments *, bool);
 unsigned int arg_argc (macro_arguments *);
 token_data_type arg_type (macro_arguments *, unsigned int);
-const char *arg_text (macro_arguments *, unsigned int);
+const char *arg_text (macro_arguments *, unsigned int, bool);
 bool arg_equal (macro_arguments *, unsigned int, unsigned int);
 bool arg_empty (macro_arguments *, unsigned int);
 size_t arg_len (macro_arguments *, unsigned int);
@@ -523,7 +523,7 @@ void wrap_args (macro_arguments *);
 
 /* Grab the text at argv index I.  Assumes macro_argument *argv is in
    scope, and aborts if the argument is not text.  */
-#define ARG(i) arg_text (argv, i)
+#define ARG(i) arg_text (argv, i, false)
 
 /* Grab the text length at argv index I.  Assumes macro_argument *argv
    is in scope, and aborts if the argument is not text.  */
diff --git a/src/macro.c b/src/macro.c
index d871fc2..8290818 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -329,18 +329,6 @@ expand_token (struct obstack *obs, token_type t, token_data *td, int line,
 }
 
 
-/*---------------------------------------------------------------.
-| Helper function to print warning about concatenating FUNC with |
-| text.                                                          |
-`---------------------------------------------------------------*/
-static void
-warn_builtin_concat (const char *caller, builtin_func *func)
-{
-  const builtin *bp = find_builtin_by_addr (func);
-  assert (bp);
-  m4_warn (0, caller, _("cannot concatenate builtin `%s'"), bp->name);
-}
-
 /*-------------------------------------------------------------------.
 | This function parses one argument to a macro call.  It expects the |
 | first left parenthesis or the separating comma to have been read   |
@@ -383,15 +371,10 @@ expand_argument (struct obstack *obs, token_data *argp, const char *caller)
        case TOKEN_CLOSE:
          if (paren_level == 0)
            {
-             size_t len = obstack_object_size (obs);
-             if (TOKEN_DATA_TYPE (argp) == TOKEN_FUNC)
-               {
-                 if (!len)
-                   return t == TOKEN_COMMA;
-                 warn_builtin_concat (caller, TOKEN_DATA_FUNC (argp));
-               }
+             assert (TOKEN_DATA_TYPE (argp) != TOKEN_FUNC);
              if (TOKEN_DATA_TYPE (argp) != TOKEN_COMP)
                {
+                 size_t len = obstack_object_size (obs);
                  TOKEN_DATA_TYPE (argp) = TOKEN_TEXT;
                  if (len)
                    {
@@ -404,7 +387,16 @@ expand_argument (struct obstack *obs, token_data *argp, const char *caller)
                  TOKEN_DATA_QUOTE_AGE (argp) = age;
                }
              else
-               make_text_link (obs, NULL, &argp->u.u_c.end);
+               {
+                 make_text_link (obs, NULL, &argp->u.u_c.end);
+                 if (argp->u.u_c.chain == argp->u.u_c.end
+                     && argp->u.u_c.chain->type == CHAIN_FUNC)
+                   {
+                     builtin_func *func = argp->u.u_c.chain->u.func;
+                     TOKEN_DATA_TYPE (argp) = TOKEN_FUNC;
+                     TOKEN_DATA_FUNC (argp) = func;
+                   }
+               }
              return t == TOKEN_COMMA;
            }
          /* fallthru */
@@ -427,14 +419,13 @@ expand_argument (struct obstack *obs, token_data *argp, const char *caller)
 
        case TOKEN_WORD:
        case TOKEN_STRING:
+       case TOKEN_MACDEF:
          if (!expand_token (obs, t, &td, line, first))
            age = 0;
          if (TOKEN_DATA_TYPE (&td) == TOKEN_COMP)
            {
              if (TOKEN_DATA_TYPE (argp) != TOKEN_COMP)
                {
-                 if (TOKEN_DATA_TYPE (argp) == TOKEN_FUNC)
-                   warn_builtin_concat (caller, TOKEN_DATA_FUNC (argp));
                  TOKEN_DATA_TYPE (argp) = TOKEN_COMP;
                  argp->u.u_c.chain = td.u.u_c.chain;
                  argp->u.u_c.wrapper = argp->u.u_c.has_func = false;
@@ -450,22 +441,6 @@ expand_argument (struct obstack *obs, token_data *argp, const char *caller)
            }
          break;
 
-       case TOKEN_MACDEF:
-         if (TOKEN_DATA_TYPE (argp) == TOKEN_VOID
-             && obstack_object_size (obs) == 0)
-           {
-             TOKEN_DATA_TYPE (argp) = TOKEN_FUNC;
-             TOKEN_DATA_FUNC (argp) = TOKEN_DATA_FUNC (&td);
-           }
-         else
-           {
-             if (TOKEN_DATA_TYPE (argp) == TOKEN_FUNC)
-               warn_builtin_concat (caller, TOKEN_DATA_FUNC (argp));
-             warn_builtin_concat (caller, TOKEN_DATA_FUNC (&td));
-             TOKEN_DATA_TYPE (argp) = TOKEN_TEXT;
-           }
-         break;
-
        case TOKEN_ARGV:
          assert (paren_level == 0 && TOKEN_DATA_TYPE (argp) == TOKEN_VOID
                  && obstack_object_size (obs) == 0
@@ -798,6 +773,8 @@ arg_adjust_refcount (macro_arguments *argv, bool increase)
                  if (chain->u.u_s.level >= 0)
                    adjust_refcount (chain->u.u_s.level, increase);
                  break;
+               case CHAIN_FUNC:
+                 break;
                case CHAIN_ARGV:
                  assert (chain->u.u_a.argv->inuse);
                  arg_adjust_refcount (chain->u.u_a.argv, increase);
@@ -914,10 +891,11 @@ arg_type (macro_arguments *argv, unsigned int arg)
 
 /* Given ARGV, return the text at argument ARG.  Abort if the argument
    is not text.  Arg 0 is always text, and indices beyond argc return
-   the empty string.  The result is always NUL-terminated, even if it
-   includes embedded NUL characters.  */
+   the empty string.  If FLATTEN, builtins are ignored.  The result is
+   always NUL-terminated, even if it includes embedded NUL
+   characters.  */
 const char *
-arg_text (macro_arguments *argv, unsigned int arg)
+arg_text (macro_arguments *argv, unsigned int arg, bool flatten)
 {
   token_data *token;
   token_chain *chain;
@@ -927,7 +905,7 @@ arg_text (macro_arguments *argv, unsigned int arg)
     return argv->argv0;
   if (arg >= argv->argc)
     return "";
-  token = arg_token (argv, arg, NULL, false);
+  token = arg_token (argv, arg, NULL, flatten);
   switch (TOKEN_DATA_TYPE (token))
     {
     case TOKEN_TEXT:
@@ -942,13 +920,18 @@ arg_text (macro_arguments *argv, unsigned int arg)
            case CHAIN_STR:
              obstack_grow (obs, chain->u.u_s.str, chain->u.u_s.len);
              break;
+           case CHAIN_FUNC:
+             if (flatten)
+               break;
+             assert (!"arg_text");
+             abort ();
            case CHAIN_ARGV:
-             assert (!chain->u.u_a.has_func || argv->flatten);
+             assert (!chain->u.u_a.has_func || flatten || argv->flatten);
              arg_print (obs, chain->u.u_a.argv, chain->u.u_a.index,
                         quote_cache (NULL, chain->quote_age,
                                      chain->u.u_a.quotes),
-                        argv->flatten || chain->u.u_a.flatten, NULL, NULL,
-                        NULL, false);
+                        flatten || argv->flatten || chain->u.u_a.flatten,
+                        NULL, NULL, NULL, false);
              break;
            default:
              assert (!"arg_text");
@@ -969,7 +952,7 @@ arg_text (macro_arguments *argv, unsigned int arg)
 /* Given ARGV, compare text arguments INDEXA and INDEXB for equality.
    Both indices must be non-zero and less than argc.  Return true if
    the arguments contain the same contents; often more efficient than
-   strcmp (arg_text (argv, indexa), arg_text (argv, indexb)) == 0.  */
+   strcmp (arg_text (argv, a, 1), arg_text (argv, b, 1)) == 0.  */
 bool
 arg_equal (macro_arguments *argv, unsigned int indexa, unsigned int indexb)
 {
@@ -1249,6 +1232,7 @@ arg_print (struct obstack *obs, macro_arguments *argv, unsigned int arg,
   size_t sep_len;
   size_t *plen = quote_each ? NULL : &len;
 
+  flatten |= argv->flatten;
   if (chainp)
     assert (!max_len && *chainp);
   if (!sep)
diff --git a/src/output.c b/src/output.c
index d7f8570..ee1907b 100644
--- a/src/output.c
+++ b/src/output.c
@@ -199,7 +199,7 @@ m4_tmpname (int divnum)
       obstack_1grow (&diversion_storage, '-');
       offset = obstack_object_size (&diversion_storage);
       buffer = (char *) obstack_alloc (&diversion_storage,
-                                       INT_BUFSIZE_BOUND (divnum));
+                                      INT_BUFSIZE_BOUND (divnum));
     }
   if (snprintf (&buffer[offset], INT_BUFSIZE_BOUND (divnum), "%d", divnum) < 0)
     m4_error (EXIT_FAILURE, errno, NULL,
@@ -387,7 +387,7 @@ make_room_for (int length)
 
       /* The current buffer may be safely reallocated.  */
       output_diversion->u.buffer = xrealloc (output_diversion->u.buffer,
-                                             (size_t) wanted_size);
+                                            (size_t) wanted_size);
 
       total_buffer_size += wanted_size - output_diversion->size;
       output_diversion->size = wanted_size;
-- 
1.5.5.1
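
A note on the "Avoid atoi" entry in the branch-1.6 patch above.  The diff
does not state the motivation; one plausible reason is that atoi has
undefined behavior when the value is out of range, whereas strtol is fully
specified.  The sketch below isolates the resulting -l mapping.  The helper
name debug_length_limit is hypothetical (the patch inlines this logic in
main), and like the patch it does not check for trailing garbage.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: map the text of a -l option to a length limit.
   Zero, negative, and non-numeric input all mean "no limit", which is
   represented as SIZE_MAX, matching the expression in the patch.  */
static size_t
debug_length_limit (const char *text)
{
  long tmp;

  assert (text);
  /* strtol yields 0 when no digits are found, and clamps to
     LONG_MIN or LONG_MAX on overflow instead of invoking undefined
     behavior.  */
  tmp = strtol (text, NULL, 10);
  return tmp <= 0 ? SIZE_MAX : (size_t) tmp;
}
```

With this shape, "-l 10" limits traced arguments to 10 bytes, while "-l 0"
and "-l -3" both leave the length unlimited.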

