m4-patches

[6/18] argv_ref speedup: push entire argument or argv series


From: Eric Blake
Subject: [6/18] argv_ref speedup: push entire argument or argv series
Date: Mon, 10 Dec 2007 06:42:01 -0700
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.9) Gecko/20071031 Thunderbird/2.0.0.9 Mnenhy/0.7.5.666

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The latest in the series.  There should be little, if any, impact on speed
or memory usage with this patch, since the bulk of it just factors
code into a common location.

The goal behind this patch is to make as many builtins as possible use the
new push_arg/push_args interface when requesting that an argument be
rescanned as part of the macro expansion, so that future patches only have
to touch the new functions instead of every builtin when pushing a
back-reference to pre-scanned text.  I also factored out the obstack
printing code, making it easier to print a string with optional length
limiting (think the -l option), and simplifying code that previously used
0, rather than INT_MAX (or SIZE_MAX), as the special case for unlimited
length.
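
To illustrate the length-limiting idea, here is a minimal standalone sketch
modeled on the m4_shipout_string_trunc logic in the patch below.  It is not
the m4 code itself: text_buf, text_buf_grow and print_limited are
hypothetical names, and a fixed-size buffer stands in for the obstack.  With
SIZE_MAX as the "unlimited" value, the limit check needs no separate test
for 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an obstack: a fixed-size buffer that is
   kept NUL-terminated.  */
struct text_buf
{
  char data[256];
  size_t used;
};

/* Append LEN bytes of S to B, dropping whatever does not fit.  */
void
text_buf_grow (struct text_buf *b, const char *s, size_t len)
{
  size_t room = sizeof b->data - 1 - b->used;
  if (len > room)
    len = room;
  memcpy (b->data + b->used, s, len);
  b->used += len;
  b->data[b->used] = '\0';
}

/* Print S of length LEN into B; if LEN is SIZE_MAX, use strlen (S).
   *BUDGET is the number of bytes still allowed, with SIZE_MAX meaning
   unlimited.  Return true once the budget is used up.  As in the
   patch, a string at least as long as the remaining budget is cut
   and followed by "...".  */
bool
print_limited (struct text_buf *b, const char *s, size_t len,
               size_t *budget)
{
  size_t max = budget ? *budget : SIZE_MAX;

  assert (b && s);
  if (len == SIZE_MAX)
    len = strlen (s);
  if (len < max)
    {
      text_buf_grow (b, s, len);
      max -= len;
    }
  else
    {
      text_buf_grow (b, s, max);
      text_buf_grow (b, "...", 3);
      max = 0;
    }
  if (budget)
    *budget = max;
  return max == 0;
}
```

Because the budget is passed by pointer, a caller can print several
strings in a row against one shared limit and stop as soon as the function
returns true.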

For the branch, I also borrowed an idea already present on head, that
push_string_finish should return a pointer into the input stack, rather
than a flattened string; this is necessary later when the input stack can
consist of several substrings at different addresses.
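
As a rough picture of why a pointer into the input stack helps, consider
input that is a chain of substrings living at different addresses.  The
sketch below is only an assumption about what such a layout could look
like; struct piece and piece_print are hypothetical names, not the
input_block type or input_print function from the patch.  A printer that is
handed the head of the chain can walk it, whereas a flattened string would
require copying every piece first.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical input block: one run of pending text, linked to the
   run that follows it.  */
struct piece
{
  const char *str;              /* Text of this run, not NUL-terminated.  */
  size_t len;                   /* Length of this run.  */
  const struct piece *next;     /* Following run, or NULL.  */
};

/* Copy the text of every run reachable from P into OUT, which has
   room for CAP bytes including the trailing NUL.  Text that does not
   fit is dropped.  Return the number of bytes copied.  */
size_t
piece_print (const struct piece *p, char *out, size_t cap)
{
  size_t used = 0;

  assert (out && cap > 0);
  for (; p; p = p->next)
    {
      size_t n = p->len;
      if (n > cap - 1 - used)
        n = cap - 1 - used;
      memcpy (out + used, p->str, n);
      used += n;
    }
  out[used] = '\0';
  return used;
}
```

The caller only ever holds the head pointer, so the pieces themselves can
stay wherever they were originally stored.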

2007-12-10  Eric Blake  <address@hidden>

        Stage 6: convert builtins to push arg at a time.
        * src/m4.h (includes): Include <limits.h> here, instead of in
        individual files.
        (input_block): New typedef.
        (trace_pre, trace_post, push_string_finish): Update prototypes.
        (obstack_print, input_print, push_arg, push_args): New
        prototypes.
        * src/input.c (push_string_finish): Change return type.
        (input_print): New function.
        * src/debug.c (trace_format): Add %B specifier, and use new
        function.
        (trace_pre): Remove redundant argc parameter.
        (trace_post): Likewise, and change signature.
        (obstack_print): New function.
        * src/macro.c (expand_macro): Update caller.
        (push_arg, push_args): New functions.
        * src/builtin.c (m4_ifdef, m4_ifelse, m4_shift, m4_substr)
        (m4_patsubst, expand_user_macro): Use new functions.
        (mkstemp_helper, m4_maketemp): Avoid extra trailing NULs.
        * src/m4.c (max_debug_argument_length, main): Set to INT_MAX, not
        0, for unlimited.
        * src/output.c: Update includes.
        * src/symtab.c: Likewise.
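
For readers skimming the ChangeLog, the behavior of push_args can be
pictured with the standalone sketch below.  It follows the loop in
m4_push_args from the head patch (comma-separated arguments, optionally
skipping the first and optionally quoting each), but it is a simplification
under stated assumptions: sketch_append and push_args_sketch are
hypothetical names, arguments are plain C strings, output goes to a char
buffer instead of an obstack, and the quotes are fixed at the default `
and ' rather than read from the current syntax table.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Append S to the NUL-terminated string in OUT, which has room for
   CAP bytes; text that does not fit is dropped.  */
void
sketch_append (char *out, size_t cap, const char *s)
{
  size_t used = strlen (out);
  size_t n = strlen (s);

  assert (used < cap);
  if (n > cap - 1 - used)
    n = cap - 1 - used;
  memcpy (out + used, s, n);
  out[used + n] = '\0';
}

/* Append arguments 1 .. ARGC - 1 of ARGV to OUT, separated by commas.
   If SKIP, start at argument 2 instead.  If QUOTE, wrap each
   argument in the default quotes.  ARGV[0] is the macro name and is
   never pushed.  */
void
push_args_sketch (char *out, size_t cap, int argc,
                  const char *const *argv, bool skip, bool quote)
{
  bool comma = false;
  int i;

  for (i = skip ? 2 : 1; i < argc; i++)
    {
      if (comma)
        sketch_append (out, cap, ",");
      else
        comma = true;
      if (quote)
        sketch_append (out, cap, "`");
      sketch_append (out, cap, argv[i]);
      if (quote)
        sketch_append (out, cap, "'");
    }
}
```

With skip and quote both true this matches what shift pushes, and with
skip false it covers the $* (unquoted) and $@ (quoted) cases.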

- --
Don't work too hard, make some time for fun as well!

Eric Blake             address@hidden
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (Cygwin)
Comment: Public key at home.comcast.net/~ericblake/eblake.gpg
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHXUIp84KuGfSFAYARAr0wAKCpWquPuxSJUcLFXLlGNNa6rNzdegCg01to
4i16trk8IezCYeU4jTvb2YY=
=yfWE
-----END PGP SIGNATURE-----
From a9d31f373b211f8790ee4b79c5e85943bf9fab9e Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Sat, 8 Dec 2007 21:05:11 -0700
Subject: [PATCH] Stage 6: convert builtins to push arg at a time.

* m4/m4module.h (m4_shipout_text): Rename...
(m4_divert_text): ...to this, to avoid confusion with m4_shipout_*
that does not worry about sync lines.
(m4_shipout_string_trunc): New prototype.
* m4/output.c (m4_shipout_text): Rename...
(m4_divert_text): ...to this.
(m4_shipout_string): Move guts...
(m4_shipout_string_trunc): ...to this new function.
* m4/macro.c (m4_push_arg, m4_push_args): New functions.
(expand_token, process_macro): Update callers.
* m4/input.c (string_print): Likewise.
* modules/m4.c (ifdef, ifelse, shift, substr, translit, divert):
Likewise.
* modules/gnu.c (patsubst): Likewise.
(debuglen): Use SIZE_MAX for unlimited debug length.
* src/main.c (main): Likewise.
* m4/m4.c (m4_create): Default max_debug_length to SIZE_MAX, not
zero.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog     |   22 ++++++++++++++++++++++
 m4/input.c    |   19 ++++---------------
 m4/m4.c       |    1 +
 m4/m4module.h |   22 ++++++++++++++--------
 m4/macro.c    |   57 ++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 m4/output.c   |   39 +++++++++++++++++++++++++++++++--------
 modules/gnu.c |    6 ++++--
 modules/m4.c  |   27 ++++++++++++++++-----------
 src/main.c    |    2 ++
 9 files changed, 148 insertions(+), 47 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 07b471d..e79d63a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,25 @@
+2007-12-08  Eric Blake  <address@hidden>
+
+       Stage 6: convert builtins to push arg at a time.
+       * m4/m4module.h (m4_shipout_text): Rename...
+       (m4_divert_text): ...to this, to avoid confusion with m4_shipout_*
+       that does not worry about sync lines.
+       (m4_shipout_string_trunc): New prototype.
+       * m4/output.c (m4_shipout_text): Rename...
+       (m4_divert_text): ...to this.
+       (m4_shipout_string): Move guts...
+       (m4_shipout_string_trunc): ...to this new function.
+       * m4/macro.c (m4_push_arg, m4_push_args): New functions.
+       (expand_token, process_macro): Update callers.
+       * m4/input.c (string_print): Likewise.
+       * modules/m4.c (ifdef, ifelse, shift, substr, translit, divert):
+       Likewise.
+       * modules/gnu.c (patsubst): Likewise.
+       (debuglen): Use SIZE_MAX for unlimited debug length.
+       * src/main.c (main): Likewise.
+       * m4/m4.c (m4_create): Default max_debug_length to SIZE_MAX, not
+       zero.
+
 2007-12-07  Eric Blake  <address@hidden>
 
        Minor security fix: Quote output of mkstemp.
diff --git a/m4/input.c b/m4/input.c
index e4228a9..fd0e677 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -477,21 +477,10 @@ static void
 string_print (m4_input_block *me, m4 *context, m4_obstack *obs)
 {
   bool quote = m4_is_debug_bit (context, M4_DEBUG_TRACE_QUOTE);
-  const char *lquote = m4_get_syntax_lquote (M4SYNTAX);
-  const char *rquote = m4_get_syntax_rquote (M4SYNTAX);
   size_t arg_length = m4_get_max_debug_arg_length_opt (context);
-  const char *text = me->u.u_s.str;
-  size_t len = me->u.u_s.len;
-
-  if (arg_length && arg_length < len)
-    len = arg_length;
-  if (quote)
-    obstack_grow (obs, lquote, strlen (lquote));
-  obstack_grow (obs, text, len);
-  if (len != me->u.u_s.len)
-    obstack_grow (obs, "...", 3);
-  if (quote)
-    obstack_grow (obs, rquote, strlen (rquote));
+
+  m4_shipout_string_trunc (context, obs, me->u.u_s.str, me->u.u_s.len,
+                          quote, &arg_length);
 }
 
 /* First half of m4_push_string ().  The pointer next points to the new
@@ -543,7 +532,7 @@ m4_push_string_finish (void)
       input_change = true;
     }
   else
-    obstack_free (current_input, next); /* people might leave garbage on it. */
+    obstack_free (current_input, next);
   next = NULL;
   return ret;
 }
diff --git a/m4/m4.c b/m4/m4.c
index 7de0f86..651a943 100644
--- a/m4/m4.c
+++ b/m4/m4.c
@@ -37,6 +37,7 @@ m4_create (void)
   obstack_init (&context->trace_messages);
 
   context->nesting_limit = DEFAULT_NESTING_LIMIT;
+  context->max_debug_arg_length = SIZE_MAX;
 
   context->search_path  = xzalloc (sizeof *context->search_path);
   m4__include_init (context);
diff --git a/m4/m4module.h b/m4/m4module.h
index f70d414..0bab08f 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -312,6 +312,10 @@ extern size_t      m4_arg_len              (m4_macro_args *, unsigned int);
 extern m4_builtin_func *m4_arg_func    (m4_macro_args *, unsigned int);
 extern m4_macro_args *m4_make_argv_ref (m4_macro_args *, const char *, size_t,
                                         bool, bool);
+extern void    m4_push_arg             (m4 *, m4_obstack *, m4_macro_args *,
+                                        unsigned int);
+extern void    m4_push_args            (m4 *, m4_obstack *, m4_macro_args *,
+                                        bool, bool);
 
 
 /* --- RUNTIME DEBUGGING --- */
@@ -448,14 +452,16 @@ extern    void    m4_input_print  (m4 *, m4_obstack *, m4_input_block *);
 
 /* --- OUTPUT MANAGEMENT --- */
 
-extern void    m4_output_init    (m4 *);
-extern void    m4_output_exit    (void);
-extern void    m4_output_text    (m4 *, const char *, size_t);
-extern void    m4_shipout_text   (m4 *, m4_obstack *, const char *, size_t,
-                                  int);
-extern void    m4_shipout_int    (m4_obstack *, int);
-extern void    m4_shipout_string (m4 *, m4_obstack *, const char *,
-                                  size_t, bool);
+extern void    m4_output_init          (m4 *);
+extern void    m4_output_exit          (void);
+extern void    m4_output_text          (m4 *, const char *, size_t);
+extern void    m4_divert_text          (m4 *, m4_obstack *, const char *,
+                                        size_t, int);
+extern void    m4_shipout_int          (m4_obstack *, int);
+extern void    m4_shipout_string       (m4 *, m4_obstack *, const char *,
+                                        size_t, bool);
+extern bool    m4_shipout_string_trunc (m4 *, m4_obstack *, const char *,
+                                        size_t, bool, size_t *);
 
 extern void    m4_make_diversion    (m4 *, int);
 extern void    m4_insert_diversion  (m4 *, int);
diff --git a/m4/macro.c b/m4/macro.c
index 56d43ac..b3080eb 100644
--- a/m4/macro.c
+++ b/m4/macro.c
@@ -167,7 +167,7 @@ expand_token (m4 *context, m4_obstack *obs, m4__token_type type,
                && BIT_TEST (SYMBOL_FLAGS (symbol), VALUE_BLIND_ARGS_BIT)
                && !m4__next_token_is_open (context)))
          {
-           m4_shipout_text (context, obs, text, len, line);
+           m4_divert_text (context, obs, text, len, line);
            /* The word just output is unquoted, but we can trust the
               heuristics of safe_quote.  */
            return m4__safe_quotes (M4SYNTAX);
@@ -183,7 +183,7 @@ expand_token (m4 *context, m4_obstack *obs, m4__token_type type,
       assert (!"INTERNAL ERROR: bad token type in expand_token ()");
       abort ();
     }
-  m4_shipout_text (context, obs, text, m4_get_symbol_value_len (token), line);
+  m4_divert_text (context, obs, text, m4_get_symbol_value_len (token), line);
   return result;
 }
 
@@ -533,7 +533,7 @@ process_macro (m4 *context, m4_symbol_value *value, m4_obstack *obs,
 
        case '*':               /* all arguments */
        case '@':               /* ... same, but quoted */
-         m4_dump_args (context, obs, 1, argv, ",", *text == '@');
+         m4_push_args (context, obs, argv, false, *text == '@');
          text++;
          break;
 
@@ -950,6 +950,57 @@ m4_make_argv_ref (m4_macro_args *argv, const char *argv0, size_t argv0_len,
   return new_argv;
 }
 
+/* Push argument INDEX from ARGV, which must be a text token, onto the
+   expansion stack OBS for rescanning.  */
+void
+m4_push_arg (m4 *context, m4_obstack *obs, m4_macro_args *argv,
+            unsigned int index)
+{
+  m4_symbol_value *value;
+
+  if (index == 0)
+    {
+      obstack_grow (obs, argv->argv0, argv->argv0_len);
+      return;
+    }
+  value = m4_arg_symbol (argv, index);
+  if (value == &empty_symbol)
+    return;
+  /* TODO handle builtin tokens?  */
+  assert (value->type == M4_SYMBOL_TEXT);
+  /* TODO push a reference, rather than copying data.  */
+  obstack_grow (obs, m4_get_symbol_value_text (value),
+               m4_get_symbol_value_len (value));
+}
+
+/* Push series of comma-separated arguments from ARGV, which should
+   all be text, onto the expansion stack OBS for rescanning.  If SKIP,
+   then don't push the first argument.  If QUOTE, also push quoting
+   around each arg.  */
+void
+m4_push_args (m4 *context, m4_obstack *obs, m4_macro_args *argv, bool skip,
+             bool quote)
+{
+  m4_symbol_value *value;
+  unsigned int i;
+  bool comma = false;
+
+  /* TODO push reference, rather than copying data.  */
+  for (i = skip ? 2 : 1; i < argv->argc; i++)
+    {
+      value = m4_arg_symbol (argv, i);
+      if (comma)
+       obstack_1grow (obs, ',');
+      else
+       comma = true;
+      /* TODO handle builtin tokens?  */
+      assert (value->type == M4_SYMBOL_TEXT);
+      m4_shipout_string (context, obs, m4_get_symbol_value_text (value),
+                        m4_get_symbol_value_len (value), quote);
+    }
+}
+
+
 /* Define these last, so that earlier uses can benefit from the macros
    in m4private.h.  */
 
diff --git a/m4/output.c b/m4/output.c
index 8089073..ab46994 100644
--- a/m4/output.c
+++ b/m4/output.c
@@ -463,8 +463,8 @@ m4_output_text (m4 *context, const char *text, size_t length)
    generates several output lines, or when several input lines do not
    generate any output.  */
 void
-m4_shipout_text (m4 *context, m4_obstack *obs,
-                const char *text, size_t length, int line)
+m4_divert_text (m4 *context, m4_obstack *obs, const char *text, size_t length,
+               int line)
 {
   static bool start_of_output_line = true;
   char linebuf[6 + INT_BUFSIZE_BOUND (unsigned long int)]; /* "#line nnnn" */
@@ -590,20 +590,43 @@ void
 m4_shipout_string (m4 *context, m4_obstack *obs, const char *s, size_t len,
                   bool quoted)
 {
-  assert (obs);
-  if (s == NULL)
-    s = "";
+  m4_shipout_string_trunc (context, obs, s, len, quoted, NULL);
+}
+
+/* Output the text S, of length LEN, to OBS.  If QUOTED, also output
+   current quote characters around S.  If LEN is SIZE_MAX, use the
+   string length of S instead.  If MAX_LEN, reduce *MAX_LEN by LEN.
+   If LEN is larger than *MAX_LEN, then truncate output and return
+   true; otherwise return false.  */
+bool
+m4_shipout_string_trunc (m4 *context, m4_obstack *obs, const char *s,
+                        size_t len, bool quoted, size_t *max_len)
+{
+  size_t max = max_len ? *max_len : SIZE_MAX;
 
+  assert (obs && s);
   if (len == SIZE_MAX)
     len = strlen (s);
-
   if (quoted)
     obstack_grow (obs, context->syntax->lquote.string,
                  context->syntax->lquote.length);
-  obstack_grow (obs, s, len);
+  if (len < max)
+    {
+      obstack_grow (obs, s, len);
+      max -= len;
+    }
+  else
+    {
+      obstack_grow (obs, s, max);
+      obstack_grow (obs, "...", 3);
+      max = 0;
+    }
   if (quoted)
     obstack_grow (obs, context->syntax->rquote.string,
                  context->syntax->rquote.length);
+  if (max_len)
+    *max_len = max;
+  return max == 0;
 }
 
 
@@ -864,7 +887,7 @@ m4_freeze_diversions (m4 *context, FILE *file)
                          _("cannot stat diversion"));
              /* FIXME - support 64-bit off_t with 32-bit long, and
                 fix frozen file format to support 64-bit integers.
-                This implies fixing shipout_text to take off_t.  */
+                This implies fixing m4_divert_text to take off_t.  */
              if (file_stat.st_size < 0
                  || file_stat.st_size != (unsigned long int) file_stat.st_size)
                m4_error (context, EXIT_FAILURE, errno, NULL,
diff --git a/modules/gnu.c b/modules/gnu.c
index 3c772c5..7205727 100644
--- a/modules/gnu.c
+++ b/modules/gnu.c
@@ -557,11 +557,13 @@ M4BUILTIN_HANDLER (debugfile)
 M4BUILTIN_HANDLER (debuglen)
 {
   int i;
+  size_t s;
   if (!m4_numeric_arg (context, M4ARG (0), M4ARG (1), &i))
     return;
   /* FIXME - make m4_numeric_arg more powerful - we want to accept
      suffixes, and limit the result to size_t.  */
-  m4_set_max_debug_arg_length_opt (context, i);
+  s = i <= 0 ? SIZE_MAX : i;
+  m4_set_max_debug_arg_length_opt (context, s);
 }
 
 /* On-the-fly control of the format of the tracing output.  It takes one
@@ -739,7 +741,7 @@ M4BUILTIN_HANDLER (patsubst)
      replacement, we need not waste time with it.  */
   if (!*pattern && !*replace)
     {
-      obstack_grow (obs, M4ARG (1), m4_arg_len (argv, 1));
+      m4_push_arg (context, obs, argv, 1);
       return;
     }
 
diff --git a/modules/m4.c b/modules/m4.c
index 8d9bb9a..1cad9cb 100644
--- a/modules/m4.c
+++ b/modules/m4.c
@@ -223,8 +223,8 @@ M4BUILTIN_HANDLER (popdef)
 
 M4BUILTIN_HANDLER (ifdef)
 {
-  unsigned int index = m4_symbol_lookup (M4SYMTAB, M4ARG (1)) ? 2 : 3;
-  obstack_grow (obs, M4ARG (index), m4_arg_len (argv, index));
+  m4_push_arg (context, obs, argv,
+              m4_symbol_lookup (M4SYMTAB, M4ARG (1)) ? 2 : 3);
 }
 
 M4BUILTIN_HANDLER (ifelse)
@@ -243,11 +243,11 @@ M4BUILTIN_HANDLER (ifelse)
   index = 1;
   argc--;
 
-  while (1)
+  while (true)
     {
       if (m4_arg_equal (argv, index, index + 1))
        {
-         obstack_grow (obs, M4ARG (index + 2), m4_arg_len (argv, index + 2));
+         m4_push_arg (context, obs, argv, index + 2);
          return;
        }
       switch (argc)
@@ -257,7 +257,7 @@ M4BUILTIN_HANDLER (ifelse)
 
        case 4:
        case 5:
-         obstack_grow (obs, M4ARG (index + 3), m4_arg_len (argv, index + 3));
+         m4_push_arg (context, obs, argv, index + 3);
          return;
 
        default:
@@ -561,8 +561,8 @@ M4BUILTIN_HANDLER (divert)
   if (argc >= 2 && !m4_numeric_arg (context, M4ARG (0), M4ARG (1), &i))
     return;
   m4_make_diversion (context, i);
-  m4_shipout_text (context, NULL, M4ARG (2), m4_arg_len (argv, 2),
-                  m4_get_current_line (context));
+  m4_divert_text (context, NULL, M4ARG (2), m4_arg_len (argv, 2),
+                 m4_get_current_line (context));
 }
 
 /* Expand to the current diversion number.  */
@@ -625,7 +625,7 @@ M4BUILTIN_HANDLER (dnl)
    output argument is quoted with the current quotes.  */
 M4BUILTIN_HANDLER (shift)
 {
-  m4_dump_args (context, obs, 2, argv, ",", true);
+  m4_push_args (context, obs, argv, true, true);
 }
 
 /* Change the current quotes.  The function set_quotes () lives in
@@ -716,9 +716,8 @@ m4_make_temp (m4 *context, m4_obstack *obs, const char *macro,
   obstack_grow (obs, tmp, qlen);
   obstack_grow (obs, pattern, len);
   for (i = 0; len > 0 && i < 6; i++)
-    if (pattern[len - i - 1] != 'X')
+    if (pattern[--len] != 'X')
       break;
-  len += 6 - i;
   obstack_grow0 (obs, "XXXXXX", 6 - i);
   name = (char *) obstack_base (obs) + qlen;
 
@@ -936,7 +935,7 @@ M4BUILTIN_HANDLER (substr)
 
   if (argc <= 2)
     {
-      obstack_grow (obs, str, m4_arg_len (argv, 1));
+      m4_push_arg (context, obs, argv, 1);
       return;
     }
 
@@ -1012,6 +1011,12 @@ M4BUILTIN_HANDLER (translit)
   char found[256] = {0};
   unsigned char ch;
 
+  if (argc <= 2)
+    {
+      m4_push_arg (context, obs, argv, 1);
+      return;
+    }
+
   from = M4ARG (2);
   if (strchr (from, '-') != NULL)
     {
diff --git a/src/main.c b/src/main.c
index 7c35e64..344db58 100644
--- a/src/main.c
+++ b/src/main.c
@@ -558,6 +558,8 @@ main (int argc, char *const *argv, char *const *envp)
          /* fall through */
        case 'l':
          size = size_opt (optarg, oi, optchar);
+         if (!size)
+           size = SIZE_MAX;
          m4_set_max_debug_arg_length_opt (context, size);
          break;
 
-- 
1.5.3.5

From d28166a2233b32f0f37bdd486a590a814209b765 Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Thu, 25 Oct 2007 08:27:28 -0600
Subject: [PATCH] Stage 6: convert builtins to push arg at a time.

* src/m4.h (includes): Include <limits.h> here, instead of in
individual files.
(input_block): New typedef.
(trace_pre, trace_post, push_string_finish): Update prototypes.
(obstack_print, input_print, push_arg, push_args): New
prototypes.
* src/input.c (push_string_finish): Change return type.
(input_print): New function.
* src/debug.c (trace_format): Add %B specifier, and use new
function.
(trace_pre): Remove redundant argc parameter.
(trace_post): Likewise, and change signature.
(obstack_print): New function.
* src/macro.c (expand_macro): Update caller.
(push_arg, push_args): New functions.
* src/builtin.c (m4_ifdef, m4_ifelse, m4_shift, m4_substr)
(m4_patsubst, expand_user_macro): Use new functions.
(mkstemp_helper, m4_maketemp): Avoid extra trailing NULs.
* src/m4.c (max_debug_argument_length, main): Set to INT_MAX, not
0, for unlimited.
* src/output.c: Update includes.
* src/symtab.c: Likewise.

(cherry picked from commit 6dcf7d2e3c5deac2d16ee9a29b6a307474603dc7)

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog     |   26 ++++++++++++++++++
 src/builtin.c |   53 ++++++++++++-------------------------
 src/debug.c   |   70 +++++++++++++++++++++++++++++++++++--------------
 src/input.c   |   80 ++++++++++++++++++++++++++++++++++++++++----------------
 src/m4.c      |    5 +--
 src/m4.h      |   14 ++++++++--
 src/macro.c   |   56 +++++++++++++++++++++++++++++++++++++--
 src/output.c  |    1 -
 src/symtab.c  |    1 -
 9 files changed, 216 insertions(+), 90 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 8838ac5..c6f5b46 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,29 @@
+2007-12-10  Eric Blake  <address@hidden>
+
+       Stage 6: convert builtins to push arg at a time.
+       * src/m4.h (includes): Include <limits.h> here, instead of in
+       individual files.
+       (input_block): New typedef.
+       (trace_pre, trace_post, push_string_finish): Update prototypes.
+       (obstack_print, input_print, push_arg, push_args): New
+       prototypes.
+       * src/input.c (push_string_finish): Change return type.
+       (input_print): New function.
+       * src/debug.c (trace_format): Add %B specifier, and use new
+       function.
+       (trace_pre): Remove redundant argc parameter.
+       (trace_post): Likewise, and change signature.
+       (obstack_print): New function.
+       * src/macro.c (expand_macro): Update caller.
+       (push_arg, push_args): New functions.
+       * src/builtin.c (m4_ifdef, m4_ifelse, m4_shift, m4_substr)
+       (m4_patsubst, expand_user_macro): Use new functions.
+       (mkstemp_helper, m4_maketemp): Avoid extra trailing NULs.
+       * src/m4.c (max_debug_argument_length, main): Set to INT_MAX, not
+       0, for unlimited.
+       * src/output.c: Update includes.
+       * src/symtab.c: Likewise.
+
 2007-12-07  Eric Blake  <address@hidden>
 
        Minor security fix: Quote output of mkstemp.
diff --git a/src/builtin.c b/src/builtin.c
index 87f8c2f..e8edc4b 100644
--- a/src/builtin.c
+++ b/src/builtin.c
@@ -731,28 +731,11 @@ static void
 m4_ifdef (struct obstack *obs, int argc, macro_arguments *argv)
 {
   symbol *s;
-  const char *result;
-  size_t len = 0;
 
   if (bad_argc (ARG (0), argc, 2, 3))
     return;
   s = lookup_symbol (ARG (1), SYMBOL_LOOKUP);
-
-  if (s != NULL && SYMBOL_TYPE (s) != TOKEN_VOID)
-    {
-      result = ARG (2);
-      len = arg_len (argv, 2);
-    }
-  else if (argc >= 4)
-    {
-      result = ARG (3);
-      len = arg_len (argv, 3);
-    }
-  else
-    result = NULL;
-
-  if (result != NULL)
-    obstack_grow (obs, result, len);
+  push_arg (obs, argv, (s && SYMBOL_TYPE (s) != TOKEN_VOID) ? 2 : 3);
 }
 
 static void
@@ -774,7 +757,7 @@ m4_ifelse (struct obstack *obs, int argc, macro_arguments *argv)
     {
       if (arg_equal (argv, index, index + 1))
        {
-         obstack_grow (obs, ARG (index + 2), arg_len (argv, index + 2));
+         push_arg (obs, argv, index + 2);
          return;
        }
       switch (argc)
@@ -784,7 +767,7 @@ m4_ifelse (struct obstack *obs, int argc, macro_arguments *argv)
 
        case 4:
        case 5:
-         obstack_grow (obs, ARG (index + 3), arg_len (argv, index + 3));
+         push_arg (obs, argv, index + 3);
          return;
 
        default:
@@ -1173,7 +1156,6 @@ m4_eval (struct obstack *obs, int argc, macro_arguments *argv)
        obstack_1grow (obs, '0');
       while (value-- != 0)
        obstack_1grow (obs, '1');
-      obstack_1grow (obs, '\0');
       return;
     }
 
@@ -1323,8 +1305,7 @@ m4_shift (struct obstack *obs, int argc, macro_arguments *argv)
 {
   if (bad_argc (ARG (0), argc, 1, -1))
     return;
-  /* TODO push a $@ reference.  */
-  dump_args (obs, 2, argv, ",", true);
+  push_args (obs, argv, true, true);
 }
 
 /*--------------------------------------------------------------------------.
@@ -1450,9 +1431,8 @@ mkstemp_helper (struct obstack *obs, const char *me, const char *pattern,
   obstack_grow (obs, lquote.string, lquote.length);
   obstack_grow (obs, pattern, len);
   for (i = 0; len > 0 && i < 6; i++)
-    if (pattern[len - i - 1] != 'X')
+    if (pattern[--len] != 'X')
       break;
-  len += 6 - i;
   obstack_grow0 (obs, "XXXXXX", 6 - i);
   name = (char *) obstack_base (obs) + lquote.length;
 
@@ -1505,7 +1485,7 @@ m4_maketemp (struct obstack *obs, int argc, macro_arguments *argv)
       str = ntoa ((int32_t) getpid (), 10);
       len2 = strlen (str);
       if (len2 > len - i)
-       obstack_grow0 (obs, str + len2 - (len - i), len - i);
+       obstack_grow (obs, str + len2 - (len - i), len - i);
       else
        {
          while (i++ < len - len2)
@@ -1823,7 +1803,7 @@ m4_substr (struct obstack *obs, int argc, macro_arguments *argv)
     {
       /* builtin(`substr') is blank, but substr(`abc') is abc.  */
       if (argc == 2)
-       obstack_grow (obs, ARG (1), arg_len (argv, 1));
+       push_arg (obs, argv, 1);
       return;
     }
 
@@ -1909,7 +1889,7 @@ m4_translit (struct obstack *obs, int argc, macro_arguments *argv)
     {
       /* builtin(`translit') is blank, but translit(`abc') is abc.  */
       if (argc == 2)
-       obstack_grow (obs, ARG (1), arg_len (argv, 1));
+       push_arg (obs, argv, 1);
       return;
     }
 
@@ -2146,7 +2126,7 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
     {
       /* builtin(`patsubst') is blank, but patsubst(`abc') is abc.  */
       if (argc == 2)
-       obstack_grow (obs, ARG (1), arg_len (argv, 1));
+       push_arg (obs, argv, 1);
       return;
     }
 
@@ -2158,7 +2138,7 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
      replacement, we need not waste time with it.  */
   if (!*regexp && !*repl)
     {
-      obstack_grow (obs, victim, arg_len (argv, 1));
+      push_arg (obs, argv, 1);
       return;
     }
 
@@ -2212,9 +2192,12 @@ m4_patsubst (struct obstack *obs, int argc, macro_arguments *argv)
 
       offset = regs->end[0];
       if (regs->start[0] == regs->end[0])
-       obstack_1grow (obs, victim[offset++]);
+       {
+         if (offset < length)
+           obstack_1grow (obs, victim[offset]);
+         offset++;
+       }
     }
-  obstack_1grow (obs, '\0');
 }
 
 /* Finally, a placeholder builtin.  This builtin is not installed by
@@ -2276,8 +2259,7 @@ expand_user_macro (struct obstack *obs, symbol *sym,
              for (i = 0; isdigit (to_uchar (*text)); text++)
                i = i * 10 + (*text - '0');
            }
-         if (i < argc)
-           obstack_grow (obs, ARG (i), arg_len (argv, i));
+         push_arg (obs, argv, i);
          break;
 
        case '#':               /* number of arguments */
@@ -2287,8 +2269,7 @@ expand_user_macro (struct obstack *obs, symbol *sym,
 
        case '*':               /* all arguments */
        case '@':               /* ... same, but quoted */
-         /* TODO push a $@ reference.  */
-         dump_args (obs, 1, argv, ",", *text == '@');
+         push_args (obs, argv, false, *text == '@');
          text++;
          break;
 
diff --git a/src/debug.c b/src/debug.c
index c4f701d..2ca7a0d 100644
--- a/src/debug.c
+++ b/src/debug.c
@@ -239,21 +239,21 @@ debug_message_prefix (void)
    output from interfering with other debug messages generated by the
    various builtins.  */
 
-/*---------------------------------------------------------------------.
-| Tracing output is formatted here, by a simplified printf-to-obstack  |
-| function trace_format ().  Understands only %S, %s, %d, %l (optional |
-| left quote) and %r (optional right quote).                          |
-`---------------------------------------------------------------------*/
+/*-------------------------------------------------------------------.
+| Tracing output to the obstack is formatted here, by a simplified   |
+| printf-like function trace_format ().  Understands only %B (1 arg: |
+| input block), %S (1 arg: length-limited text), %s (1 arg: text),   |
+| %d (1 arg: integer), %l (0 args: optional left quote) and %r (0    |
+| args: optional right quote).                                       |
+`-------------------------------------------------------------------*/
 
 static void
 trace_format (const char *fmt, ...)
 {
   va_list args;
   char ch;
-
   int d;
   const char *s;
-  int slen;
   int maxlen;
 
   va_start (args, fmt);
@@ -266,9 +266,14 @@ trace_format (const char *fmt, ...)
       if (ch == '\0')
        break;
 
-      maxlen = 0;
+      maxlen = INT_MAX;
       switch (*fmt++)
        {
+       case 'B':
+         s = "";
+         input_print (&trace, va_arg (args, input_block *));
+         break;
+
        case 'S':
          maxlen = max_debug_argument_length;
          /* fall through */
@@ -295,14 +300,8 @@ trace_format (const char *fmt, ...)
          break;
        }
 
-      slen = strlen (s);
-      if (maxlen == 0 || maxlen > slen)
-       obstack_grow (&trace, s, slen);
-      else
-       {
-         obstack_grow (&trace, s, maxlen);
-         obstack_grow (&trace, "...", 3);
-       }
+      if (obstack_print (&trace, s, SIZE_MAX, &maxlen))
+       break;
     }
 
   va_end (args);
@@ -362,10 +361,11 @@ trace_prepre (const char *name, int id)
 `-----------------------------------------------------------------------*/
 
 void
-trace_pre (const char *name, int id, int argc, macro_arguments *argv)
+trace_pre (const char *name, int id, macro_arguments *argv)
 {
   int i;
   const builtin *bp;
+  int argc = arg_argc (argv);
 
   trace_header (id);
   trace_format ("%s", name);
@@ -417,9 +417,11 @@ trace_pre (const char *name, int id, int argc, macro_arguments *argv)
 `-------------------------------------------------------------------*/
 
 void
-trace_post (const char *name, int id, int argc, macro_arguments *argv,
-           const char *expanded)
+trace_post (const char *name, int id, macro_arguments *argv,
+           const input_block *expanded)
 {
+  int argc = arg_argc (argv);
+
   if (debug_level & DEBUG_TRACE_CALL)
     {
       trace_header (id);
@@ -427,6 +429,34 @@ trace_post (const char *name, int id, int argc, macro_arguments *argv,
     }
 
   if (expanded && (debug_level & DEBUG_TRACE_EXPANSION))
-    trace_format (" -> %l%S%r", expanded);
+    trace_format (" -> %l%B%r", expanded);
   trace_flush ();
 }
+
+/* Dump the string STR of length LEN to the obstack OBS.  If LEN is
+   SIZE_MAX, use strlen (STR) instead.  If MAX_LEN is non-NULL,
+   truncate the dump at MAX_LEN bytes and return true if MAX_LEN was
+   reached; otherwise, return false and update MAX_LEN as
+   appropriate.  */
+bool
+obstack_print (struct obstack *obs, const char *str, size_t len, int *max_len)
+{
+  int max = max_len ? *max_len : INT_MAX;
+
+  if (len == SIZE_MAX)
+    len = strlen (str);
+  if (len < max)
+    {
+      obstack_grow (obs, str, len);
+      max -= len;
+    }
+  else
+    {
+      obstack_grow (obs, str, max);
+      obstack_grow (obs, "...", 3);
+      max = 0;
+    }
+  if (max_len)
+    *max_len = max;
+  return max == 0;
+}
diff --git a/src/input.c b/src/input.c
index 551b43d..4e5d299 100644
--- a/src/input.c
+++ b/src/input.c
@@ -77,7 +77,7 @@ typedef enum input_type input_type;
 /* A block of input to be scanned.  */
 struct input_block
 {
-  struct input_block *prev;    /* Previous input_block on the input stack.  */
+  input_block *prev;           /* Previous input_block on the input stack.  */
   input_type type;             /* See enum values.  */
   const char *file;            /* File where this input is from.  */
   int line;                    /* Line where this input is from.  */
@@ -101,8 +101,6 @@ struct input_block
   u;
 };
 
-typedef struct input_block input_block;
-
 
 /* Current input file name.  */
 const char *current_file;
@@ -208,8 +206,7 @@ push_file (FILE *fp, const char *title, bool close)
   if (debug_level & DEBUG_TRACE_INPUT)
     DEBUG_MESSAGE1 ("input read from %s", title);
 
-  i = (input_block *) obstack_alloc (current_input,
-                                    sizeof (struct input_block));
+  i = (input_block *) obstack_alloc (current_input, sizeof *i);
   i->type = INPUT_FILE;
   i->file = (char *) obstack_copy0 (&file_names, title, strlen (title));
   i->line = 1;
@@ -242,8 +239,7 @@ push_macro (builtin_func *func)
       next = NULL;
     }
 
-  i = (input_block *) obstack_alloc (current_input,
-                                    sizeof (struct input_block));
+  i = (input_block *) obstack_alloc (current_input, sizeof *i);
   i->type = INPUT_MACRO;
   i->file = current_file;
   i->line = current_line;
@@ -267,8 +263,7 @@ push_string_init (void)
   while (isp && pop_input (false));
 
   /* Reserve the next location on the obstack.  */
-  next = (input_block *) obstack_alloc (current_input,
-                                       sizeof (struct input_block));
+  next = (input_block *) obstack_alloc (current_input, sizeof *next);
   next->type = INPUT_STRING;
   next->file = current_file;
   next->line = current_line;
@@ -281,30 +276,35 @@ push_string_init (void)
 | push_file () or push_macro () has invalidated the previous call to |
 | push_string_init (), so we just give up.  If the new object is     |
 | void, we do not push it.  The function push_string_finish ()       |
-| returns a pointer to the finished object.  This pointer is only    |
-| for temporary use, since reading the next token might release the  |
-| memory used for the object.                                        |
+| returns an opaque pointer to the finished object, which can then   |
+| be printed with input_print when tracing is enabled.  This pointer |
+| is only for temporary use, since reading the next token will       |
+| invalidate the object.                                             |
 `-------------------------------------------------------------------*/
 
-const char *
+const input_block *
 push_string_finish (void)
 {
-  const char *ret = NULL;
+  input_block *ret = NULL;
+  size_t len = obstack_object_size (current_input);
 
   if (next == NULL)
-    return NULL;
+    {
+      assert (!len);
+      return NULL;
+    }
 
-  if (obstack_object_size (current_input) > 0)
+  if (len)
     {
       obstack_1grow (current_input, '\0');
       next->u.u_s.string = (char *) obstack_finish (current_input);
       next->prev = isp;
       isp = next;
-      ret = isp->u.u_s.string; /* for immediate use only */
       input_change = true;
+      ret = isp;
     }
   else
-    obstack_free (current_input, next); /* people might leave garbage on it. */
+    obstack_free (current_input, next);
   next = NULL;
   return ret;
 }
@@ -322,8 +322,7 @@ void
 push_wrapup (const char *s)
 {
   input_block *i;
-  i = (input_block *) obstack_alloc (wrapup_stack,
-                                    sizeof (struct input_block));
+  i = (input_block *) obstack_alloc (wrapup_stack, sizeof *i);
   i->prev = wsp;
   i->type = INPUT_STRING;
   i->file = current_file;
@@ -421,7 +420,7 @@ pop_wrapup (void)
     }
 
   current_input = wrapup_stack;
-  wrapup_stack = (struct obstack *) xmalloc (sizeof (struct obstack));
+  wrapup_stack = (struct obstack *) xmalloc (sizeof *wrapup_stack);
   obstack_init (wrapup_stack);
 
   isp = wsp;
@@ -443,6 +442,41 @@ init_macro_token (token_data *td)
   TOKEN_DATA_TYPE (td) = TOKEN_FUNC;
   TOKEN_DATA_FUNC (td) = isp->u.func;
 }
+
+/*--------------------------------------------------------------.
+| Dump a representation of INPUT to the obstack OBS, for use in |
+| tracing.                                                      |
+`--------------------------------------------------------------*/
+void
+input_print (struct obstack *obs, const input_block *input)
+{
+  int maxlen = max_debug_argument_length;
+
+  assert (input);
+  switch (input->type)
+    {
+    case INPUT_STRING:
+      obstack_print (obs, input->u.u_s.string, SIZE_MAX, &maxlen);
+      break;
+    case INPUT_FILE:
+      obstack_grow (obs, "<file: ", strlen ("<file: "));
+      obstack_grow (obs, input->file, strlen (input->file));
+      obstack_1grow (obs, '>');
+      break;
+    case INPUT_MACRO:
+      {
+       const builtin *bp = find_builtin_by_addr (input->u.func);
+       assert (bp);
+       obstack_1grow (obs, '<');
+       obstack_grow (obs, bp->name, strlen (bp->name));
+       obstack_1grow (obs, '>');
+      }
+      break;
+    default:
+      assert (!"input_print");
+      abort ();
+    }
+}
 
 
 /*-----------------------------------------------------------------.
@@ -680,9 +714,9 @@ input_init (void)
   current_file = "";
   current_line = 0;
 
-  current_input = (struct obstack *) xmalloc (sizeof (struct obstack));
+  current_input = (struct obstack *) xmalloc (sizeof *current_input);
   obstack_init (current_input);
-  wrapup_stack = (struct obstack *) xmalloc (sizeof (struct obstack));
+  wrapup_stack = (struct obstack *) xmalloc (sizeof *wrapup_stack);
   obstack_init (wrapup_stack);
 
   obstack_init (&file_names);
diff --git a/src/m4.c b/src/m4.c
index 0c7f33f..2cfed19 100644
--- a/src/m4.c
+++ b/src/m4.c
@@ -22,7 +22,6 @@
 #include "m4.h"
 
 #include <getopt.h>
-#include <limits.h>
 #include <signal.h>
 #include <stdarg.h>
 
@@ -48,7 +47,7 @@ int no_gnu_extensions = 0;
 int prefix_all_builtins = 0;
 
 /* Max length of arguments in trace output (-lsize).  */
-int max_debug_argument_length = 0;
+int max_debug_argument_length = INT_MAX;
 
 /* Suppress warnings about missing arguments.  */
 int suppress_warnings = 0;
@@ -553,7 +552,7 @@ main (int argc, char *const *argv, char *const *envp)
       case 'l':
        max_debug_argument_length = atoi (optarg);
        if (max_debug_argument_length <= 0)
-         max_debug_argument_length = 0;
+         max_debug_argument_length = INT_MAX;
        break;
 
       case 'o':
diff --git a/src/m4.h b/src/m4.h
index d7b6e08..f7b0d37 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -28,6 +28,7 @@
 #include <assert.h>
 #include <ctype.h>
 #include <errno.h>
+#include <limits.h>
 #include <stdbool.h>
 #include <stdint.h>
 #include <string.h>
@@ -96,6 +97,7 @@ typedef struct string STRING;
    (OBS)->object_base = (char *) (OBJECT))
 
 /* These must come first.  */
+typedef struct input_block input_block;
 typedef struct token_data token_data;
 typedef struct macro_arguments macro_arguments;
 typedef void builtin_func (struct obstack *, int, macro_arguments *);
@@ -245,8 +247,11 @@ bool debug_set_output (const char *, const char *);
 void debug_message_prefix (void);
 
 void trace_prepre (const char *, int);
-void trace_pre (const char *, int, int, macro_arguments *);
-void trace_post (const char *, int, int, macro_arguments *, const char *);
+void trace_pre (const char *, int, macro_arguments *);
+void trace_post (const char *, int, macro_arguments *,
+                const input_block *);
+
+bool obstack_print (struct obstack *, const char *, size_t, int *);
 
 /* File: input.c  --- lexical definitions.  */
 
@@ -341,9 +346,10 @@ void skip_line (const char *);
 void push_file (FILE *, const char *, bool);
 void push_macro (builtin_func *);
 struct obstack *push_string_init (void);
-const char *push_string_finish (void);
+const input_block *push_string_finish (void);
 void push_wrapup (const char *);
 bool pop_wrapup (void);
+void input_print (struct obstack *, const input_block *);
 
 /* current input file, and line */
 extern const char *current_file;
@@ -447,6 +453,8 @@ size_t arg_len (macro_arguments *, unsigned int);
 builtin_func *arg_func (macro_arguments *, unsigned int);
 macro_arguments *make_argv_ref (macro_arguments *, const char *, size_t,
                                bool, bool);
+void push_arg (struct obstack *, macro_arguments *, unsigned int);
+void push_args (struct obstack *, macro_arguments *, bool, bool);
 
 
 /* File: builtin.c  --- builtins.  */
diff --git a/src/macro.c b/src/macro.c
index ec43bc1..e5c7099 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -415,7 +415,7 @@ expand_macro (symbol *sym)
   unsigned int argv_size;      /* Size of argv_stack on entry.  */
   macro_arguments *argv;
   struct obstack *expansion;
-  const char *expanded;
+  const input_block *expanded;
   bool traced;
   int my_call_id;
 
@@ -459,14 +459,14 @@ expand_macro (symbol *sym)
   current_line = loc_open_line;
 
   if (traced)
-    trace_pre (SYMBOL_NAME (sym), my_call_id, argv->argc, argv);
+    trace_pre (SYMBOL_NAME (sym), my_call_id, argv);
 
   expansion = push_string_init ();
   call_macro (sym, argv->argc, argv, expansion);
   expanded = push_string_finish ();
 
   if (traced)
-    trace_post (SYMBOL_NAME (sym), my_call_id, argv->argc, argv, expanded);
+    trace_post (SYMBOL_NAME (sym), my_call_id, argv, expanded);
 
   current_file = loc_close_file;
   current_line = loc_close_line;
@@ -709,3 +709,53 @@ make_argv_ref (macro_arguments *argv, const char *argv0, size_t argv0_len,
   new_argv->quote_age = argv->quote_age;
   return new_argv;
 }
+
+/* Push argument INDEX from ARGV, which must be a text token, onto the
+   expansion stack OBS for rescanning.  */
+void
+push_arg (struct obstack *obs, macro_arguments *argv, unsigned int index)
+{
+  token_data *token;
+
+  if (index == 0)
+    {
+      obstack_grow (obs, argv->argv0, argv->argv0_len);
+      return;
+    }
+  if (index >= argv->argc)
+    return;
+  token = arg_token (argv, index);
+  /* TODO handle func tokens?  */
+  assert (TOKEN_DATA_TYPE (token) == TOKEN_TEXT);
+  /* TODO push a reference, rather than copying data.  */
+  obstack_grow (obs, TOKEN_DATA_TEXT (token), TOKEN_DATA_LEN (token));
+}
+
+/* Push series of comma-separated arguments from ARGV, which should
+   all be text, onto the expansion stack OBS for rescanning.  If SKIP,
+   then don't push the first argument.  If QUOTE, the rescan also
+   includes quoting around each arg.  */
+void
+push_args (struct obstack *obs, macro_arguments *argv, bool skip, bool quote)
+{
+  token_data *token;
+  unsigned int i;
+  bool comma = false;
+
+  /* TODO push reference, rather than copying data.  */
+  for (i = skip ? 2 : 1; i < argv->argc; i++)
+    {
+      token = arg_token (argv, i);
+      if (comma)
+       obstack_1grow (obs, ',');
+      else
+       comma = true;
+      /* TODO handle func tokens?  */
+      assert (TOKEN_DATA_TYPE (token) == TOKEN_TEXT);
+      if (quote)
+       obstack_grow (obs, lquote.string, lquote.length);
+      obstack_grow (obs, TOKEN_DATA_TEXT (token), TOKEN_DATA_LEN (token));
+      if (quote)
+       obstack_grow (obs, rquote.string, rquote.length);
+    }
+}
diff --git a/src/output.c b/src/output.c
index 478d3b2..4c8c9de 100644
--- a/src/output.c
+++ b/src/output.c
@@ -21,7 +21,6 @@
 
 #include "m4.h"
 
-#include <limits.h>
 #include <sys/stat.h>
 
 #include "gl_avltree_oset.h"
diff --git a/src/symtab.c b/src/symtab.c
index d65d4c5..e8a027f 100644
--- a/src/symtab.c
+++ b/src/symtab.c
@@ -32,7 +32,6 @@
    will then always be the first found.  */
 
 #include "m4.h"
-#include <limits.h>
 
 #ifdef DEBUG_SYM
 /* When evaluating hash table performance, this profiling code shows
-- 
1.5.3.5
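
For readers who want to try the length-limiting contract of obstack_print outside of m4, here is a minimal sketch. It is not part of the patch: struct text_buf, buf_grow and buf_print are hypothetical names, and a fixed-size buffer stands in for the obstack so that the example compiles without obstack.h. The branch logic is meant to mirror the function added to src/debug.c above, with explicit casts added for the size_t/int comparison.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an obstack: a small fixed-size buffer
   that is kept NUL-terminated after every append.  */
struct text_buf
{
  char data[256];
  size_t used;
};

/* Append LEN bytes of STR to BUF.  */
void
buf_grow (struct text_buf *buf, const char *str, size_t len)
{
  assert (buf->used + len < sizeof buf->data);
  memcpy (buf->data + buf->used, str, len);
  buf->used += len;
  buf->data[buf->used] = '\0';
}

/* Sketch of the obstack_print contract.  LEN of SIZE_MAX means
   "use strlen (STR)".  A NULL MAX_LEN means no limit.  Otherwise
   *MAX_LEN is the remaining byte budget (assumed non-negative): it
   is decremented by what was printed, or set to 0 once the budget is
   used up, in which case "..." is appended and true is returned.  */
bool
buf_print (struct text_buf *buf, const char *str, size_t len, int *max_len)
{
  int max = max_len ? *max_len : INT_MAX;

  if (len == SIZE_MAX)
    len = strlen (str);
  if (len < (size_t) max)
    {
      /* Everything fits; charge LEN bytes against the budget.  */
      buf_grow (buf, str, len);
      max -= (int) len;
    }
  else
    {
      /* Budget reached (including an exact fit): print what is
         allowed, then the truncation marker.  */
      buf_grow (buf, str, (size_t) max);
      buf_grow (buf, "...", 3);
      max = 0;
    }
  if (max_len)
    *max_len = max;
  return max == 0;
}
```

Because the budget is passed by pointer, a caller can chain several calls against one limit and stop at the first call that returns true. With this scheme "unlimited" is simply INT_MAX, so no special case for 0 is needed.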

