m4-patches
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

fix frozen diversions on master branch


From: Eric Blake
Subject: fix frozen diversions on master branch
Date: Tue, 13 May 2008 17:39:14 +0000 (UTC)
User-agent: Loom/3.14 (http://gmane.org/)

Jim's recent complaint about not detecting numeric overflow made me realize 
that the master branch has been broken for more than a year when it comes to 
freezing a diversion that contains a \.  Fixed as follows.

I'm also thinking of changing the frozen file format 2 to require newlines 
between strings.  For example, we now have:

F3,3,2
dnldnlm4

as the designator for a macro named 'dnl', pointing to the builtin 'dnl' in 
module 'm4'.  But if we instead used:

F3,3,2
dnl
dnl
m4

then I could use getndelim2 (rapidly searching for the next \ escape or the 
next newline), instead of fgetc (one byte at a time), resulting in a speedup by 
processing in blocks instead of bytes (I recently posted a 40% speedup to 
coreutils' cut doing just that).  Parsing a frozen file isn't necessarily m4's 
biggest bottleneck, but when you consider that autoconf.m4f from autoconf 2.62 
is over 400k, improving the reload speed will probably be noticeable.

I also realized that because this patch uses the quotearg module, and quotearg 
does not pad out a trailing NUL as '\000', then without the newline between 
strings, this would be ambiguous (macro '\0' with content '\1', or macro '\1' 
with content '\n'?) once argv_ref patch 24 allows NUL in macro names:

T1,1
\01

I also noticed that Gary's include-dso branch mentions the idea of attempting 
the .m4f suffix on an included file; before that will work, we need to modify 
the reload engine to be able to reload more than one file and at more than just 
m4 initialization.  Perhaps this means modifying the version 2 frozen file 
format to include a magic number in the first few bytes, so we can tell 
(independently of file suffix) whether a given file is likely to be a frozen 
file rather than normal text?

>From 4fab2788576b9e10374138be453620a05c95c7be Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 13 May 2008 06:25:46 -0600
Subject: [PATCH] Improve error message when frozen file is invalid.

* src/freeze.c (decode_char): Add parameter.  Allow \<newline>
line continuations.
(reload_frozen_state): Track current line.
* tests/freeze.at (loading format 1, loading format 2): Update to
test this.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog       |    9 ++++++++
 src/freeze.c    |   57 ++++++++++++++++++++++++++++++++++++++++++++++--------
 tests/freeze.at |   39 ++++++++++++++++++++++++++++---------
 3 files changed, 86 insertions(+), 19 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 05234af..4bee882 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2008-05-13  Eric Blake  <address@hidden>
+
+       Improve error message when frozen file is invalid.
+       * src/freeze.c (decode_char): Add parameter.  Allow \<newline>
+       line continuations.
+       (reload_frozen_state): Track current line.
+       * tests/freeze.at (loading format 1, loading format 2): Update to
+       test this.
+
 2008-05-10  Eric Blake  <address@hidden>
 
        Detect integer overflow when loading frozen file.
diff --git a/src/freeze.c b/src/freeze.c
index 186b69b..897a246 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -34,7 +34,7 @@ static        void  produce_symbol_dump       (m4 *, FILE *, 
m4_symbol_table *);
 static void *dump_symbol_CB            (m4_symbol_table *, const char *,
                                         m4_symbol *, void *);
 static void  issue_expect_message      (m4 *, int);
-static int   decode_char               (FILE *);
+static int   decode_char               (m4 *, FILE *, bool *);
 
 
 /* Dump an ASCII-encoded representation of LEN bytes at MEM to FILE.
@@ -294,13 +294,19 @@ issue_expect_message (m4 *context, int expected)
    unrecognized escape sequence.  */
 
 static int
-decode_char (FILE *in)
+decode_char (m4 *context, FILE *in, bool *advance_line)
 {
   int ch = getc (in);
   int next;
   int value = 0;
 
-  if (ch == '\\')
+  if (*advance_line)
+    {
+      m4_set_current_line (context, m4_get_current_line (context) + 1);
+      *advance_line = false;
+    }
+
+  while (ch == '\\')
     {
       ch = getc (in);
       switch (ch)
@@ -314,6 +320,11 @@ decode_char (FILE *in)
        case 'v': return '\v';
        case '\\': return '\\';
 
+       case '\n':
+         ch = getc (in);
+         m4_set_current_line (context, m4_get_current_line (context) + 1);
+         continue;
+
        case 'x': case 'X':
          next = getc (in);
          if (next >= '0' && next <= '9')
@@ -360,6 +371,8 @@ decode_char (FILE *in)
        }
     }
 
+  if (ch == '\n')
+    *advance_line = true;
   return ch;
 }
 
@@ -378,9 +391,22 @@ reload_frozen_state (m4 *context, const char *name)
   char *string[3];
   size_t allocated[3];
   int number[3] = {0};
-
-#define GET_CHARACTER                                          \
-  (character = getc (file))
+  bool advance_line = true;
+
+#define GET_CHARACTER                                                  \
+  do                                                                   \
+    {                                                                  \
+      if (advance_line)                                                
        \
+       {                                                               \
+         m4_set_current_line (context,                                 \
+                              m4_get_current_line (context) + 1);      \
+         advance_line = false;                                         \
+       }                                                               \
+      character = getc (file);                                         \
+      if (character == '\n')                                           \
+       advance_line = true;                                            \
+    }                                                                  \
+  while (0)
 
 #define GET_NUMBER(Number, AllowNeg)                           \
   do                                                           \
@@ -404,12 +430,14 @@ reload_frozen_state (m4 *context, const char *name)
     {                                                          \
       size_t len = (StrLen);                                   \
       char *p;                                                 \
+      int ch;                                                  \
       CHECK_ALLOCATION ((Buf), (BufSize), len);                        \
       p = (Buf);                                               \
       while (len-- > 0)                                                \
        {                                                       \
-         int ch = (version > 1 ? decode_char (File)            \
-                   : getc (File));                             \
+         ch = (version > 1                                     \
+               ? decode_char (context, File, &advance_line)    \
+               : getc (File));                                 \
          if (ch == EOF)                                        \
            m4_error (context, EXIT_FAILURE, 0, NULL,           \
                      _("premature end of frozen file"));       \
@@ -452,12 +480,19 @@ reload_frozen_state (m4 *context, const char *name)
            GET_CHARACTER;                                      \
          VALIDATE ('\n');                                      \
        }                                                       \
+      else if (character == '\\')                              \
+       {                                                       \
+         GET_CHARACTER;                                        \
+         VALIDATE ('\n');                                      \
+         continue;                                             \
+       }                                                       \
     }                                                          \
   while (character == '\n')
 
   file = m4_path_search (context, name, (char **)NULL);
   if (file == NULL)
     m4_error (context, EXIT_FAILURE, errno, NULL, _("cannot open `%s'"), name);
+  m4_set_current_file (context, name);
 
   allocated[0] = 100;
   string[0] = xcharalloc (allocated[0]);
@@ -773,7 +808,11 @@ ill-formed frozen file, version 2 directive `%c' 
encountered"), 'T');
   free (string[0]);
   free (string[1]);
   free (string[2]);
-  fclose (file);
+  if (ferror (file) || fclose (file) != 0)
+    m4_error (context, EXIT_FAILURE, errno, NULL,
+             _("unable to read frozen state"));
+  m4_set_current_file (context, NULL);
+  m4_set_current_line (context, 0);
 
 #undef GET_STRING
 #undef GET_CHARACTER
diff --git a/tests/freeze.at b/tests/freeze.at
index 56933b7..e43af6c 100644
--- a/tests/freeze.at
+++ b/tests/freeze.at
@@ -138,6 +138,15 @@ bar${1}
 [[m4:input.m4:5: Warning: popdef: undefined macro `my_define'
 ]])
 
+dnl Test rejection of v2 features in a v1 frozen file
+AT_DATA([bogus.m4f], [[V1
+M2
+m4
+]])
+AT_CHECK_M4([-R bogus.m4f], [1], [],
+[[m4:bogus.m4f:2: ill-formed frozen file, version 2 directive `M' encountered
+]])
+
 AT_CLEANUP
 
 
@@ -167,6 +176,10 @@ builtinbuiltingnu
 # introduced 2007-05-28 and fixed 2007-05-31.
 D-1,5
 12345
+# Check line continuations.
+D1,3
+a\n\
+b
 # Zero can be implied
 D,
 
@@ -193,13 +206,15 @@ AT_CHECK_M4([-R frozen.m4f input.m4], [0],
 
 bar
        '7 \
-]])
+a
+b]])
 
 dnl We don't support anything larger than format 2; make sure of that...
-AT_DATA([bogus.m4f], [[V3
+AT_DATA([bogus.m4f], [[# comments aren't continued\
+V3
 ]])
 AT_CHECK_M4([-R bogus.m4f], [63], [],
-[[m4: frozen file version 3 greater than max supported of 2
+[[m4:bogus.m4f:2: frozen file version 3 greater than max supported of 2
 ]])
 
 dnl Check that V appears.
@@ -207,7 +222,7 @@ AT_DATA([bogus.m4f], [[# not really a frozen file
 oops
 ]])
 AT_CHECK_M4([-R bogus.m4f], [1], [],
-[[m4: expecting character `V' in frozen file
+[[m4:bogus.m4f:2: expecting character `V' in frozen file
 ]])
 
 dnl M4_DIVNUM_TEST(number, [out-of-bounds])
@@ -220,8 +235,12 @@ M2
 m4
 M3
 gnu
+T1,5
+a\n\n\n\n\n
 F6,6,2
-divnumdivnumm4
+divnum\
+divnumm4\
+
 F6,6,2
 divertdivertm4
 F6,6,2
@@ -231,16 +250,16 @@ hi
 
 ]])
 AT_CHECK_M4([-R frozen.m4f in.m4], m4_ifval([$2], [1], [0]),
-m4_ifval([$2], [], [[$1
-]m4_if(m4_substr([$1], [0], [1]), [-], [], [[hi
-]])]), m4_ifval([$2], [[m4: integer overflow in frozen file
+m4_ifval([$2], [], [m4_bpatsubst([$1], [^0*])
+m4_if(m4_substr([$1], [0], [1]), [-], [], [[hi
+]])]), m4_ifval([$2], [[m4:frozen.m4f:16: integer overflow in frozen file
 ]]))
 ])
 
 AT_DATA([in.m4], [[define(d,divnum)divert(0)d
 ]])
-M4_DIVNUM_TEST([2147483647])
-M4_DIVNUM_TEST([2147483648], [:])
+M4_DIVNUM_TEST([02147483647])
+M4_DIVNUM_TEST([02147483648], [:])
 M4_DIVNUM_TEST([-2147483648])
 M4_DIVNUM_TEST([-2147483649], [:])
 
-- 
1.5.5.1


>From 9f8edd91d4dcab4e1735f02253c32a55c62f2aac Mon Sep 17 00:00:00 2001
From: Eric Blake <address@hidden>
Date: Tue, 13 May 2008 09:30:04 -0600
Subject: [PATCH] Fix frozen file regression in diversions from 2007-01-21.

* m4/output.c (insert_diversion_helper): Add parameter.
(m4_insert_file): Move contents...
(insert_file): ...to this new helper, with added parameter.
(m4_insert_diversion, m4_undivert_all, m4_freeze_diversions):
Update callers.
* src/freeze.c (produce_mem_dump): Simplify.
(produce_syntax_dump, produce_module_dump): Add parameter.
* tests/freeze.at (large diversion): Test for this.

Signed-off-by: Eric Blake <address@hidden>
---
 ChangeLog       |   10 ++++++++++
 m4/output.c     |   51 ++++++++++++++++++++++++++++++++++-----------------
 src/freeze.c    |   32 ++++----------------------------
 tests/freeze.at |    3 ++-
 4 files changed, 50 insertions(+), 46 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index 4bee882..18e5d0a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,15 @@
 2008-05-13  Eric Blake  <address@hidden>
 
+       Fix frozen file regression in diversions from 2007-01-21.
+       * m4/output.c (insert_diversion_helper): Add parameter.
+       (m4_insert_file): Move contents...
+       (insert_file): ...to this new helper, with added parameter.
+       (m4_insert_diversion, m4_undivert_all, m4_freeze_diversions):
+       Update callers.
+       * src/freeze.c (produce_mem_dump): Simplify.
+       (produce_syntax_dump, produce_module_dump): Add parameter.
+       * tests/freeze.at (large diversion): Test for this.
+
        Improve error message when frozen file is invalid.
        * src/freeze.c (decode_char): Add parameter.  Allow \<newline>
        line continuations.
diff --git a/m4/output.c b/m4/output.c
index c903d99..d3c5507 100644
--- a/m4/output.c
+++ b/m4/output.c
@@ -27,6 +27,7 @@
 #include "exitfail.h"
 #include "gl_avltree_oset.h"
 #include "intprops.h"
+#include "quotearg.h"
 #include "xvasprintf.h"
 
 /* Define this to see runtime debug output.  Implied by DEBUG.  */
@@ -740,19 +741,16 @@ m4_make_diversion (m4 *context, int divnum)
 }
 
 /* Insert a FILE into the current output file, in the same manner
-   diversions are handled.  This allows files to be included, without
-   having them rescanned by m4.  */
-void
-m4_insert_file (m4 *context, FILE *file)
+   diversions are handled.  If ESCAPED, ensure the output is all
+   ASCII.  */
+static void
+insert_file (m4 *context, FILE *file, bool escaped)
 {
   char buffer[COPY_BUFFER_SIZE];
   size_t length;
+  char *str = buffer;
 
-  /* Optimize out inserting into a sink.  */
-
-  if (!output_diversion)
-    return;
-
+  assert (output_diversion);
   /* Insert output by big chunks.  */
   for (;;)
     {
@@ -762,16 +760,29 @@ m4_insert_file (m4 *context, FILE *file)
                  _("reading inserted file"));
       if (length == 0)
        break;
-      m4_output_text (context, buffer, length);
+      if (escaped)
+       str = quotearg_style_mem (escape_quoting_style, buffer, length);
+      m4_output_text (context, str, escaped ? strlen (str) : length);
     }
 }
 
+/* Insert a FILE into the current output file, in the same manner
+   diversions are handled.  This allows files to be included, without
+   having them rescanned by m4.  */
+void
+m4_insert_file (m4 *context, FILE *file)
+{
+  /* Optimize out inserting into a sink.  */
+  if (output_diversion)
+    insert_file (context, file, false);
+}
+
 /* Insert DIVERSION living at NODE into the current output file.  The
    diversion is NOT placed on the expansion obstack, because it must
-   not be rescanned.  When the file is closed, it is deleted by the
-   system.  */
+   not be rescanned.  If ESCAPED, ensure the output is ASCII.  When
+   the file is closed, it is deleted by the system.  */
 static void
-insert_diversion_helper (m4 *context, m4_diversion *diversion)
+insert_diversion_helper (m4 *context, m4_diversion *diversion, bool escaped)
 {
   assert (diversion->divnum > 0
          && diversion->divnum != m4_get_current_diversion (context));
@@ -779,7 +790,13 @@ insert_diversion_helper (m4 *context, m4_diversion 
*diversion)
   if (output_diversion)
     {
       if (diversion->size)
-       m4_output_text (context, diversion->u.buffer, diversion->used);
+       {
+         char *str = diversion->u.buffer;
+         size_t len = diversion->used;
+         if (escaped)
+           str = quotearg_style_mem (escape_quoting_style, str, len);
+         m4_output_text (context, str, escaped ? strlen (str) : len);
+       }
       else
        {
          assert (diversion->used);
@@ -836,7 +853,7 @@ m4_insert_diversion (m4 *context, int divnum)
     {
       m4_diversion *diversion = (m4_diversion *) elt;
       if (diversion->divnum == divnum)
-       insert_diversion_helper (context, diversion);
+       insert_diversion_helper (context, diversion, false);
     }
 }
 
@@ -852,7 +869,7 @@ m4_undivert_all (m4 *context)
     {
       m4_diversion *diversion = (m4_diversion *) elt;
       if (diversion->divnum != divnum)
-       insert_diversion_helper (context, diversion);
+       insert_diversion_helper (context, diversion, false);
     }
   gl_oset_iterator_free (&iter);
 }
@@ -902,7 +919,7 @@ m4_freeze_diversions (m4 *context, FILE *file)
                        (unsigned long int) file_stat.st_size);
            }
 
-         insert_diversion_helper (context, diversion);
+         insert_diversion_helper (context, diversion, true);
          putc ('\n', file);
 
          last_inserted = diversion->divnum;
diff --git a/src/freeze.c b/src/freeze.c
index 897a246..0b48ac6 100644
--- a/src/freeze.c
+++ b/src/freeze.c
@@ -25,6 +25,7 @@
 #include "m4.h"
 
 #include "binary-io.h"
+#include "quotearg.h"
 
 static void  produce_mem_dump          (FILE *, const char *, size_t);
 static void  produce_resyntax_dump     (m4 *, FILE *);
@@ -42,34 +43,9 @@ static       int   decode_char               (m4 *, FILE *, 
bool *);
 static void
 produce_mem_dump (FILE *file, const char *mem, size_t len)
 {
-  while (len--)
-    {
-      int ch = to_uchar (*mem++);
-      switch (ch)
-       {
-       case '\a': putc ('\\', file); putc ('a', file); break;
-       case '\b': putc ('\\', file); putc ('b', file); break;
-       case '\f': putc ('\\', file); putc ('f', file); break;
-       case '\n': putc ('\\', file); putc ('n', file); break;
-       case '\r': putc ('\\', file); putc ('r', file); break;
-       case '\t': putc ('\\', file); putc ('t', file); break;
-       case '\v': putc ('\\', file); putc ('v', file); break;
-       case '\\': putc ('\\', file); putc ('\\', file); break;
-       default:
-         if (ch >= 0x7f || ch < 0x20)
-           {
-             int digit = ch / 16;
-             ch %= 16;
-             digit += digit > 9 ? 'a' - 10 : '0';
-             ch += ch > 9 ? 'a' - 10 : '0';
-             putc ('\\', file);
-             putc ('x', file);
-             putc (digit, file);
-           }
-         putc (ch, file);
-         break;
-       }
-    }
+  char *quoted = quotearg_style_mem (escape_quoting_style, mem, len);
+  /* Any errors will be detected by ferror later.  */
+  fwrite (quoted, strlen (quoted), 1, file);
 }
 
 
diff --git a/tests/freeze.at b/tests/freeze.at
index e43af6c..74dc3e9 100644
--- a/tests/freeze.at
+++ b/tests/freeze.at
@@ -27,9 +27,10 @@ AT_SETUP([large diversion])
 AT_KEYWORDS([frozen])
 
 # Check that large diversions are handled across freeze boundaries.
-
+# Also check for escape character handling.
 AT_DATA([[frozen.m4]], [M4_ONE_MEG_DEFN[divert(2)f
 divert(1)hi
+a\nb
 ]])
 
 AT_DATA([[unfrozen.m4]],
-- 
1.5.5.1








reply via email to

[Prev in Thread] Current Thread [Next in Thread]