[SCM] GNU M4 source repository branch, master, updated. cvs-readonly-198

From:	Eric Blake
Subject:	[SCM] GNU M4 source repository branch, master, updated. cvs-readonly-198-g047d480
Date:	Tue, 17 Feb 2009 13:27:55 +0000
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU M4 source repository".

http://git.sv.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=047d480cdc9ff71e4e3228017ca24a83737cbf1f

The branch, master has been updated
       via  047d480cdc9ff71e4e3228017ca24a83737cbf1f (commit)
       via  0e14ae3e78f06cefeabb61ca23ddbdf00afc2a00 (commit)
       via  267f56e7699f2e506cc977fc4c96b4dea6626fd4 (commit)
       via  e5632a42071a39b1e6988533aeb2aeab16188b85 (commit)
       via  1e2cb352077020f928c9e6c700880276ea79d729 (commit)
      from  a2cdd6be73989df7e62caa8bfc55327fee3c9fac (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
commit 047d480cdc9ff71e4e3228017ca24a83737cbf1f
Author: Eric Blake <address@hidden>
Date:   Mon Feb 16 08:52:48 2009 -0700

    Stage 29b: Process quotes and comments by buffer, not bytes.
    
    * ltdl/m4/gnulib-cache.m4: Import memchr2 module.
    * m4/input.c (m4__next_token): Add buffer reads to quote and
    comment parsing.
    * NEWS: Document this.
    
    Signed-off-by: Eric Blake <address@hidden>

commit 0e14ae3e78f06cefeabb61ca23ddbdf00afc2a00
Author: Eric Blake <address@hidden>
Date:   Fri Feb 13 07:10:36 2009 -0700

    Stage 29a: Process dnl and macro names by buffer, not bytes.
    
    * ltdl/m4/gnulib-cache.m4: Import freadptr and freadseek modules.
    * m4/input.c (struct input_funcs): Add virtual functions
    buffer_func and consume_func.
    (file_buffer, file_consume, string_buffer, string_consume)
    (composite_buffer, composite_consume, eof_buffer): Implement
    them.
    (file_funcs, string_funcs, composite_funcs, eof_funcs): Update
    vtables accordingly.
    (buffer_retry): New sentinel.
    (next_buffer, consume_buffer): New functions.
    (m4_skip_line, match_input, consume_syntax): Use them for faster
    parsing.
    Suggested by Bruno Haible.
    
    Signed-off-by: Eric Blake <address@hidden>

commit 267f56e7699f2e506cc977fc4c96b4dea6626fd4
Author: Eric Blake <address@hidden>
Date:   Mon Feb 16 07:02:03 2009 -0700

    Unify single and multi-character delimiter handling.
    
    * m4/input.c (MATCH): Add a parameter.
    (m4__next_token): Simplify logic and reduce redundancy.
    (m4__next_token_is_open): Adjust caller.
    * m4/syntax.c (m4_set_comment, m4_set_quotes): Handle delimiters
    of differing lengths.
    (m4_set_syntax): Recognize restoration of single delimiters.
    
    Signed-off-by: Eric Blake <address@hidden>

commit e5632a42071a39b1e6988533aeb2aeab16188b85
Author: Eric Blake <address@hidden>
Date:   Sat Feb 14 10:14:34 2009 -0700

    Revamp changesyntax vs. changequote interactions.
    
    * m4/m4module.h (M4_SYNTAX_VALUE): Delete unused macro.
    (M4_SYNTAX_SUSPECT): New macro.
    * m4/m4private.h (struct m4_syntax_table): Add suspect field.
    * m4/syntax.c (check_is_single_quotes, check_is_single_comments)
    (check_is_macro_escaped): Delete, by inlining body...
    (m4_set_syntax): ...into here.  Improves handling between
    changesyntax and changequote/changecom.
    (add_syntax_set, subtract_syntax_set, set_syntax_set): Simplify,
    and let suspect field track needed cleanup.
    (m4_set_quotes, m4_set_comment): Adjust meaning of
    is_single_quotes and is_single_comment flags to always be true if
    only one delimiter exists, regardless of its length.  Ensure that
    the syntax categories M4_SYNTAX_LQUOTE and M4_SYNTAX_BCOMM are
    only used on 1-byte delimiters.
    (add_syntax_attribute, remove_syntax_attribute): Change signature
    to allow the use of fewer casts.  Adjust the suspect field when
    necessary.
    (m4_reset_syntax, set_quote_age): Adjust callers.
    * m4/input.c (m4__next_token, m4__next_token_is_open): Simplify
    callers.
    * doc/m4.texinfo (Changesyntax): Update documentation and tests.
    
    Signed-off-by: Eric Blake <address@hidden>

commit 1e2cb352077020f928c9e6c700880276ea79d729
Author: Eric Blake <address@hidden>
Date:   Sat Feb 14 06:58:08 2009 -0700

    Improve changesyntax documentation.
    
    * doc/m4.texinfo (Changesyntax): Merge two tables into one
    multitable.
    
    Signed-off-by: Eric Blake <address@hidden>

-----------------------------------------------------------------------
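
The Stage 29a entry above adds buffer and consume hooks to the input vtable so that clients can scan a whole lookahead buffer instead of calling next_char for every byte. A minimal sketch of that idea in C, assuming a simplified string-only input block. The names sketch_input, sketch_buffer, sketch_consume, and sketch_skip_line are hypothetical and do not reproduce the real m4/input.c code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified input block: a string with a read cursor.  */
struct sketch_input
{
  const char *text;   /* remaining unread bytes */
  size_t len;         /* number of unread bytes */
};

/* Expose the unread bytes without consuming them.  Returns NULL and
   sets *LEN to 0 when the block is exhausted.  */
static const char *
sketch_buffer (struct sketch_input *in, size_t *len)
{
  *len = in->len;
  return in->len ? in->text : NULL;
}

/* Discard the first LEN bytes previously exposed by sketch_buffer.  */
static void
sketch_consume (struct sketch_input *in, size_t len)
{
  assert (len <= in->len);
  in->text += len;
  in->len -= len;
}

/* Skip through the next newline (as dnl does) with one memchr call
   instead of a per-byte loop.  Returns the number of bytes skipped.  */
static size_t
sketch_skip_line (struct sketch_input *in)
{
  size_t len;
  const char *buf = sketch_buffer (in, &len);
  const char *nl;
  size_t skipped;

  if (!buf)
    return 0;
  nl = memchr (buf, '\n', len);
  skipped = nl ? (size_t) (nl - buf) + 1 : len;
  sketch_consume (in, skipped);
  return skipped;
}
```

In this sketch a caller that finds no newline consumes the whole buffer and would then ask the next input block for more. The real engine handles that through its own sentinel and retry logic, which is not modeled here.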

Summary of changes:
 ChangeLog               |   66 ++++++
 NEWS                    |   13 +-
 doc/m4.texinfo          |  349 ++++++++++++++--------------
 ltdl/m4/gnulib-cache.m4 |    5 +-
 m4/input.c              |  593 ++++++++++++++++++++++++++++++++++-------------
 m4/m4module.h           |    6 +-
 m4/m4private.h          |   12 +-
 m4/syntax.c             |  445 +++++++++++++++++------------------
 8 files changed, 905 insertions(+), 584 deletions(-)
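
Stage 29b searches for quote and comment delimiters by buffer, and imports the gnulib memchr2 module for finding either of two bytes. A rough sketch of how such a two-byte search can locate the end of a nested quoted string, assuming single-byte delimiters. The helpers sketch_memchr2 and sketch_quote_end are hypothetical stand-ins written for this illustration, not the code in m4__next_token:

```c
#include <assert.h>
#include <stddef.h>

/* Portable stand-in for a two-byte search: return the first position in
   the N bytes at S holding C1 or C2, or NULL if neither occurs.  */
static const char *
sketch_memchr2 (const char *s, char c1, char c2, size_t n)
{
  size_t i;
  for (i = 0; i < n; i++)
    if (s[i] == c1 || s[i] == c2)
      return s + i;
  return NULL;
}

/* BUF holds LEN bytes that follow an opening LQUOTE.  Return the offset
   just past the matching RQUOTE, honoring nesting, or 0 if the quoted
   string is not terminated within BUF.  Assumes LQUOTE != RQUOTE.  */
static size_t
sketch_quote_end (const char *buf, size_t len, char lquote, char rquote)
{
  size_t level = 1;
  size_t pos = 0;

  assert (lquote != rquote);
  while (pos < len)
    {
      const char *hit = sketch_memchr2 (buf + pos, lquote, rquote, len - pos);
      if (!hit)
        return 0;
      pos = (size_t) (hit - buf) + 1;
      if (*hit == lquote)
        level++;                /* nested open quote */
      else if (--level == 0)
        return pos;             /* matching close quote found */
    }
  return 0;
}
```

Each loop iteration jumps straight to the next delimiter, so the bytes between delimiters are never inspected one function call at a time, which is the kind of saving the log message attributes to fewer function calls.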

diff --git a/ChangeLog b/ChangeLog
index 90957fd..ad5e8a4 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,5 +1,71 @@
+2009-02-17  Eric Blake  <address@hidden>
+
+       Stage 29b: Process quotes and comments by buffer, not bytes.
+       Search for quote and comment delimiters by buffer when possible.
+       Memory impact: none.
+       Speed impact: noticeable improvement, from fewer function calls.
+       * ltdl/m4/gnulib-cache.m4: Import memchr2 module.
+       * m4/input.c (m4__next_token): Add buffer reads to quote and
+       comment parsing.
+       * NEWS: Document this.
+
 2009-02-16  Eric Blake  <address@hidden>
 
+       Stage 29a: Process dnl and macro names by buffer, not bytes.
+       Enhance input engine to provide lookahead buffer, rather than
+       forcing clients to call next_char for every byte.  Utilize this
+       for the simplest clients.
+       Memory impact: none.
+       Speed impact: noticeable improvement, from fewer function calls.
+       * ltdl/m4/gnulib-cache.m4: Import freadptr and freadseek modules.
+       * m4/input.c (struct input_funcs): Add virtual functions
+       buffer_func and consume_func.
+       (file_buffer, file_consume, string_buffer, string_consume)
+       (composite_buffer, composite_consume, eof_buffer): Implement
+       them.
+       (file_funcs, string_funcs, composite_funcs, eof_funcs): Update
+       vtables accordingly.
+       (buffer_retry): New sentinel.
+       (next_buffer, consume_buffer): New functions.
+       (m4_skip_line, match_input, consume_syntax): Use them for faster
+       parsing.
+       Suggested by Bruno Haible.
+
+       Unify single and multi-character delimiter handling.
+       * m4/input.c (MATCH): Add a parameter.
+       (m4__next_token): Simplify logic and reduce redundancy.
+       (m4__next_token_is_open): Adjust caller.
+       * m4/syntax.c (m4_set_comment, m4_set_quotes): Handle delimiters
+       of differing lengths.
+       (m4_set_syntax): Recognize restoration of single delimiters.
+
+       Revamp changesyntax vs. changequote interactions.
+       * m4/m4module.h (M4_SYNTAX_VALUE): Delete unused macro.
+       (M4_SYNTAX_SUSPECT): New macro.
+       * m4/m4private.h (struct m4_syntax_table): Add suspect field.
+       * m4/syntax.c (check_is_single_quotes, check_is_single_comments)
+       (check_is_macro_escaped): Delete, by inlining body...
+       (m4_set_syntax): ...into here.  Improves handling between
+       changesyntax and changequote/changecom.
+       (add_syntax_set, subtract_syntax_set, set_syntax_set): Simplify,
+       and let suspect field track needed cleanup.
+       (m4_set_quotes, m4_set_comment): Adjust meaning of
+       is_single_quotes and is_single_comment flags to always be true if
+       only one delimiter exists, regardless of its length.  Ensure that
+       the syntax categories M4_SYNTAX_LQUOTE and M4_SYNTAX_BCOMM are
+       only used on 1-byte delimiters.
+       (add_syntax_attribute, remove_syntax_attribute): Change signature
+       to allow the use of fewer casts.  Adjust the suspect field when
+       necessary.
+       (m4_reset_syntax, set_quote_age): Adjust callers.
+       * m4/input.c (m4__next_token, m4__next_token_is_open): Simplify
+       callers.
+       * doc/m4.texinfo (Changesyntax): Update documentation and tests.
+
+       Improve changesyntax documentation.
+       * doc/m4.texinfo (Changesyntax): Merge two tables into one
+       multitable.
+
        Fix regression in multicharacter quotes, from 2008-01-26.
        * m4/input.c (m4__next_token): Fix typo.
        * tests/builtins.at (changequote): Enhance test.
diff --git a/NEWS b/NEWS
index 772216d..1f25484 100644
--- a/NEWS
+++ b/NEWS
@@ -42,11 +42,6 @@ promoted to 2.0.
 *** The `-L'/`--nesting-limit' command-line option now performs argument
     validation and accepts an optional multiplier suffix.
 
-*** The `-o'/`--error-output' command-line options, which were replaced by
-    `--debugfile' in M4 1.4.7, now issue a deprecation warning.  This
-    warning interferes with all versions of Autoconf prior to 2.61, so plan
-    on installing an updated Autoconf when installing M4 2.0.
-
 *** New `-p'/`--pushdef' and `--popdef' command-line options allow more
     control over macro definitions from the command line between input
     files.
@@ -217,6 +212,14 @@ promoted to 2.0.
 ** Remove the undocumented command-line option '-N', as no one complained
    about the assertion failure regression that it introduced in 1.4.7.
 
+** The `-o'/`--error-output' command-line options, which were replaced by
+   `--debugfile' in 1.4.7, now issue a deprecation warning.  This warning
+   harmlessly triggers with versions of Autoconf 2.60 and earlier, but can
+   be silenced by applying this patch:
+     http://git.sv.gnu.org/gitweb/?p=autoconf.git;a=commitdiff;h=714eeee87
+
+** Improve the speed of the input engine.
+
 ** Fix the `m4wrap' builtin to accumulate wrapped text in FIFO order, as
    required by POSIX.  The manual mentions a way to restore the LIFO order
    present in earlier GNU M4 versions.  NOTE: this change exposes a bug
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index e574bd5..5c09838 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -5401,71 +5401,125 @@ Each token is parsed according to certain rules.  For example, a macro
 name starts with a letter or @samp{_} and consists of the longest
 possible string of letters, @samp{_} and digits.  But who is to decide
 what characters are letters, digits, quotes, white space?  Earlier the
-operating system decided, now you do.
+operating system decided, now you do.  The builtin macro
+@code{changesyntax} is used to change the way @code{m4} parses the input
+stream into tokens.
 
-Input characters belong to different categories:
+@deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
+Each @var{syntax-spec} is a two-part string.  The first part is a
+command, consisting of a single character describing a syntax category,
+and an optional one-character action.  The action can be @samp{-} to
+remove the listed characters from that category and reassign them to the
+`Other' category, @samp{=} to set the category to the listed characters
+and reassign all other characters previously in that category to
+`Other', or @samp{+} to add the listed characters to the category
+without affecting other characters.  If an action is not specified, but
+additional characters are present, then @samp{=} is assumed.
 
-@table @dfn
-@item Letters
-Characters that start a macro name.  Defaults to the letters as defined
-by the locale, and the character @samp{_}.
+The remaining characters of each @var{syntax-spec} form the set of
+characters to perform the action on for that syntax category.  Character
+ranges are expanded as for @code{translit} (@pxref{Translit}).  To start
+the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
+be specified.
+
+If @var{syntax-spec} is just a category, and no action or characters
+were specified, then all characters in that category are reset to their
+default state.  A warning is issued if the category character is not
+valid.  If @var{syntax-spec} is the empty string, then all categories
+are reset to their default state.
 
-@item Digits
-Characters that, together with the letters, form the remainder of a
+Syntax categories are divided into basic and context.  Every input
+byte belongs to exactly one basic syntax category.  Additionally, any
+byte can be assigned to a context category regardless of its current
+basic category.  Context categories exist because a character can
+behave differently when parsed in isolation than when it occurs in
+context to close out a token started by another basic category (for
+example, @kbd{newline} defaults to the basic category `Whitespace' as
+well as the context category `End comment').
+
+The following table describes the case-insensitive designation for each
+syntax category (the first byte in @var{syntax-spec}), and a description
+of what each category controls.
+
+@multitable @columnfractions .06 .20 .13 .55
+@headitem Code @tab Category @tab Type @tab Description
+
+@item @kbd{W} @tab @dfn{Words} @tab Basic
+@tab Characters that can start a macro name.  Defaults to the letters as
+defined by the locale, and the character @samp{_}.
+
+@item @kbd{D} @tab @dfn{Digits} @tab Basic
+@tab Characters that, together with the letters, form the remainder of a
 macro name.  Defaults to the ten digits @samp{0}@dots{}@samp{9}, and any
 other digits defined by the locale.
 
-@item White space
-Characters that should be trimmed from the beginning of each argument to
+@item @kbd{S} @tab @dfn{White space} @tab Basic
+@tab Characters that should be trimmed from the beginning of each argument to
 a macro call.  The defaults are space, tab, newline, carriage return,
 form feed, and vertical tab, and any others as defined by the locale.
 
-@item Open parenthesis
-Characters that open the argument list of a macro call.  The default is
+@item @kbd{(} @tab @dfn{Open parenthesis} @tab Basic
+@tab Characters that open the argument list of a macro call.  The default is
 the single character @samp{(}.
 
-@item Close parenthesis
-Characters that close the argument list of a macro call.  The default
+@item @kbd{)} @tab @dfn{Close parenthesis} @tab Basic
+@tab Characters that close the argument list of a macro call.  The default
 is the single character @samp{)}.
 
-@item Argument separator
-Characters that separate the arguments of a macro call.  The default is
+@item @kbd{,} @tab @dfn{Argument separator} @tab Basic
+@tab Characters that separate the arguments of a macro call.  The default is
 the single character @samp{,}.
 
-@item Dollar
-Characters that can introduce an argument reference in the body of a
+@item @kbd{L} @tab @dfn{Left quote} @tab Basic
+@tab The set of characters that can start a single-character quoted string.
+The default is the single character @samp{`}.  For multiple-character
+quote delimiters, use @code{changequote} (@pxref{Changequote}).
+
+@item @kbd{R} @tab @dfn{Right quote} @tab Context
+@tab The set of characters that can end a single-character quoted string.
+The default is the single character @samp{'}.  For multiple-character
+quote delimiters, use @code{changequote} (@pxref{Changequote}).  Note
+that @samp{'} also defaults to the syntax category `Other', when it
+appears in isolation.
+
+@item @kbd{B} @tab @dfn{Begin comment} @tab Basic
+@tab The set of characters that can start a single-character comment.  The
+default is the single character @samp{#}.  For multiple-character
+comment delimiters, use @code{changecom} (@pxref{Changecom}).
+
+@item @kbd{E} @tab @dfn{End comment} @tab Context
+@tab The set of characters that can end a single-character comment.  The
+default is the single character @kbd{newline}.  For multiple-character
+comment delimiters, use @code{changecom} (@pxref{Changecom}).  Note that
+newline also defaults to the syntax category `White space', when it
+appears in isolation.
+
+@comment FIXME - make ${} context, not basic
+@item @kbd{$} @tab @dfn{Dollar} @tab Basic
+@tab Characters that can introduce an argument reference in the body of a
 macro.  The default is the single character @samp{$}.
 
-@item Left brace
-Characters that introduce an extended argument reference in the body of
+@comment FIXME - implement ${10} argument parsing.
+@item @kbd{@{} @tab @dfn{Left brace} @tab Basic
+@tab Characters that introduce an extended argument reference in the body of
 a macro immediately after a character in the Dollar category.  The
 default is the single character @samp{@{}.
 
-@item Right brace
-Characters that conclude an extended argument reference in the body of a
+@item @kbd{@}} @tab @dfn{Right brace} @tab Basic
+@tab Characters that conclude an extended argument reference in the body of a
 macro.  The default is the single character @samp{@}}.
 
-@item Left quote
-The set of characters that can start a single-character quoted string.
-The default is the single character @samp{`}.  For multiple-character
-quote delimiters, use @code{changequote} (@pxref{Changequote}).
-
-@item Begin comment
-The set of characters that can start a single-character comment.  The
-default is the single character @samp{#}.  For multiple-character
-comment delimiters, use @code{changecom} (@pxref{Changecom}).
-
-@item Other
-Characters that have no special syntactical meaning to @code{m4}.
+@item @kbd{O} @tab @dfn{Other} @tab Basic
+@tab Characters that have no special syntactical meaning to @code{m4}.
 Defaults to all characters except those in the categories above.
 
-@item Active
-Characters that themselves, alone, form macro names.  This is a
+@item @kbd{A} @tab @dfn{Active} @tab Basic
+@tab Characters that themselves, alone, form macro names.  This is a
 @acronym{GNU} extension, and active characters have lower precedence
 than comments.  By default, no characters are active.
 
-@item Escape
-Characters that must precede macro names for them to be recognized.
+@item @kbd{@@} @tab @dfn{Escape} @tab Basic
+@tab Characters that must precede macro names for them to be recognized.
 This is a @acronym{GNU} extension.  When an escape character is defined,
 then macros are not recognized unless the escape character is present;
 however, the macro name, visible by @samp{$0} in macro definitions, does
@@ -5473,97 +5527,10 @@ not include the escape character.  By default, no characters are
 escapes.
 
 @comment FIXME - we should also consider supporting:
-@comment @item Ignore - characters that are ignored if they appear in
-@comment the input; perhaps defaulting to '\0', category 'I'.
-@end table
-
address@hidden
-Each character can, besides the basic syntax category, have some syntax
-attributes.  One reason these are attributes rather than categories is
-that end delimiters are never recognized except when searching for the
-end of a token triggered by a start delimiter; the end delimiter can
-have syntax properties of its own when it appears in isolation.  These
-attributes are:
-
-@table @dfn
-@item Right quote
-The set of characters that can end a single-character quoted string.
-The default is the single character @samp{'}.  For multiple-character
-quote delimiters, use @code{changequote} (@pxref{Changequote}).  Note
-that @samp{'} also defaults to the syntax category `Other', when it
-appears in isolation.
-
-@item End comment
-The set of characters that can end a single-character comment.  The
-default is the single character @kbd{newline}.  For multiple-character
-comment delimiters, use @code{changecom} (@pxref{Changecom}).  Note that
-newline also defaults to the syntax category `White space', when it
-appears in isolation.
-@end table
-
-The builtin macro @code{changesyntax} is used to change the way
-@code{m4} parses the input stream into tokens.
-
-@deffn {Builtin (gnu)} changesyntax (@var{syntax-spec}, @dots{})
-Each @var{syntax-spec} is a two-part string.  The first part is a
-command, consisting of a single character describing a syntax category,
-and an optional one-character action.  The action can be @samp{-} to
-remove the listed characters from that category and reassign them to the
-`Other' category, @samp{=} to set the category to the listed characters
-and reassign all other characters previously in that category to
-`Other', or @samp{+} to add the listed characters to the category
-without affecting other characters.  If an action is not specified, but
-additional characters are present, then @samp{=} is assumed.  The
-case-insensitive characters for the syntax categories are:
-
-@table @kbd
-@item W
-Letters
-@item D
-Digits
-@item S
-White space
-@item (
-Open parenthesis
-@item )
-Close parenthesis
-@item ,
-Argument separator
-@item $
-Dollar
-@item @{
-Left brace
-@item @}
-Right brace
-@item O
-Other
-@item @@
-Escape
-@item A
-Active
-@item L
-Left quote
-@item R
-Right quote
-@item B
-Begin comment
-@item E
-End comment
-@comment @item I
-@comment Ignore
-@end table
-
-The remaining characters of each @var{syntax-spec} form the set of
-characters to perform the action on for that syntax category.  Character
-ranges are expanded as for @code{translit} (@pxref{Translit}).  To start
-the character set with @samp{-}, @samp{+}, or @samp{=}, an action must
-be specified.
-
-If @var{syntax-spec} is just a category, and no action or characters
-were specified, then all characters in that category are reset to their
-default state.  A warning is issued if the category character is not
-valid.  If @var{syntax-spec} is the empty string, then all categories
-are reset to their default state.
+@comment @item @kbd{I} @tab @dfn{Ignore} @tab Basic
+@comment @tab Characters that are ignored if they appear in
+@comment the input; perhaps defaulting to '\0'.
+@end multitable
 
 The expansion of @code{changesyntax} is void.
 The macro @code{changesyntax} is recognized only with parameters.  Use
@@ -5572,7 +5539,9 @@ a way that no further macros can be recognized by @code{m4}.
 This macro was added in M4 2.0.
 @end deffn
 
-With @code{changesyntax} we can modify what characters form a word.
+With @code{changesyntax} we can modify what characters form a word.  For
+example, we can make @samp{.} a valid character in a macro name, or even
+start a macro name with a number.
 
 @example
 define(`test.1', `TEST ONE')
@@ -5583,18 +5552,21 @@ __file__
 @result{}stdin
 test.1
 @result{}test.1
+dnl Add `.' and remove `_'.
 changesyntax(`W+.', `W-_')
 @result{}
 __file__
 @result{}__file__
 test.1
 @result{}TEST ONE
+dnl Set words to include numbers.
 changesyntax(`W=a-zA-Z0-9_')
 @result{}
 __file__
 @result{}stdin
 test.1
 @result{}test.one
+dnl Reset words to default (a-zA-Z_).
 changesyntax(`W')
 @result{}
 __file__
@@ -5610,6 +5582,7 @@ define(`test', `$#')
 @result{}
 test(a, b, c)
 @result{}3
+dnl Change macro syntax.
 changesyntax(`(<', `,|', `)>')
 @result{}
 test(a, b, c)
@@ -5627,10 +5600,14 @@ define(`test', `$1$2$3')
 @result{}
 test(`a', `b', `c')
 @result{}abc
-changesyntax(`O 'format(`%c', `9'))
+dnl Don't ignore whitespace.
+changesyntax(`O 'format(``%c'', `9')`
+')
 @result{}
-test(a, b, c)
-@result{}a b c
+test(a, b,
+c)
+@result{}a b
+@result{}c
 @end example
 
 It is possible to redefine the @samp{$} used to indicate macro arguments
@@ -5641,6 +5618,7 @@ define(`argref', `Dollar: $#, Question: ?#')
 @result{}
 argref(1, 2, 3)
 @result{}Dollar: 3, Question: ?#
+dnl Change argument identifier.
 changesyntax(`$?', `O$')
 @result{}
 argref(1,2,3)
@@ -5654,6 +5632,7 @@ valid expansion.
 @example
 define(`escape', `$?`'1$?1?')
 @result{}
+dnl Change argument identifier.
 changesyntax(`$?')
 @result{}
 escape(foo)
@@ -5674,6 +5653,7 @@ They and the escape character are simply output.
 @example
 define(`foo', `bar')
 @result{}
+dnl Require @@ escape before any macro.
 changesyntax(`@@@@')
 @result{}
 foo
@@ -5682,6 +5662,7 @@ foo
 @result{}bar
 @@bar
 @result{}@@bar
+@@dnl Change escape character.
 @@changesyntax(`@@\', `O@@')
 @result{}
 foo
@@ -5705,43 +5686,75 @@ definition, the macro will be called.
 @example
 define(`@@', `TEST')
 @result{}
+define(`a@@a', `hello')
+@result{}
+define(`a', `A')
+@result{}
 @@
 @result{}@@
+a@@a
+@result{}A@@A
+dnl Make @@ active.
 changesyntax(`A@@')
 @result{}
 @@
 @result{}TEST
-@end example
-
-There is obviously an overlap with @code{changecom} and
-@code{changequote}.  Comment delimiters and quotes can now be defined in
-two different ways.  To avoid incompatibilities, if the quotes are set
-with @code{changequote}, all other characters marked in the syntax table
-as quotes will revert to their normal syntax categories, leaving only
-one set of defined quotes as before.  If the quotes are set with
-@code{changesyntax}, it is possible to result in multiple sets of
-quotes.  This applies to comment delimiters as well, @emph{mutatis
-mutandis}.
+a@@a
address@hidden
+@end example
+
+There is obviously an overlap between @code{changesyntax} and
+@code{changequote}, since there are now two ways to modify quote
+delimiters.  To avoid incompatibilities, if the quotes are modified by
+@code{changequote}, any characters previously set to either quote
+delimiter by @code{changesyntax} are first demoted to the other category
+(@samp{O}), so the result is only a single set of quotes.  In the other
+direction, if quotes were already disabled, or if both the start and end
+delimiter set by @code{changequote} are single bytes, then
+@code{changesyntax} preserves those settings.  But if either delimiter
+occupies multiple bytes, @code{changesyntax} first disables both
+delimiters.  Quotes can be disabled via @code{changesyntax} by emptying
+the left quote basic category (@samp{L}).  Meanwhile, the right quote
+context category (@samp{R}) will never be empty; if a
+@code{changesyntax} action would otherwise leave that category empty,
+then the default end delimiter from @code{changequote} (@samp{'}) is
+used; thus, it is never possible to get @code{m4} in a state where a
+quoted string cannot be terminated.  These interactions apply to comment
+delimiters as well, @i{mutatis mutandis} with @code{changecom}.
 
 @example
 define(`test', `TEST')
 @result{}
+dnl Add additional single-byte delimiters.
 changesyntax(`L+<', `R+>')
 @result{}
-<test>
address@hidden
-`test'
address@hidden
-[test]
address@hidden
+<test> `test' [test] <<test>>
+@result{}test test [TEST] <test>
+dnl Use standard interface, overriding changesyntax settings.
 changequote(<[>, `]')
 @result{}
-<test>
address@hidden<TEST>
-`test'
address@hidden'
-[test]
address@hidden
+<test> `test' [test] <<test>>
+@result{}<TEST> `TEST' test <<TEST>>
+dnl Introduce multi-byte delimiters.
+changequote([<<], [>>])
+@result{}
+<test> `test' [test] <<test>>
+@result{}<TEST> `TEST' [TEST] test
+dnl Change end quote, effectively disabling quotes.
+changesyntax(<<R]>>)
+@result{}
+<test> `test' [test] <<test>>
+@result{}<TEST> `TEST' [TEST] <<TEST>>
+dnl Change beginning quote, make ] normal, thus making ' end quote.
+changesyntax(L`, R-])
+@result{}
+<test> `test' [test] <<test>>
+@result{}<TEST> test [TEST] <<TEST>>
+dnl Set multi-byte quote; unrelated changes don't impact it.
+changequote(`<<', `>>')changesyntax(<<@@\>>)
+@result{}
+<\test> `\test' [\test] <<\test>>
+@result{}<TEST> `TEST' [TEST] \test
 @end example
 
 If several characters are assigned to a category that forms single
@@ -5749,35 +5762,13 @@ character tokens, all such characters are treated as equal.  Any open
 parenthesis will match any close parenthesis, etc.
 
 @example
+dnl Go crazy with symbols.
 changesyntax(`(@{<', `)@}>', `,;:', `O(,)')
 @result{}
 address@hidden; 2: 8>
 @result{}00001111
 @end example
 
-On the other hand, a multi-character start-quote sequence, which can
-only be created by @code{changequote}, will only be matched by the
-corresponding end-quote sequence.  The same goes for comment delimiters.
-
-@example
-define(`test', `==$1==')
-@result{}
-changequote(`<<', `>>')
-@result{}
-changesyntax(<<L[>>, <<R]>>)
-@result{}
-test(<<testing]>>)
address@hidden
-test([testing>>])
address@hidden>>==
-test([<<testing>>])
address@hidden
-@end example
-
address@hidden
-Note how it is possible to have both long and short quotes, if
-@code{changequote} is used before @code{changesyntax}.
-
 The syntax table is initialized to be backwards compatible, so if you
 never call @code{changesyntax}, nothing will have changed.
 
diff --git a/ltdl/m4/gnulib-cache.m4 b/ltdl/m4/gnulib-cache.m4
index fbd030e..f8436dc 100644
--- a/ltdl/m4/gnulib-cache.m4
+++ b/ltdl/m4/gnulib-cache.m4
@@ -15,7 +15,7 @@
 
 
 # Specification in the form of a command-line invocation:
-#   gnulib-tool --import --dir=. --local-dir=local --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --tests-base=tests/gnu --aux-dir=build-aux --with-tests --libtool --macro-prefix=M4 assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl-1.3 fflush filenamecat flexmember fopen fopen-safer fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memmem mkstemp obstack obstack-printf-posix progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verify verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix
+#   gnulib-tool --import --dir=. --local-dir=local --lib=libgnu --source-base=gnu --m4-base=ltdl/m4 --doc-base=doc --tests-base=tests/gnu --aux-dir=build-aux --with-tests --libtool --macro-prefix=M4 assert autobuild avltree-oset binary-io clean-temp cloexec close-stream closein config-h configmake dirname error exit fdl-1.3 fflush filenamecat flexmember fopen fopen-safer freadptr freadseek fseeko gendocs gettext git-version-gen gnumakefile gnupload gpl-3.0 intprops memchr2 memmem mkstemp obstack obstack-printf-posix progname propername quote regex regexprops-generic sprintf-posix stdbool stdlib-safer strnlen strtod strtol tempname unlocked-io vasnprintf-posix verify verror xalloc xalloc-die xmemdup0 xprintf-posix xstrndup xvasprintf-posix
 
 # Specification in the form of a few gnulib-tool.m4 macro invocations:
 gl_LOCAL_DIR([local])
@@ -39,6 +39,8 @@ gl_MODULES([
   flexmember
   fopen
   fopen-safer
+  freadptr
+  freadseek
   fseeko
   gendocs
   gettext
@@ -47,6 +49,7 @@ gl_MODULES([
   gnupload
   gpl-3.0
   intprops
+  memchr2
   memmem
   mkstemp
   obstack
diff --git a/m4/input.c b/m4/input.c
index ba2e467..0fb4101 100644
--- a/m4/input.c
+++ b/m4/input.c
@@ -24,6 +24,10 @@
 
 #include "m4private.h"
 
+#include "freadptr.h"
+#include "freadseek.h"
+#include "memchr2.h"
+
 /* Define this to see runtime debug info.  Implied by DEBUG.  */
 /*#define DEBUG_INPUT */
 
@@ -43,9 +47,11 @@
 
    Each input_block has an associated struct input_funcs, which is a
    vtable that defines polymorphic functions for peeking, reading,
-   unget, cleanup, and printing in trace output.  All input is done
-   through the function pointers of the input_funcs on the given
-   input_block, and all characters are unsigned, to distinguish
+   unget, cleanup, and printing in trace output.  Getting a single
+   character at a time is inefficient, so there are also functions for
+   accessing the readahead buffer and consuming bulk input.  All input
+   is done through the function pointers of the input_funcs on the
+   given input_block, and all characters are unsigned, to distinguish
    between stdio EOF and between special sentinel characters.  When a
    input_block is exhausted, its reader returns CHAR_RETRY which
    causes the input_block to be popped from the input_stack.
@@ -94,30 +100,41 @@
 
 typedef struct m4_input_block m4_input_block;
 
-static int     file_peek               (m4_input_block *, m4 *, bool);
-static int     file_read               (m4_input_block *, m4 *, bool, bool,
+static int             file_peek       (m4_input_block *, m4 *, bool);
+static int             file_read       (m4_input_block *, m4 *, bool, bool,
                                         bool);
-static void    file_unget              (m4_input_block *, int);
-static bool    file_clean              (m4_input_block *, m4 *, bool);
-static void    file_print              (m4_input_block *, m4 *, m4_obstack *,
+static void            file_unget      (m4_input_block *, int);
+static bool            file_clean      (m4_input_block *, m4 *, bool);
+static void            file_print      (m4_input_block *, m4 *, m4_obstack *,
                                         int);
-static int     string_peek             (m4_input_block *, m4 *, bool);
-static int     string_read             (m4_input_block *, m4 *, bool, bool,
+static const char *    file_buffer     (m4_input_block *, m4 *, size_t *,
+                                        bool);
+static void            file_consume    (m4_input_block *, m4 *, size_t);
+static int             string_peek     (m4_input_block *, m4 *, bool);
+static int             string_read     (m4_input_block *, m4 *, bool, bool,
                                         bool);
-static void    string_unget            (m4_input_block *, int);
-static void    string_print            (m4_input_block *, m4 *, m4_obstack *,
+static void            string_unget    (m4_input_block *, int);
+static void            string_print    (m4_input_block *, m4 *, m4_obstack *,
                                         int);
-static int     composite_peek          (m4_input_block *, m4 *, bool);
-static int     composite_read          (m4_input_block *, m4 *, bool, bool,
+static const char *    string_buffer   (m4_input_block *, m4 *, size_t *,
                                         bool);
-static void    composite_unget         (m4_input_block *, int);
-static bool    composite_clean         (m4_input_block *, m4 *, bool);
-static void    composite_print         (m4_input_block *, m4 *, m4_obstack *,
+static void            string_consume  (m4_input_block *, m4 *, size_t);
+static int             composite_peek  (m4_input_block *, m4 *, bool);
+static int             composite_read  (m4_input_block *, m4 *, bool, bool,
+                                        bool);
+static void            composite_unget (m4_input_block *, int);
+static bool            composite_clean (m4_input_block *, m4 *, bool);
+static void            composite_print (m4_input_block *, m4 *, m4_obstack *,
                                         int);
-static int     eof_peek                (m4_input_block *, m4 *, bool);
-static int     eof_read                (m4_input_block *, m4 *, bool, bool,
+static const char *    composite_buffer (m4_input_block *, m4 *, size_t *,
+                                         bool);
+static void            composite_consume (m4_input_block *, m4 *, size_t);
+static int             eof_peek        (m4_input_block *, m4 *, bool);
+static int             eof_read        (m4_input_block *, m4 *, bool, bool,
+                                        bool);
+static void            eof_unget       (m4_input_block *, int);
+static const char *    eof_buffer      (m4_input_block *, m4 *, size_t *,
                                         bool);
-static void    eof_unget               (m4_input_block *, int);
 
 static void    init_builtin_token      (m4 *, m4_obstack *,
                                         m4_symbol_value *);
@@ -128,6 +145,8 @@ static      int     next_char               (m4 *, bool, bool, bool);
 static int     peek_char               (m4 *, bool);
 static bool    pop_input               (m4 *, bool);
 static void    unget_input             (int);
+static const char * next_buffer        (m4 *, size_t *, bool);
+static void    consume_buffer          (m4 *, size_t);
 static bool    consume_syntax          (m4 *, m4_obstack *, unsigned int);
 
 #ifdef DEBUG_INPUT
@@ -165,6 +184,20 @@ struct input_funcs
   /* Add a representation of the input block to the obstack, for use
      in trace expansion output.  */
   void (*print_func)   (m4_input_block *, m4 *, m4_obstack *, int);
+
+  /* Return a pointer to the current readahead buffer, and set LEN to
+     the length of the result.  If ALLOW_QUOTE, do not return a buffer
+     for a quoted string.  If there is data, but the result of
+     next_char() would not fit in a char (for example, CHAR_EOF or
+     CHAR_QUOTE) or there is no readahead data available, return NULL,
+     and the caller must use next_char().  If there is no more data,
+     return buffer_retry.  The buffer is only valid until the next
+     consume_buffer() or next_char().  */
+  const char *(*buffer_func) (m4_input_block *, m4 *, size_t *, bool);
+
+  /* Optional function to consume data from a readahead buffer
+     previously obtained through buffer_func.  */
+  void (*consume_func) (m4_input_block *, m4 *, size_t);
 };
 
 /* A block of input to be scanned.  */
@@ -235,28 +268,33 @@ static bool input_change;
 
 /* Vtable for handling input from files.  */
 static struct input_funcs file_funcs = {
-  file_peek, file_read, file_unget, file_clean, file_print
+  file_peek, file_read, file_unget, file_clean, file_print, file_buffer,
+  file_consume
 };
 
 /* Vtable for handling input from strings.  */
 static struct input_funcs string_funcs = {
-  string_peek, string_read, string_unget, NULL, string_print
+  string_peek, string_read, string_unget, NULL, string_print, string_buffer,
+  string_consume
 };
 
 /* Vtable for handling input from composite chains.  */
 static struct input_funcs composite_funcs = {
   composite_peek, composite_read, composite_unget, composite_clean,
-  composite_print
+  composite_print, composite_buffer, composite_consume
 };
 
 /* Vtable for recognizing end of input.  */
 static struct input_funcs eof_funcs = {
-  eof_peek, eof_read, eof_unget, NULL, NULL
+  eof_peek, eof_read, eof_unget, NULL, NULL, eof_buffer, NULL
 };
 
 /* Marker at end of an input stack.  */
 static m4_input_block input_eof = { NULL, &eof_funcs, "", 0 };
 
+/* Marker for buffer_func when current block has no more data.  */
+static const char buffer_retry[1];
+
 
 /* Input files, from command line or [s]include.  */
 static int
@@ -354,6 +392,42 @@ file_print (m4_input_block *me, m4 *context M4_GNUC_UNUSED, m4_obstack *obs,
   obstack_1grow (obs, '>');
 }
 
+static const char *
+file_buffer (m4_input_block *me, m4 *context M4_GNUC_UNUSED, size_t *len,
+            bool allow_quote M4_GNUC_UNUSED)
+{
+  if (start_of_input_line)
+    {
+      start_of_input_line = false;
+      m4_set_current_line (context, ++me->line);
+    }
+  if (me->u.u_f.end)
+    return buffer_retry;
+  return freadptr (isp->u.u_f.fp, len);
+}
+
+static void
+file_consume (m4_input_block *me, m4 *context, size_t len)
+{
+  const char *buf;
+  const char *p;
+  size_t buf_len;
+  assert (!start_of_input_line);
+  buf = freadptr (me->u.u_f.fp, &buf_len);
+  assert (buf && len <= buf_len);
+  buf_len = 0;
+  while ((p = memchr (buf + buf_len, '\n', len - buf_len)))
+    {
+      if (p == buf + len - 1)
+       start_of_input_line = true;
+      else
+       m4_set_current_line (context, ++me->line);
+      buf_len = p - buf + 1;
+    }
+  if (freadseek (isp->u.u_f.fp, len) != 0)
+    assert (false);
+}
+
 /* m4_push_file () pushes an input file FP with name TITLE on the
   input stack, saving the current file name and line number.  If next
   is non-NULL, this push invalidates a call to m4_push_string_init (),
@@ -439,6 +513,24 @@ string_print (m4_input_block *me, m4 *context, m4_obstack *obs,
                           &arg_length);
 }
 
+static const char *
+string_buffer (m4_input_block *me, m4 *context M4_GNUC_UNUSED, size_t *len,
+              bool allow_quote M4_GNUC_UNUSED)
+{
+  if (!me->u.u_s.len)
+    return buffer_retry;
+  *len = me->u.u_s.len;
+  return me->u.u_s.str;
+}
+
+static void
+string_consume (m4_input_block *me, m4 *context M4_GNUC_UNUSED, size_t len)
+{
+  assert (len <= me->u.u_s.len);
+  me->u.u_s.len -= len;
+  me->u.u_s.str += len;
+}
+
 /* First half of m4_push_string ().  The pointer next points to the
    new input_block.  FILE and LINE describe the location where the
    macro starts that is generating the expansion (even if the location
@@ -904,6 +996,63 @@ composite_print (m4_input_block *me, m4 *context, m4_obstack *obs,
     m4_shipout_string (context, obs, quotes->str2, quotes->len2, false);
 }
 
+static const char *
+composite_buffer (m4_input_block *me, m4 *context, size_t *len,
+                 bool allow_quote)
+{
+  m4__symbol_chain *chain = me->u.u_c.chain;
+  while (chain)
+    {
+      if (allow_quote && chain->quote_age == m4__quote_age (M4SYNTAX))
+       return NULL; /* CHAR_QUOTE doesn't fit in buffer.  */
+      switch (chain->type)
+       {
+       case M4__CHAIN_STR:
+         if (chain->u.u_s.len)
+           {
+             *len = chain->u.u_s.len;
+             return chain->u.u_s.str;
+           }
+         if (chain->u.u_s.level < SIZE_MAX)
+           m4__adjust_refcount (context, chain->u.u_s.level, false);
+         break;
+       case M4__CHAIN_FUNC:
+         if (chain->u.builtin)
+           return NULL; /* CHAR_BUILTIN doesn't fit in buffer.  */
+         break;
+       case M4__CHAIN_ARGV:
+         if (chain->u.u_a.index == m4_arg_argc (chain->u.u_a.argv))
+           {
+             m4__arg_adjust_refcount (context, chain->u.u_a.argv, false);
+             break;
+           }
+         return NULL; /* No buffer to provide.  */
+       case M4__CHAIN_LOC:
+         me->file = chain->u.u_l.file;
+         me->line = chain->u.u_l.line;
+         input_change = true;
+         me->u.u_c.chain = chain->next;
+         return next_buffer (context, len, allow_quote);
+       default:
+         assert (!"composite_buffer");
+         abort ();
+       }
+      me->u.u_c.chain = chain = chain->next;
+    }
+  return buffer_retry;
+}
+
+static void
+composite_consume (m4_input_block *me, m4 *context M4_GNUC_UNUSED, size_t len)
+{
+  m4__symbol_chain *chain = me->u.u_c.chain;
+  assert (chain && chain->type == M4__CHAIN_STR && len <= chain->u.u_s.len);
+  /* Partial consumption invalidates quote age.  */
+  chain->quote_age = 0;
+  chain->u.u_s.len -= len;
+  chain->u.u_s.str += len;
+}
+
 /* Given an obstack OBS, capture any unfinished text as a link in the
    chain that starts at *START and ends at *END.  START may be NULL if
    *END is non-NULL.  */
@@ -1001,6 +1150,13 @@ eof_unget (m4_input_block *me M4_GNUC_UNUSED, int ch)
   assert (ch == CHAR_EOF);
 }
 
+static const char *
+eof_buffer (m4_input_block *me M4_GNUC_UNUSED, m4 *context M4_GNUC_UNUSED,
+           size_t *len M4_GNUC_UNUSED, bool allow_unget M4_GNUC_UNUSED)
+{
+  return NULL;
+}
+
 
 /* When tracing, print a summary of the contents of the input block
    created by push_string_init/push_string_finish to OBS.  Use
@@ -1340,6 +1496,50 @@ unget_input (int ch)
   isp->funcs->unget_func (isp, ch);
 }
 
+/* Return a pointer to the available bytes of the current input block,
+   and set *LEN to the length of the result.  If ALLOW_QUOTE, do not
+   return a buffer for a quoted string.  If the result does not fit in
+   a char (for example, CHAR_EOF or CHAR_QUOTE), or if there is no
+   readahead data available, return NULL, and the caller must fall
+   back to next_char().  The buffer is only valid until the next
+   consume_buffer() or next_char().  */
+static const char *
+next_buffer (m4 *context, size_t *len, bool allow_quote)
+{
+  const char *buf;
+  while (1)
+    {
+      assert (isp);
+      if (input_change)
+       {
+         m4_set_current_file (context, isp->file);
+         m4_set_current_line (context, isp->line);
+         input_change = false;
+       }
+
+      assert (isp->funcs->buffer_func);
+      buf = isp->funcs->buffer_func (isp, context, len, allow_quote);
+      if (buf != buffer_retry)
+       return buf;
+      /* End of input source --- pop one level.  */
+      pop_input (context, true);
+    }
+}
+
+/* Consume LEN bytes from the current input block, as though by LEN
+   calls to next_char().  LEN must be less than or equal to the
+   previous length returned by a successful call to next_buffer().  */
+static void
+consume_buffer (m4 *context, size_t len)
+{
+  assert (isp && !input_change);
+  if (len)
+    {
+      assert (isp->funcs->consume_func);
+      isp->funcs->consume_func (isp, context, len);
+    }
+}
+
 /* skip_line () simply discards all immediately following characters,
    up to the first newline.  It is only used from m4_dnl ().  Report
    errors on behalf of CALLER.  */
@@ -1348,9 +1548,28 @@ m4_skip_line (m4 *context, const m4_call_info *caller)
 {
   int ch;
 
-  while ((ch = next_char (context, false, false, false)) != CHAR_EOF
-        && ch != '\n')
-    ;
+  while (1)
+    {
+      size_t len;
+      const char *buffer = next_buffer (context, &len, false);
+      if (buffer)
+       {
+         const char *p = (char *) memchr (buffer, '\n', len);
+         if (p)
+           {
+             consume_buffer (context, p - buffer + 1);
+             ch = '\n';
+             break;
+           }
+         consume_buffer (context, len);
+       }
+      else
+       {
+         ch = next_char (context, false, false, false);
+         if (ch == CHAR_EOF || ch == '\n')
+           break;
+       }
+    }
   if (ch == CHAR_EOF)
     m4_warn (context, 0, caller, _("end of file treated as newline"));
 }
@@ -1377,16 +1596,26 @@ match_input (m4 *context, const char *s, size_t len, bool consume)
   const char *t;
   m4_obstack *st;
   bool result = false;
+  size_t buf_len;
 
   if (consume)
     {
       s++;
       len--;
     }
+  /* Try a buffer match first.  */
   assert (len);
+  t = next_buffer (context, &buf_len, false);
+  if (t && len <= buf_len && memcmp (s, t, len) == 0)
+    {
+      if (consume)
+       consume_buffer (context, len);
+      return true;
+    }
+  /* Fall back on byte matching.  */
   ch = peek_char (context, false);
   if (ch != to_uchar (*s))
-    return false;                      /* fail */
+    return false;
 
   if (len == 1)
     {
@@ -1417,9 +1646,10 @@ match_input (m4 *context, const char *s, size_t len, bool consume)
   return result;
 }
 
-/* The macro MATCH() is used to match a string S of length LEN against
-   the input.  The first character is handled inline for speed, and
-   S[LEN] must be safe to dereference (it is faster to do character
+/* Check whether the current input matches a delimiter, which either
+   belongs to syntax category CAT or matches the string S of length
+   LEN.  The first character is handled inline for speed, and S[LEN]
+   must be safe to dereference (it is faster to do character
    comparison prior to length checks).  This improves efficiency for
    the common case of single character quotes and comment delimiters,
    while being safe for disabled delimiters as well as longer
@@ -1427,9 +1657,10 @@ match_input (m4 *context, const char *s, size_t len, bool consume)
    successful match will discard the matched string.  Otherwise, CH is
    the result of peek_char, and the input stream is effectively
    unchanged.  */
-#define MATCH(C, ch, s, len, consume)                                  \
-  (to_uchar ((s)[0]) == (ch)                                           \
-   && ((len) >> 1 ? match_input (C, s, len, consume) : (len)))
+#define MATCH(C, ch, cat, s, len, consume)                             \
+  (m4_has_syntax (m4_get_syntax_table (C), ch, cat)                    \
+   || (to_uchar ((s)[0]) == (ch)                                       \
+       && ((len) >> 1 ? match_input (C, s, len, consume) : (len))))
 
 /* While the current input character has the given SYNTAX, append it
    to OBS.  Take care not to pop input source unless the next source
@@ -1443,20 +1674,37 @@ consume_syntax (m4 *context, m4_obstack *obs, unsigned int syntax)
   assert (syntax);
   while (1)
     {
-      /* It is safe to call next_char without first checking
-        peek_char, except at input source boundaries, which we detect
-        by CHAR_RETRY.  We exploit the fact that CHAR_EOF,
-        CHAR_BUILTIN, CHAR_QUOTE, and CHAR_ARGV do not satisfy any
-        syntax categories.  */
-      while ((ch = next_char (context, allow, allow, true)) != CHAR_RETRY
-            && m4_has_syntax (M4SYNTAX, ch, syntax))
+      /* Start with a buffer search.  */
+      size_t len;
+      const char *buffer = next_buffer (context, &len, allow);
+      if (buffer)
+       {
+         const char *p = buffer;
+         while (len && m4_has_syntax (M4SYNTAX, *p, syntax))
+           {
+             len--;
+             p++;
+           }
+         obstack_grow (obs, buffer, p - buffer);
+         consume_buffer (context, p - buffer);
+         if (len)
+           return false;
+       }
+      /* Fall back to byte-wise search.  It is safe to call next_char
+        without first checking peek_char, except at input source
+        boundaries, which we detect by CHAR_RETRY.  */
+      ch = next_char (context, allow, allow, true);
+      if (ch < CHAR_EOF && m4_has_syntax (M4SYNTAX, ch, syntax))
        {
-         assert (ch < CHAR_EOF);
          obstack_1grow (obs, ch);
+         continue;
        }
       if (ch == CHAR_RETRY || ch == CHAR_QUOTE || ch == CHAR_ARGV)
        {
          ch = peek_char (context, false);
+         /* We exploit the fact that CHAR_EOF, CHAR_BUILTIN,
+            CHAR_QUOTE, and CHAR_ARGV do not satisfy any syntax
+            categories.  */
          if (m4_has_syntax (M4SYNTAX, ch, syntax))
            {
              assert (ch < CHAR_EOF);
@@ -1600,58 +1848,74 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        obstack_1grow (obs_safe, ch);
        consume_syntax (context, obs_safe, M4_SYNTAX_ALPHA | M4_SYNTAX_NUM);
       }
-    else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_LQUOTE))
-      {                                        /* QUOTED STRING, SINGLE QUOTES */
+    else if (MATCH (context, ch, M4_SYNTAX_LQUOTE,
+                   context->syntax->quote.str1,
+                   context->syntax->quote.len1, true))
+      {                                        /* QUOTED STRING */
        if (obs)
          obs_safe = obs;
        quote_level = 1;
        type = M4_TOKEN_STRING;
        while (1)
          {
-           ch = next_char (context, obs && m4__quote_age (M4SYNTAX), false,
-                           false);
-           if (ch == CHAR_EOF)
+           /* Start with buffer search for either potential delimiter.  */
+           size_t len;
+           const char *buffer = next_buffer (context, &len,
+                                             obs && m4__quote_age (M4SYNTAX));
+           if (buffer)
              {
-               if (!caller)
+               const char *p = buffer;
+               if (m4_is_syntax_single_quotes (M4SYNTAX))
+                 do
+                   {
+                     p = (char *) memchr2 (p, *context->syntax->quote.str1,
+                                           *context->syntax->quote.str2,
+                                           buffer + len - p);
+                   }
+                 while (p && m4__quote_age (M4SYNTAX)
+                        && (*p++ == *context->syntax->quote.str2
+                            ? --quote_level : ++quote_level));
+               else
                  {
-                   assert (line);
-                   m4_set_current_file (context, file);
-                   m4_set_current_line (context, *line);
+                   size_t remaining = len;
+                   assert (context->syntax->quote.len1 == 1
+                           && context->syntax->quote.len2 == 1);
+                   while (remaining && !m4_has_syntax (M4SYNTAX, *p,
+                                                       (M4_SYNTAX_LQUOTE
+                                                        | M4_SYNTAX_RQUOTE)))
+                     {
+                       p++;
+                       remaining--;
+                     }
+                   if (!remaining)
+                     p = NULL;
+                 }
+               if (p)
+                 {
+                   if (m4__quote_age (M4SYNTAX))
+                     {
+                       assert (!quote_level
+                               && context->syntax->quote.len1 == 1
+                               && context->syntax->quote.len2 == 1);
+                       obstack_grow (obs_safe, buffer, p - buffer - 1);
+                       consume_buffer (context, p - buffer);
+                       break;
+                     }
+                   obstack_grow (obs_safe, buffer, p - buffer);
+                   ch = to_uchar (*p);
+                   consume_buffer (context, p - buffer + 1);
+                 }
+               else
+                 {
+                   obstack_grow (obs_safe, buffer, len);
+                   consume_buffer (context, len);
+                   continue;
                  }
-               m4_error (context, EXIT_FAILURE, 0, caller,
-                         _("end of file in string"));
-             }
-           if (ch == CHAR_BUILTIN)
-             init_builtin_token (context, obs, obs ? token : NULL);
-           else if (ch == CHAR_QUOTE)
-             append_quote_token (context, obs, token);
-           else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_RQUOTE))
-             {
-               if (--quote_level == 0)
-                 break;
-               obstack_1grow (obs_safe, ch);
-             }
-           else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_LQUOTE))
-             {
-               quote_level++;
-               obstack_1grow (obs_safe, ch);
              }
+           /* Fall back to byte-wise search.  */
            else
-             obstack_1grow (obs_safe, ch);
-         }
-      }
-    else if (!m4_is_syntax_single_quotes (M4SYNTAX)
-            && MATCH (context, ch, context->syntax->quote.str1,
-                      context->syntax->quote.len1, true))
-      {                                        /* QUOTED STRING, LONGER QUOTES */
-       if (obs)
-         obs_safe = obs;
-       quote_level = 1;
-       type = M4_TOKEN_STRING;
-       assert (!m4__quote_age (M4SYNTAX));
-       while (1)
-         {
-           ch = next_char (context, false, false, false);
+             ch = next_char (context, obs && m4__quote_age (M4SYNTAX), false,
+                             false);
            if (ch == CHAR_EOF)
              {
                if (!caller)
@@ -1665,71 +1929,87 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
              }
            if (ch == CHAR_BUILTIN)
              init_builtin_token (context, obs, obs ? token : NULL);
-           else if (MATCH (context, ch, context->syntax->quote.str2,
+           else if (ch == CHAR_QUOTE)
+             append_quote_token (context, obs, token);
+           else if (MATCH (context, ch, M4_SYNTAX_RQUOTE,
+                           context->syntax->quote.str2,
                            context->syntax->quote.len2, true))
              {
                if (--quote_level == 0)
                  break;
-               obstack_grow (obs_safe, context->syntax->quote.str2,
-                             context->syntax->quote.len2);
+               if (1 < context->syntax->quote.len2)
+                 obstack_grow (obs_safe, context->syntax->quote.str2,
+                               context->syntax->quote.len2);
+               else
+                 obstack_1grow (obs_safe, ch);
              }
-           else if (MATCH (context, ch, context->syntax->quote.str1,
+           else if (MATCH (context, ch, M4_SYNTAX_LQUOTE,
+                           context->syntax->quote.str1,
                            context->syntax->quote.len1, true))
              {
                quote_level++;
-               obstack_grow (obs_safe, context->syntax->quote.str1,
-                             context->syntax->quote.len1);
+               if (1 < context->syntax->quote.len1)
+                 obstack_grow (obs_safe, context->syntax->quote.str1,
+                               context->syntax->quote.len1);
+               else
+                 obstack_1grow (obs_safe, ch);
              }
            else
              obstack_1grow (obs_safe, ch);
          }
       }
-    else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_BCOMM))
-      {                                        /* COMMENT, SHORT DELIM */
+    else if (MATCH (context, ch, M4_SYNTAX_BCOMM,
+                   context->syntax->comm.str1,
+                   context->syntax->comm.len1, true))
+      {                                        /* COMMENT */
        if (obs && !m4_get_discard_comments_opt (context))
          obs_safe = obs;
-       obstack_1grow (obs_safe, ch);
+       if (1 < context->syntax->comm.len1)
+         obstack_grow (obs_safe, context->syntax->comm.str1,
+                       context->syntax->comm.len1);
+       else
+         obstack_1grow (obs_safe, ch);
        while (1)
          {
-           ch = next_char (context, false, false, false);
-           if (ch == CHAR_EOF)
+           /* Start with buffer search for potential end delimiter.  */
+           size_t len;
+           const char *buffer = next_buffer (context, &len, false);
+           if (buffer)
              {
-               if (!caller)
+               const char *p;
+               if (m4_is_syntax_single_comments (M4SYNTAX))
+                 p = (char *) memchr (buffer, *context->syntax->comm.str2,
+                                      len);
+               else
                  {
-                   assert (line);
-                   m4_set_current_file (context, file);
-                   m4_set_current_line (context, *line);
+                   size_t remaining = len;
+                   assert (context->syntax->comm.len2 == 1);
+                   p = buffer;
+                   while (remaining
+                          && !m4_has_syntax (M4SYNTAX, *p, M4_SYNTAX_ECOMM))
+                     {
+                       p++;
+                       remaining--;
+                     }
+                   if (!remaining)
+                     p = NULL;
+                 }
+               if (p)
+                 {
+                   obstack_grow (obs_safe, buffer, p - buffer);
+                   ch = to_uchar (*p);
+                   consume_buffer (context, p - buffer + 1);
+                 }
+               else
+                 {
+                   obstack_grow (obs_safe, buffer, len);
+                   consume_buffer (context, len);
+                   continue;
                  }
-               m4_error (context, EXIT_FAILURE, 0, caller,
-                         _("end of file in comment"));
-             }
-           if (ch == CHAR_BUILTIN)
-             {
-               init_builtin_token (context, NULL, NULL);
-               continue;
-             }
-           if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_ECOMM))
-             {
-               obstack_1grow (obs_safe, ch);
-               break;
              }
-           assert (ch < CHAR_EOF);
-           obstack_1grow (obs_safe, ch);
-         }
-       type = (m4_get_discard_comments_opt (context)
-               ? M4_TOKEN_NONE : M4_TOKEN_COMMENT);
-      }
-    else if (!m4_is_syntax_single_comments (M4SYNTAX)
-            && MATCH (context, ch, context->syntax->comm.str1,
-                      context->syntax->comm.len1, true))
-      {                                        /* COMMENT, LONGER DELIM */
-       if (obs && !m4_get_discard_comments_opt (context))
-         obs_safe = obs;
-       obstack_grow (obs_safe, context->syntax->comm.str1,
-                     context->syntax->comm.len1);
-       while (1)
-         {
-           ch = next_char (context, false, false, false);
+           /* Fall back to byte-wise search.  */
+           else
+             ch = next_char (context, false, false, false);
            if (ch == CHAR_EOF)
              {
                if (!caller)
@@ -1746,11 +2026,15 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                init_builtin_token (context, NULL, NULL);
                continue;
              }
-           if (MATCH (context, ch, context->syntax->comm.str2,
+           if (MATCH (context, ch, M4_SYNTAX_ECOMM,
+                      context->syntax->comm.str2,
                       context->syntax->comm.len2, true))
              {
-               obstack_grow (obs_safe, context->syntax->comm.str2,
-                             context->syntax->comm.len2);
+               if (1 < context->syntax->comm.len2)
+                 obstack_grow (obs_safe, context->syntax->comm.str2,
+                               context->syntax->comm.len2);
+               else
+                 obstack_1grow (obs_safe, ch);
                break;
              }
            assert (ch < CHAR_EOF);
@@ -1779,12 +2063,10 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
        obstack_1grow (&token_stack, ch);
        type = M4_TOKEN_CLOSE;
       }
-    else if (m4_is_syntax_single_quotes (M4SYNTAX)
-            && m4_is_syntax_single_comments (M4SYNTAX))
-      {                        /* EVERYTHING ELSE (SHORT QUOTES AND COMMENTS) */
+    else
+      {                                        /* EVERYTHING ELSE */
        assert (ch < CHAR_EOF);
        obstack_1grow (&token_stack, ch);
-
        if (m4_has_syntax (M4SYNTAX, ch,
                           (M4_SYNTAX_OTHER | M4_SYNTAX_NUM | M4_SYNTAX_DOLLAR
                            | M4_SYNTAX_LBRACE | M4_SYNTAX_RBRACE)))
@@ -1794,10 +2076,11 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
                obs_safe = obs;
                obstack_1grow (obs, ch);
              }
-           consume_syntax (context, obs_safe,
-                           (M4_SYNTAX_OTHER | M4_SYNTAX_NUM
-                            | M4_SYNTAX_DOLLAR | M4_SYNTAX_LBRACE
-                            | M4_SYNTAX_RBRACE));
+           if (m4__safe_quotes (M4SYNTAX))
+             consume_syntax (context, obs_safe,
+                             (M4_SYNTAX_OTHER | M4_SYNTAX_NUM
+                              | M4_SYNTAX_DOLLAR | M4_SYNTAX_LBRACE
+                              | M4_SYNTAX_RBRACE));
            type = M4_TOKEN_STRING;
          }
        else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_SPACE))
@@ -1805,34 +2088,14 @@ m4__next_token (m4 *context, m4_symbol_value *token, int *line,
            /* Coalescing newlines when interactive or when synclines
               are enabled is wrong.  */
            if (!m4_get_interactive_opt (context)
-               && !m4_get_syncoutput_opt (context))
+               && !m4_get_syncoutput_opt (context)
+               && m4__safe_quotes (M4SYNTAX))
              consume_syntax (context, &token_stack, M4_SYNTAX_SPACE);
            type = M4_TOKEN_SPACE;
          }
        else
          type = M4_TOKEN_SIMPLE;
       }
-    else               /* EVERYTHING ELSE (LONG QUOTES OR COMMENTS) */
-      {
-       assert (ch < CHAR_EOF);
-       obstack_1grow (&token_stack, ch);
-
-       if (m4_has_syntax (M4SYNTAX, ch,
-                          (M4_SYNTAX_OTHER | M4_SYNTAX_NUM | M4_SYNTAX_DOLLAR
-                           | M4_SYNTAX_LBRACE | M4_SYNTAX_RBRACE)))
-         {
-           if (obs)
-             {
-               obs_safe = obs;
-               obstack_1grow (obs, ch);
-             }
-           type = M4_TOKEN_STRING;
-         }
-       else if (m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_SPACE))
-         type = M4_TOKEN_SPACE;
-       else
-         type = M4_TOKEN_SIMPLE;
-      }
   } while (type == M4_TOKEN_NONE);
 
   if (token->type == M4_SYMBOL_VOID)
@@ -1882,12 +2145,10 @@ m4__next_token_is_open (m4 *context)
       || m4_has_syntax (M4SYNTAX, ch, (M4_SYNTAX_BCOMM | M4_SYNTAX_ESCAPE
                                       | M4_SYNTAX_ALPHA | M4_SYNTAX_LQUOTE
                                       | M4_SYNTAX_ACTIVE))
-      || (!m4_is_syntax_single_comments (M4SYNTAX)
-         && MATCH (context, ch, context->syntax->comm.str1,
-                   context->syntax->comm.len1, false))
-      || (!m4_is_syntax_single_quotes (M4SYNTAX)
-         && MATCH (context, ch, context->syntax->quote.str1,
-                   context->syntax->quote.len1, false)))
+      || (MATCH (context, ch, M4_SYNTAX_BCOMM, context->syntax->comm.str1,
+                context->syntax->comm.len1, false))
+      || (MATCH (context, ch, M4_SYNTAX_LQUOTE, context->syntax->quote.str1,
+                context->syntax->quote.len1, false)))
     return false;
   return m4_has_syntax (M4SYNTAX, ch, M4_SYNTAX_OPEN);
 }
diff --git a/m4/m4module.h b/m4/m4module.h
index 07f8c1a..c94f56a 100644
--- a/m4/m4module.h
+++ b/m4/m4module.h
@@ -484,8 +484,12 @@ enum {
   M4_SYNTAX_ECOMM              = 1 << 15
 };
 
+/* Mask of attribute syntax categories.  */
 #define M4_SYNTAX_MASKS                (M4_SYNTAX_RQUOTE | M4_SYNTAX_ECOMM)
-#define M4_SYNTAX_VALUE                (~(M4_SYNTAX_RQUOTE | M4_SYNTAX_ECOMM))
+/* Mask of basic syntax categories where any change requires a
+   recomputation of the overall syntax characteristics.  */
+#define M4_SYNTAX_SUSPECT      (M4_SYNTAX_LQUOTE | M4_SYNTAX_BCOMM     \
+                                | M4_SYNTAX_ESCAPE)
 
 #define m4_syntab(S, C)                ((S)->table[(C)])
 /* Determine if character C matches any of the bitwise-or'd syntax
diff --git a/m4/m4private.h b/m4/m4private.h
index 49fba3b..4f26979 100644
--- a/m4/m4private.h
+++ b/m4/m4private.h
@@ -1,6 +1,6 @@
 /* GNU m4 -- A simple macro processor
    Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 1998, 1999, 2004,
-   2005, 2006, 2007, 2008 Free Software Foundation, Inc.
+   2005, 2006, 2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -472,17 +472,19 @@ struct m4_syntax_table {
   m4_string_pair quote;        /* Quote delimiters.  */
   m4_string_pair comm; /* Comment delimiters.  */
 
-  /* True iff strlen(lquote) == strlen(rquote) == 1 and lquote is not
-     interfering with macro names.  */
+  /* True iff only one start and end quote delimiter exist.  */
   bool_bitfield is_single_quotes : 1;
 
-  /* True iff strlen(bcomm) == strlen(ecomm) == 1 and bcomm is not
-     interfering with macros or quotes.  */
+  /* True iff only one start and end comment delimiter exist.  */
   bool_bitfield is_single_comments : 1;
 
   /* True iff some character has M4_SYNTAX_ESCAPE.  */
   bool_bitfield is_macro_escaped : 1;
 
+  /* True iff a changesyntax call has impacted something that requires
+     cleanup at the end.  */
+  bool_bitfield suspect : 1;
+
   /* Track the number of changesyntax calls.  This saturates at
      0xffff, so the idea is that most users won't be changing the
      syntax that frequently; perhaps in the future we will cache
diff --git a/m4/syntax.c b/m4/syntax.c
index 1fb4815..0949055 100644
--- a/m4/syntax.c
+++ b/m4/syntax.c
@@ -1,6 +1,6 @@
 /* GNU m4 -- A simple macro processor
    Copyright (C) 1989, 1990, 1991, 1992, 1993, 1994, 2002, 2004, 2006,
-   2007, 2008 Free Software Foundation, Inc.
+   2007, 2008, 2009 Free Software Foundation, Inc.
 
    This file is part of GNU M4.
 
@@ -31,6 +31,7 @@
    according to a syntax table.  The character groups are (definitions
    are all in m4.h, those marked with a * are not yet in use):
 
+   Basic (all characters fall in one of these mutually exclusive bins)
    M4_SYNTAX_IGNORE    *Character to be deleted from input as if not present
    M4_SYNTAX_OTHER     Any character with no special meaning to m4
    M4_SYNTAX_SPACE     Whitespace (ignored when leading macro arguments)
@@ -46,12 +47,12 @@
    M4_SYNTAX_ALPHA     Alphabetic characters (can start macro names)
    M4_SYNTAX_NUM       Numeric characters (can form macro names)
 
-   M4_SYNTAX_LQUOTE    A single characters left quote
-   M4_SYNTAX_BCOMM     A single characters begin comment delimiter
+   M4_SYNTAX_LQUOTE    A single character left quote
+   M4_SYNTAX_BCOMM     A single character begin comment delimiter
 
-   (These are bit masks)
-   M4_SYNTAX_RQUOTE    A single characters right quote
-   M4_SYNTAX_ECOMM     A single characters end comment delimiter
+   Attribute (these are context sensitive, and exist in addition to basic)
+   M4_SYNTAX_RQUOTE    A single character right quote
+   M4_SYNTAX_ECOMM     A single character end comment delimiter
 
    Besides adding new facilities, the use of a syntax table will reduce
    the number of calls to next_token ().  Now groups of OTHER, NUM and
@@ -65,15 +66,10 @@
    "changesyntax" allows the the user to change the category of any
    character.
 
-   Default '\n' is both ECOMM and SPACE, depending on the context.  To
-   solve the problem of quotes and comments that have diffent syntax
-   code based on the context, the RQUOTE and ECOMM codes are bit
-   masks to add to an ordinary code.  If a character is made a quote it
-   will be recognised if the basis code does not have precedence.
-
-   When changing quotes and comment delimiters only the bits are
-   removed, and the characters are therefore reverted to its old
-   category code.
+   By default, '\n' is both ECOMM and SPACE, depending on the context.
+   Hence we have basic categories (mutually exclusive, can introduce a
+   context, and can be empty sets), and attribute categories
+   (additive, only recognized in context, and will never be empty).
 
    The precedence as implemented by next_token () is:
 
@@ -100,13 +96,27 @@
    a string is parsed equally whether there is a $ or not.  These characters
    are instead used during user macro expansion.
 
-   M4_SYNTAX_RQUOTE and M4_SYNTAX_ECOMM do not start tokens.  */
 
-static bool check_is_single_quotes     (m4_syntax_table *);
-static bool check_is_single_comments   (m4_syntax_table *);
-static bool check_is_macro_escaped     (m4_syntax_table *);
-static int add_syntax_attribute                (m4_syntax_table *, int, int);
-static int remove_syntax_attribute     (m4_syntax_table *, int, int);
+   M4_SYNTAX_RQUOTE and M4_SYNTAX_ECOMM do not start tokens.
+
+   There are several optimizations that can be performed depending on
+   known states of the syntax table.  For example, when searching for
+   quotes, if there is only a single start quote and end quote
+   delimiter, we can use memchr2 and search a word at a time, instead
+   of performing a table lookup a byte at a time.  The is_single_*
+   flags track whether quotes and comments have a single delimiter
+   (always the case if changequote/changecom were used, and
+   potentially the case after changesyntax).  Since we frequently need
+   to access quotes, we store the oldest valid quote outside the
+   lookup table; the suspect flag tracks whether a cleanup pass is
+   needed to restore our invariants.  On the other hand, coalescing
+   multiple M4_SYNTAX_OTHER bytes could form a delimiter, so many
+   optimizations must be disabled if a multi-byte delimiter exists;
+   this is handled by m4__safe_quotes.  Meanwhile, quotes and comments
+   can be disabled if the leading delimiter is length 0.  */
+
+static int add_syntax_attribute                (m4_syntax_table *, char, int);
+static int remove_syntax_attribute     (m4_syntax_table *, char, int);
 static void set_quote_age              (m4_syntax_table *, bool, bool);
 
 m4_syntax_table *
@@ -217,35 +227,44 @@ m4_syntax_code (char ch)
 
 /* Functions to manipulate the syntax table.  */
 static int
-add_syntax_attribute (m4_syntax_table *syntax, int ch, int code)
+add_syntax_attribute (m4_syntax_table *syntax, char ch, int code)
 {
+  int c = to_uchar (ch);
   if (code & M4_SYNTAX_MASKS)
-    syntax->table[ch] |= code;
+    {
+      syntax->table[c] |= code;
+      syntax->suspect = true;
+    }
   else
-    syntax->table[ch] = (syntax->table[ch] & M4_SYNTAX_MASKS) | code;
+    {
+      if ((code & (M4_SYNTAX_SUSPECT)) != 0
+         || m4_has_syntax (syntax, c, M4_SYNTAX_SUSPECT))
+       syntax->suspect = true;
+      syntax->table[c] = ((syntax->table[c] & M4_SYNTAX_MASKS) | code);
+    }
 
 #ifdef DEBUG_SYNTAX
-  xfprintf(stderr, "Set syntax %o %c = %04X\n",
-          ch, isprint(ch) ? ch : '-',
-          syntax->table[ch]);
+  xfprintf(stderr, "Set syntax %o %c = %04X\n", c, isprint(c) ? c : '-',
+          syntax->table[c]);
 #endif
 
-  return syntax->table[ch];
+  return syntax->table[c];
 }
 
 static int
-remove_syntax_attribute (m4_syntax_table *syntax, int ch, int code)
+remove_syntax_attribute (m4_syntax_table *syntax, char ch, int code)
 {
+  int c = to_uchar (ch);
   assert (code & M4_SYNTAX_MASKS);
-  syntax->table[ch] &= ~code;
+  syntax->table[c] &= ~code;
+  syntax->suspect = true;
 
 #ifdef DEBUG_SYNTAX
-  xfprintf(stderr, "Unset syntax %o %c = %04X\n",
-          ch, isprint(ch) ? ch : '-',
-          syntax->table[ch]);
+  xfprintf(stderr, "Unset syntax %o %c = %04X\n", c, isprint(c) ? c : '-',
+          syntax->table[c]);
 #endif
 
-  return syntax->table[ch];
+  return syntax->table[c];
 }
 
 /* Add the set CHARS of length LEN to syntax category CODE, removing
@@ -254,21 +273,8 @@ static void
 add_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
                int code)
 {
-  int ch;
-
-  if (!len)
-    return;
-
-  if (code == M4_SYNTAX_ESCAPE)
-    syntax->is_macro_escaped = true;
-
-  /* Adding doesn't affect single-quote or single-comment.  */
-
   while (len--)
-    {
-      ch = to_uchar (*chars++);
-      add_syntax_attribute (syntax, ch, code);
-    }
+    add_syntax_attribute (syntax, *chars++, code);
 }
 
 /* Remove the set CHARS of length LEN from syntax category CODE,
@@ -277,43 +283,14 @@ static void
 subtract_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
                     int code)
 {
-  int ch;
-
-  if (!len)
-    return;
-
   while (len--)
     {
-      ch = to_uchar (*chars++);
+      char ch = *chars++;
       if ((code & M4_SYNTAX_MASKS) != 0)
        remove_syntax_attribute (syntax, ch, code);
       else if (m4_has_syntax (syntax, ch, code))
        add_syntax_attribute (syntax, ch, M4_SYNTAX_OTHER);
     }
-
-  /* Check for any cleanup needed.  */
-  switch (code)
-    {
-    case M4_SYNTAX_ESCAPE:
-      if (syntax->is_macro_escaped)
-       check_is_macro_escaped (syntax);
-      break;
-
-    case M4_SYNTAX_LQUOTE:
-    case M4_SYNTAX_RQUOTE:
-      if (syntax->is_single_quotes)
-       check_is_single_quotes (syntax);
-      break;
-
-    case M4_SYNTAX_BCOMM:
-    case M4_SYNTAX_ECOMM:
-      if (syntax->is_single_comments)
-       check_is_single_comments (syntax);
-      break;
-
-    default:
-      break;
-    }
 }
 
 /* Make the set CHARS of length LEN become syntax category CODE,
@@ -330,21 +307,16 @@ set_syntax_set (m4_syntax_table *syntax, const char *chars, size_t len,
      OTHER.  */
   for (ch = UCHAR_MAX + 1; --ch >= 0; )
     {
-      if (code == M4_SYNTAX_RQUOTE || code == M4_SYNTAX_ECOMM)
+      if ((code & M4_SYNTAX_MASKS) != 0)
        remove_syntax_attribute (syntax, ch, code);
       else if (m4_has_syntax (syntax, ch, code))
        add_syntax_attribute (syntax, ch, M4_SYNTAX_OTHER);
     }
   while (len--)
     {
-      ch = to_uchar (*chars++);
+      ch = *chars++;
       add_syntax_attribute (syntax, ch, code);
     }
-
-  /* Check for any cleanup needed.  */
-  check_is_macro_escaped (syntax);
-  check_is_single_quotes (syntax);
-  check_is_single_comments (syntax);
 }
 
 /* Reset syntax category CODE to its default state, sending all other
@@ -375,9 +347,6 @@ reset_syntax_set (m4_syntax_table *syntax, int code)
       else if (syntax->orig[ch] == code || m4_has_syntax (syntax, ch, code))
        add_syntax_attribute (syntax, ch, syntax->orig[ch]);
     }
-  check_is_macro_escaped (syntax);
-  check_is_single_quotes (syntax);
-  check_is_single_comments (syntax);
 }
 
 /* Reset the syntax table to its default state.  */
@@ -403,10 +372,8 @@ m4_reset_syntax (m4_syntax_table *syntax)
   syntax->comm.str2 = xmemdup0 (DEF_ECOMM, 1);
   syntax->comm.len2 = 1;
 
-  add_syntax_attribute (syntax, to_uchar (syntax->quote.str2[0]),
-                       M4_SYNTAX_RQUOTE);
-  add_syntax_attribute (syntax, to_uchar (syntax->comm.str2[0]),
-                       M4_SYNTAX_ECOMM);
+  add_syntax_attribute (syntax, syntax->quote.str2[0], M4_SYNTAX_RQUOTE);
+  add_syntax_attribute (syntax, syntax->comm.str2[0], M4_SYNTAX_ECOMM);
 
   syntax->is_single_quotes = true;
   syntax->is_single_comments = true;
@@ -431,6 +398,7 @@ m4_set_syntax (m4_syntax_table *syntax, char key, char action,
     {
       return -1;
     }
+  syntax->suspect = false;
   switch (action)
     {
     case '+':
@@ -449,134 +417,169 @@ m4_set_syntax (m4_syntax_table *syntax, char key, char action,
     default:
       assert (false);
     }
-  set_quote_age (syntax, false, true);
-  m4__quote_uncache (syntax);
-  return code;
-}
 
-static bool
-check_is_single_quotes (m4_syntax_table *syntax)
-{
-  int ch;
-  int lquote = -1;
-  int rquote = -1;
-
-  if (! syntax->is_single_quotes)
-    return false;
-  assert (syntax->quote.len1 == 1 && syntax->quote.len2 == 1);
-
-  if (m4_has_syntax (syntax, *syntax->quote.str1, M4_SYNTAX_LQUOTE)
-      && m4_has_syntax (syntax, *syntax->quote.str2, M4_SYNTAX_RQUOTE))
-    return true;
-
-  /* The most recent action invalidated our current lquote/rquote.  If
-     we still have exactly one character performing those roles based
-     on the syntax table, then update lquote/rquote accordingly.
-     Otherwise, keep lquote/rquote, but we no longer have single
-     quotes.  */
-  for (ch = UCHAR_MAX + 1; --ch >= 0; )
+  /* Check for any cleanup needed.  */
+  if (syntax->suspect)
     {
-      if (m4_has_syntax (syntax, ch, M4_SYNTAX_LQUOTE))
+      int ch;
+      int lquote = -1;
+      int rquote = -1;
+      int bcomm = -1;
+      int ecomm = -1;
+      bool single_quote_possible = true;
+      bool single_comm_possible = true;
+      if (m4_has_syntax (syntax, syntax->quote.str1[0], M4_SYNTAX_LQUOTE))
        {
-         if (lquote == -1)
-           lquote = ch;
-         else
+         assert (syntax->quote.len1 == 1);
+         lquote = to_uchar (syntax->quote.str1[0]);
+       }
+      if (m4_has_syntax (syntax, syntax->quote.str2[0], M4_SYNTAX_RQUOTE))
+       {
+         assert (syntax->quote.len2 == 1);
+         rquote = to_uchar (syntax->quote.str2[0]);
+       }
+      if (m4_has_syntax (syntax, syntax->comm.str1[0], M4_SYNTAX_BCOMM))
+       {
+         assert (syntax->comm.len1 == 1);
+         bcomm = to_uchar (syntax->comm.str1[0]);
+       }
+      if (m4_has_syntax (syntax, syntax->comm.str2[0], M4_SYNTAX_ECOMM))
+       {
+         assert (syntax->comm.len2 == 1);
+         ecomm = to_uchar (syntax->comm.str2[0]);
+       }
+      syntax->is_macro_escaped = false;
+      /* Find candidates for each category.  */
+      for (ch = UCHAR_MAX + 1; --ch >= 0; )
+       {
+         if (m4_has_syntax (syntax, ch, M4_SYNTAX_LQUOTE))
            {
-             syntax->is_single_quotes = false;
-             break;
+             if (lquote == -1)
+               lquote = ch;
+             else if (lquote != ch)
+               single_quote_possible = false;
            }
+         if (m4_has_syntax (syntax, ch, M4_SYNTAX_RQUOTE))
+           {
+             if (rquote == -1)
+               rquote = ch;
+             else if (rquote != ch)
+               single_quote_possible = false;
+           }
+         if (m4_has_syntax (syntax, ch, M4_SYNTAX_BCOMM))
+           {
+             if (bcomm == -1)
+               bcomm = ch;
+             else if (bcomm != ch)
+               single_comm_possible = false;
+           }
+         if (m4_has_syntax (syntax, ch, M4_SYNTAX_ECOMM))
+           {
+             if (ecomm == -1)
+               ecomm = ch;
+             else if (ecomm != ch)
+               single_comm_possible = false;
+           }
+         if (m4_has_syntax (syntax, ch, M4_SYNTAX_ESCAPE))
+           syntax->is_macro_escaped = true;
        }
-      if (m4_has_syntax (syntax, ch, M4_SYNTAX_RQUOTE))
+      /* Disable multi-character delimiters if we discovered
+        delimiters.  */
+      if (!single_quote_possible)
+       syntax->is_single_quotes = false;
+      if (!single_comm_possible)
+       syntax->is_single_comments = false;
+      if ((1 < syntax->quote.len1 || 1 < syntax->quote.len2)
+         && (!syntax->is_single_quotes || lquote != -1 || rquote != -1))
        {
-         if (rquote == -1)
-           rquote = ch;
-         else
+         if (syntax->quote.len1)
+           {
+             syntax->quote.len1 = lquote == to_uchar (syntax->quote.str1[0]);
+             syntax->quote.str1[syntax->quote.len1] = '\0';
+           }
+         if (syntax->quote.len2)
            {
-             syntax->is_single_quotes = false;
-             break;
+             syntax->quote.len2 = rquote == to_uchar (syntax->quote.str2[0]);
+             syntax->quote.str2[syntax->quote.len2] = '\0';
            }
        }
-    }
-  if (lquote == -1 || rquote == -1)
-    syntax->is_single_quotes = false;
-  else if (syntax->is_single_quotes)
-    {
-      *syntax->quote.str1 = lquote;
-      *syntax->quote.str2 = rquote;
-    }
-  return syntax->is_single_quotes;
-}
-
-static bool
-check_is_single_comments (m4_syntax_table *syntax)
-{
-  int ch;
-  int bcomm = -1;
-  int ecomm = -1;
-
-  if (! syntax->is_single_comments)
-    return false;
-  assert (syntax->comm.len1 == 1 && syntax->comm.len2 == 1);
-
-  if (m4_has_syntax (syntax, *syntax->comm.str1, M4_SYNTAX_BCOMM)
-      && m4_has_syntax (syntax, *syntax->comm.str2, M4_SYNTAX_ECOMM))
-    return true;
-
-  /* The most recent action invalidated our current bcomm/ecomm.  If
-     we still have exactly one character performing those roles based
-     on the syntax table, then update bcomm/ecomm accordingly.
-     Otherwise, keep bcomm/ecomm, but we no longer have single
-     comments.  */
-  for (ch = UCHAR_MAX + 1; --ch >= 0; )
-    {
-      if (m4_has_syntax (syntax, ch, M4_SYNTAX_BCOMM))
+      if ((1 < syntax->comm.len1 || 1 < syntax->comm.len2)
+         && (!syntax->is_single_comments || bcomm != -1 || ecomm != -1))
        {
-         if (bcomm == -1)
-           bcomm = ch;
+         if (syntax->comm.len1)
+           {
+             syntax->comm.len1 = bcomm == to_uchar (syntax->comm.str1[0]);
+             syntax->comm.str1[syntax->comm.len1] = '\0';
+           }
+         if (syntax->comm.len2)
+           {
+             syntax->comm.len2 = ecomm == to_uchar (syntax->comm.str2[0]);
+             syntax->comm.str2[syntax->comm.len2] = '\0';
+           }
+       }
+      /* Update the strings.  */
+      if (lquote != -1)
+       {
+         if (single_quote_possible)
+           syntax->is_single_quotes = true;
+         if (syntax->quote.len1)
+           assert (syntax->quote.len1 == 1);
          else
            {
-             syntax->is_single_comments = false;
-             break;
+             free (syntax->quote.str1);
+             syntax->quote.str1 = xcharalloc (2);
+             syntax->quote.str1[1] = '\0';
+             syntax->quote.len1 = 1;
+           }
+         syntax->quote.str1[0] = lquote;
+         if (rquote == -1)
+           {
+             rquote = '\'';
+             add_syntax_attribute (syntax, rquote, M4_SYNTAX_RQUOTE);
+           }
+         if (!syntax->quote.len2)
+           {
+             free (syntax->quote.str2);
+             syntax->quote.str2 = xcharalloc (2);
            }
+         syntax->quote.str2[0] = rquote;
+         syntax->quote.str2[1] = '\0';
+         syntax->quote.len2 = 1;
        }
-      if (m4_has_syntax (syntax, ch, M4_SYNTAX_ECOMM))
+      if (bcomm != -1)
        {
-         if (ecomm == -1)
-           ecomm = ch;
+         if (single_comm_possible)
+           syntax->is_single_comments = true;
+         if (syntax->comm.len1)
+           assert (syntax->comm.len1 == 1);
          else
            {
-             syntax->is_single_comments = false;
-             break;
+             free (syntax->comm.str1);
+             syntax->comm.str1 = xcharalloc (2);
+             syntax->comm.str1[1] = '\0';
+             syntax->comm.len1 = 1;
+           }
+         syntax->comm.str1[0] = bcomm;
+         if (ecomm == -1)
+           {
+             ecomm = '\n';
+             add_syntax_attribute (syntax, ecomm, M4_SYNTAX_ECOMM);
+           }
+         if (!syntax->comm.len2)
+           {
+             free (syntax->comm.str2);
+             syntax->comm.str2 = xcharalloc (2);
            }
+         syntax->comm.str2[0] = ecomm;
+         syntax->comm.str2[1] = '\0';
+         syntax->comm.len2 = 1;
        }
     }
-  if (bcomm == -1 || ecomm == -1)
-    syntax->is_single_comments = false;
-  else if (syntax->is_single_comments)
-    {
-      *syntax->comm.str1 = bcomm;
-      *syntax->comm.str2 = ecomm;
-    }
-  return syntax->is_single_comments;
-}
-
-static bool
-check_is_macro_escaped (m4_syntax_table *syntax)
-{
-  int ch;
-
-  syntax->is_macro_escaped = false;
-  for (ch = UCHAR_MAX + 1; --ch >= 0; )
-    if (m4_has_syntax (syntax, ch, M4_SYNTAX_ESCAPE))
-      {
-       syntax->is_macro_escaped = true;
-       break;
-      }
-
-  return syntax->is_macro_escaped;
+  set_quote_age (syntax, false, true);
+  m4__quote_uncache (syntax);
+  return code;
 }
 
-
 
 /* Functions for setting quotes and comment delimiters.  Used by
    m4_changecom () and m4_changequote ().  Both functions override the
@@ -629,13 +632,7 @@ m4_set_quotes (m4_syntax_table *syntax, const char *lq, size_t lq_len,
   /* changequote overrides syntax_table, but be careful when it is
      used to select a start-quote sequence that is effectively
      disabled.  */
-
-  syntax->is_single_quotes
-    = (syntax->quote.len1 == 1 && syntax->quote.len2 == 1
-       && !m4_has_syntax (syntax, *syntax->quote.str1,
-                         (M4_SYNTAX_IGNORE | M4_SYNTAX_ESCAPE
-                          | M4_SYNTAX_ALPHA | M4_SYNTAX_NUM)));
-
+  syntax->is_single_quotes = true;
   for (ch = UCHAR_MAX + 1; --ch >= 0; )
     {
       if (m4_has_syntax (syntax, ch, M4_SYNTAX_LQUOTE))
@@ -646,15 +643,15 @@ m4_set_quotes (m4_syntax_table *syntax, const char *lq, size_t lq_len,
        remove_syntax_attribute (syntax, ch, M4_SYNTAX_RQUOTE);
     }
 
-  if (syntax->is_single_quotes)
+  if (!m4_has_syntax (syntax, *syntax->quote.str1,
+                     (M4_SYNTAX_IGNORE | M4_SYNTAX_ESCAPE | M4_SYNTAX_ALPHA
+                      | M4_SYNTAX_NUM)))
     {
-      add_syntax_attribute (syntax, to_uchar (syntax->quote.str1[0]),
-                           M4_SYNTAX_LQUOTE);
-      add_syntax_attribute (syntax, to_uchar (syntax->quote.str2[0]),
-                           M4_SYNTAX_RQUOTE);
+      if (syntax->quote.len1 == 1)
+       add_syntax_attribute (syntax, syntax->quote.str1[0], M4_SYNTAX_LQUOTE);
+      if (syntax->quote.len2 == 1)
+       add_syntax_attribute (syntax, syntax->quote.str2[0], M4_SYNTAX_RQUOTE);
     }
-  if (syntax->is_macro_escaped)
-    check_is_macro_escaped (syntax);
   set_quote_age (syntax, false, false);
 }
 
@@ -703,14 +700,7 @@ m4_set_comment (m4_syntax_table *syntax, const char *bc, size_t bc_len,
   /* changecom overrides syntax_table, but be careful when it is used
      to select a start-comment sequence that is effectively
      disabled.  */
-
-  syntax->is_single_comments
-    = (syntax->comm.len1 == 1 && syntax->comm.len2 == 1
-       && !m4_has_syntax (syntax, *syntax->comm.str1,
-                         (M4_SYNTAX_IGNORE | M4_SYNTAX_ESCAPE
-                          | M4_SYNTAX_ALPHA | M4_SYNTAX_NUM
-                          | M4_SYNTAX_LQUOTE)));
-
+  syntax->is_single_comments = true;
   for (ch = UCHAR_MAX + 1; --ch >= 0; )
     {
       if (m4_has_syntax (syntax, ch, M4_SYNTAX_BCOMM))
@@ -720,20 +710,20 @@ m4_set_comment (m4_syntax_table *syntax, const char *bc, size_t bc_len,
       if (m4_has_syntax (syntax, ch, M4_SYNTAX_ECOMM))
        remove_syntax_attribute (syntax, ch, M4_SYNTAX_ECOMM);
     }
-  if (syntax->is_single_comments)
+  if (!m4_has_syntax (syntax, *syntax->comm.str1,
+                     (M4_SYNTAX_IGNORE | M4_SYNTAX_ESCAPE | M4_SYNTAX_ALPHA
+                      | M4_SYNTAX_NUM | M4_SYNTAX_LQUOTE)))
     {
-      add_syntax_attribute (syntax, to_uchar (syntax->comm.str1[0]),
-                           M4_SYNTAX_BCOMM);
-      add_syntax_attribute (syntax, to_uchar (syntax->comm.str2[0]),
-                           M4_SYNTAX_ECOMM);
+      if (syntax->comm.len1 == 1)
+       add_syntax_attribute (syntax, syntax->comm.str1[0], M4_SYNTAX_BCOMM);
+      if (syntax->comm.len2 == 1)
+       add_syntax_attribute (syntax, syntax->comm.str2[0], M4_SYNTAX_ECOMM);
     }
-  if (syntax->is_macro_escaped)
-    check_is_macro_escaped (syntax);
   set_quote_age (syntax, false, false);
 }
 
 /* Call this when changing anything that might impact the quote age,
-   so that m4_quote_age and m4_safe_quotes will reflect the change.
+   so that m4__quote_age and m4__safe_quotes will reflect the change.
    If RESET, changesyntax was reset to its default stage; if CHANGE,
    arbitrary syntax has changed; otherwise, just quotes or comment
    delimiters have changed.  */
@@ -789,6 +779,7 @@ set_quote_age (m4_syntax_table *syntax, bool reset, bool change)
   else
     local_syntax_age = syntax->syntax_age;
   if (local_syntax_age < 0xffff && syntax->is_single_quotes
+      && syntax->quote.len1 == 1 && syntax->quote.len2 == 1
       && !m4_has_syntax (syntax, *syntax->quote.str1,
                         (M4_SYNTAX_ALPHA | M4_SYNTAX_NUM | M4_SYNTAX_OPEN
                          | M4_SYNTAX_COMMA | M4_SYNTAX_CLOSE


hooks/post-receive
--
GNU M4 source repository