[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[SCM] GNU M4 source repository branch, branch-1.6, updated. v1.5.89a-45-
From: |
Eric Blake |
Subject: |
[SCM] GNU M4 source repository branch, branch-1.6, updated. v1.5.89a-45-g0d6fb01 |
Date: |
Mon, 04 Aug 2008 00:45:58 +0000 |
This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "GNU M4 source repository".
http://git.sv.gnu.org/gitweb/?p=m4.git;a=commitdiff;h=0d6fb01e76bc35550a00cbf7710d1471db9e7b00
The branch, branch-1.6 has been updated
via 0d6fb01e76bc35550a00cbf7710d1471db9e7b00 (commit)
from c9d53ab9bcef0cb04d59f5797e6f20159150b75d (commit)
Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.
- Log -----------------------------------------------------------------
commit 0d6fb01e76bc35550a00cbf7710d1471db9e7b00
Author: Eric Blake <address@hidden>
Date: Sun Aug 3 14:23:19 2008 -0600
Fix regression in commenting unbalanced quotes, from 2008-02-16.
* src/m4.h (enum token_type): Add TOKEN_COMMENT.
* src/input.c (next_token, peek_token, token_type_string)
(print_token): Supply new token type for comments.
* src/macro.c (expand_token): Remove penalty for unquoted `-'
bytes. Penalize comments, as they can contain unbalanced quotes;
latent bug since 2007-12-07, exposed by passing $@ references
built from comments.
(expand_argument): Adjust caller.
* doc/m4.texinfo (Comments): Test the fix.
* NEWS: Mention the fix.
Signed-off-by: Eric Blake <address@hidden>
-----------------------------------------------------------------------
Summary of changes:
ChangeLog | 14 ++++++++++++++
NEWS | 14 +++++++-------
doc/m4.texinfo | 21 +++++++++++++++++++++
src/input.c | 36 ++++++++++++++++++++++--------------
src/m4.h | 3 ++-
src/macro.c | 49 +++++++++++++++++++++++++++++++------------------
6 files changed, 97 insertions(+), 40 deletions(-)
diff --git a/ChangeLog b/ChangeLog
index d4f182e..325bf7a 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,17 @@
+2008-08-03 Eric Blake <address@hidden>
+
+ Fix regression in commenting unbalanced quotes, from 2008-02-16.
+ * src/m4.h (enum token_type): Add TOKEN_COMMENT.
+ * src/input.c (next_token, peek_token, token_type_string)
+ (print_token): Supply new token type for comments.
+ * src/macro.c (expand_token): Remove penalty for unquoted `-'
+ bytes. Penalize comments, as they can contain unbalanced quotes;
+ latent bug since 2007-12-07, exposed by passing $@ references
+ built from comments.
+ (expand_argument): Adjust caller.
+ * doc/m4.texinfo (Comments): Test the fix.
+ * NEWS: Mention the fix.
+
2008-07-30 Eric Blake <address@hidden>
Fix regression in trace output, introduced 2008-05-09.
diff --git a/NEWS b/NEWS
index bea5f07..fe6f9e8 100644
--- a/NEWS
+++ b/NEWS
@@ -9,13 +9,13 @@ Foundation, Inc.
a macro. This was most noticeable with `traceon(`traceon')', but
would also happen in cases such as `foo(traceon(`foo'))'.
-** Fix regression introduced in 1.4.10b (but not present in 1.4.11) where
- using `builtin' or `indir' to perform nested `shift' calls triggered an
- assertion failure.
-
-** Fix regression introduced in 1.4.10b (but not present in 1.4.11) where
- the command-line option -dV, as well as the builtin `debugmode(V)',
- failed to enable `t' and `c' debug options.
+** Fix regressions introduced in 1.4.10b but not present in 1.4.11:
+*** Using `builtin' or `indir' to perform nested `shift' calls triggered
+ an assertion failure.
+*** The command-line option -dV, as well as the builtin `debugmode(V)',
+ failed to enable `t' and `c' debug options.
+*** Comments that contain unbalanced quotes were not rescanned correctly
+ when passed through address@hidden
** Fix the `m4wrap' builtin to accumulate wrapped text in FIFO order, as
required by POSIX. The manual mentions a way to restore the LIFO order
diff --git a/doc/m4.texinfo b/doc/m4.texinfo
index abacef9..d8e2625 100644
--- a/doc/m4.texinfo
+++ b/doc/m4.texinfo
@@ -1038,6 +1038,27 @@ The comment delimiters can be changed to any string at
any time, using
the builtin macro @code{changecom}. @xref{Changecom}, for more
information.
address@hidden
address@hidden Detect regression in 1.4.10b in regards to reparsing comments.
address@hidden Not worth including in the manual.
address@hidden
+define(`e', `$@@')define(`q', ``$@@'')define(`foo', `bar')
address@hidden
+q(e(`one
+',#two ' foo
+))
address@hidden
address@hidden',`#two bar
address@hidden''
+changecom(`<', `>')define(`n', `$#')
address@hidden
+n(e(<`>, <'>))
address@hidden
+len(e(<`>, ,<'>))
address@hidden
address@hidden example
address@hidden ignore
+
@node Other tokens
@section Other kinds of input tokens
diff --git a/src/input.c b/src/input.c
index 0d08215..4f969b7 100644
--- a/src/input.c
+++ b/src/input.c
@@ -1590,17 +1590,18 @@ quote_cache (struct obstack *obs, unsigned int age,
const string_pair *quotes)
/*--------------------------------------------------------------------.
| Parse a single token from the input stream, set TD to its |
| contents, and return its type. A token is TOKEN_EOF if the |
-| input_stack is empty; TOKEN_STRING for a quoted string or comment; |
-| TOKEN_WORD for something that is a potential macro name; and |
-| TOKEN_SIMPLE for any single character that is not a part of any of |
-| the previous types. If LINE is not NULL, set *LINE to the line |
-| where the token starts. If OBS is not NULL, expand TOKEN_STRING |
-| directly into OBS rather than in token_stack temporary storage |
-| area, and TD could be a TOKEN_COMP instead of the usual |
-| TOKEN_TEXT. If ALLOW_ARGV, OBS must be non-NULL, and an entire |
-| series of arguments can be returned as TOKEN_ARGV when a $@ |
-| reference is encountered. Report errors (unterminated comments or |
-| strings) on behalf of CALLER, if non-NULL. |
+| input_stack is empty; TOKEN_STRING for a quoted string; |
+| TOKEN_COMMENT for a comment; TOKEN_WORD for something that is a |
+| potential macro name; and TOKEN_SIMPLE for any single character |
+| that is not a part of any of the previous types. If LINE is not |
+| NULL, set *LINE to the line where the token starts. If OBS is not |
+| NULL, expand TOKEN_STRING and TOKEN_COMMENT directly into OBS |
+| rather than in token_stack temporary storage area, and TD could be |
+| a TOKEN_COMP instead of the usual TOKEN_TEXT. If ALLOW_ARGV, OBS |
+| must be non-NULL, and an entire series of arguments can be |
+| returned as TOKEN_ARGV when a $@ reference is encountered. Report |
+| errors (unterminated comments or strings) on behalf of CALLER, if |
+| non-NULL. |
| |
| Next_token () returns the token type, and passes back a pointer to |
| the token data through TD. Non-string token text is collected on |
@@ -1695,7 +1696,7 @@ next_token (token_data *td, int *line, struct obstack
*obs, bool allow_argv,
assert (ch < CHAR_EOF);
obstack_1grow (obs_td, ch);
}
- type = TOKEN_STRING;
+ type = TOKEN_COMMENT;
}
else if (default_word_regexp && (isalpha (ch) || ch == '_'))
{
@@ -1837,7 +1838,8 @@ next_token (token_data *td, int *line, struct obstack
*obs, bool allow_argv,
}
else
{
- assert (TOKEN_DATA_TYPE (td) == TOKEN_COMP && type == TOKEN_STRING);
+ assert (TOKEN_DATA_TYPE (td) == TOKEN_COMP
+ && (type == TOKEN_STRING || type == TOKEN_COMMENT));
#ifdef DEBUG_INPUT
{
token_chain *chain;
@@ -1895,7 +1897,7 @@ peek_token (void)
}
else if (MATCH (ch, curr_comm.str1, curr_comm.len1, false))
{
- result = TOKEN_STRING;
+ result = TOKEN_COMMENT;
}
else if ((default_word_regexp && (isalpha (ch) || ch == '_'))
#ifdef ENABLE_CHANGEWORD
@@ -1943,6 +1945,8 @@ token_type_string (token_type t)
return "EOF";
case TOKEN_STRING:
return "STRING";
+ case TOKEN_COMMENT:
+ return "COMMENT";
case TOKEN_WORD:
return "WORD";
case TOKEN_OPEN:
@@ -1981,6 +1985,10 @@ print_token (const char *s, token_type t, token_data *td)
xfprintf (stderr, "string:");
break;
+ case TOKEN_COMMENT:
+ xfprintf (stderr, "comment:");
+ break;
+
case TOKEN_MACDEF:
xfprintf (stderr, "macro: %p\n", TOKEN_DATA_FUNC (td));
break;
diff --git a/src/m4.h b/src/m4.h
index ff0377a..40aa5ec 100644
--- a/src/m4.h
+++ b/src/m4.h
@@ -218,7 +218,8 @@ typedef struct token_chain token_chain;
enum token_type
{
TOKEN_EOF = 4,/* End of file, TOKEN_VOID. */
- TOKEN_STRING, /* Quoted string or comment, TOKEN_TEXT or TOKEN_COMP.
*/
+ TOKEN_STRING, /* Quoted string, TOKEN_TEXT or TOKEN_COMP. */
+ TOKEN_COMMENT,/* Comment, TOKEN_TEXT or TOKEN_COMP. */
TOKEN_WORD, /* An identifier, TOKEN_TEXT. */
TOKEN_OPEN, /* Active character `(', TOKEN_TEXT. */
TOKEN_COMMA, /* Active character `,', TOKEN_TEXT. */
diff --git a/src/macro.c b/src/macro.c
index 0b57436..9d8ffbb 100644
--- a/src/macro.c
+++ b/src/macro.c
@@ -260,8 +260,7 @@ expand_token (struct obstack *obs, token_type t, token_data
*td, int line,
bool first)
{
symbol *sym;
- bool result;
- int ch;
+ bool result = false;
switch (t)
{ /* TOKSW */
@@ -271,13 +270,19 @@ expand_token (struct obstack *obs, token_type t,
token_data *td, int line,
return true;
case TOKEN_STRING:
- /* Tokens and comments are safe in isolation (since quote_age()
- detects any change in delimiters). But if other text is
- already present, multi-character delimiters could be an
- issue, so use a conservative heuristic. If obstack is
- provided, the string was already expanded into it during
- next_token. */
+ /* Strings are safe in isolation (since quote_age() detects any
+ change in delimiters), or when safe_quotes is true. When
+ safe_quotes is false, we could technically return true if we
+ can prove that the concatenation of this string to prior text
+ does not form a multi-byte quote delimiter, but that is a lot
+ of overhead, so we give the conservative answer of false. */
result = first || safe_quotes ();
+ /* fallthru */
+ case TOKEN_COMMENT:
+ /* Comments can contain unbalanced quote delimiters. Rather
+ than search for one, we return the conservative answer of
+ false. If obstack is provided, the string or comment was
+ already expanded into it during next_token. */
if (obs)
return result;
break;
@@ -285,18 +290,23 @@ expand_token (struct obstack *obs, token_type t,
token_data *td, int line,
case TOKEN_OPEN:
case TOKEN_COMMA:
case TOKEN_CLOSE:
- /* Conservative heuristic; thanks to multi-character delimiter
- concatenation. */
+ /* If safe_quotes is true, then these do not form a quote
+ delimiter. If it is false, we give the conservative answer
+ of false rather than taking time to prove that no multi-byte
+ quote delimiter is formed. */
result = safe_quotes ();
break;
case TOKEN_SIMPLE:
- /* Conservative heuristic; if these characters are whitespace or
- numeric, then behavior of safe_quotes is applicable.
- Otherwise, assume these characters have a high likelihood of
- use in quote delimiters. */
- ch = to_uchar (*TOKEN_DATA_TEXT (td));
- result = (isspace (ch) || isdigit (ch)) && safe_quotes ();
+ /* If safe_quotes is true, then all but the single-byte end
+ quote delimiter is safe in a quoted context; a single-byte
+ start delimiter will trigger a TOKEN_STRING instead. If
+ safe_quotes is false, we give the conservative answer of
+ false rather than taking time to prove that no multi-byte
+ quote delimiter is formed. */
+ result = *TOKEN_DATA_TEXT (td) != *curr_quote.str2 && safe_quotes ();
+ if (result)
+ assert (*TOKEN_DATA_TEXT (td) != *curr_quote.str1);
break;
case TOKEN_WORD:
@@ -313,8 +323,10 @@ expand_token (struct obstack *obs, token_type t,
token_data *td, int line,
#else
divert_text (obs, TOKEN_DATA_TEXT (td), TOKEN_DATA_LEN (td), line);
#endif /* !ENABLE_CHANGEWORD */
- /* The word just appended is unquoted, but the heuristics of
- safe_quote are applicable. */
+ /* If safe_quotes is true, then words do not overlap with
+ quote delimiters. If it is false, we give the
+ conservative answer of false rather than prove that no
+ multi-byte delimiters are formed. */
return safe_quotes();
}
expand_macro (sym);
@@ -420,6 +432,7 @@ expand_argument (struct obstack *obs, token_data *argp,
case TOKEN_WORD:
case TOKEN_STRING:
+ case TOKEN_COMMENT:
case TOKEN_MACDEF:
if (!expand_token (obs, t, &td, line, first))
age = 0;
hooks/post-receive
--
GNU M4 source repository
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- [SCM] GNU M4 source repository branch, branch-1.6, updated. v1.5.89a-45-g0d6fb01,
Eric Blake <=