Re: [PATCH] dfa: small fixes to single-byte range computation


From: Jim Meyering
Subject: Re: [PATCH] dfa: small fixes to single-byte range computation
Date: Mon, 30 Apr 2012 13:16:01 +0200

Paolo Bonzini wrote:
> * src/dfa.c (parse_bracket_exp): Do not call regexec with an invalid
> subject.  Move declarations before all statements.
> ---
>  src/dfa.c |   18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/src/dfa.c b/src/dfa.c
> index eefc817..a78e760 100644
> --- a/src/dfa.c
> +++ b/src/dfa.c
> @@ -1104,6 +1104,11 @@ parse_bracket_exp (void)
>              }
>            else
>              {
> +              /* Defer to the system regex library about the meaning
> +                 of range expressions.  */
> +              regex_t re;
> +              char pattern[6] = { '[', 0, '-', 0, ']', 0 };
> +              char subject[2] = { 0, 0 };
>                c1 = c;
>                if (case_fold)
>                  {
> @@ -1111,17 +1116,16 @@ parse_bracket_exp (void)
>                    c2 = tolower (c2);
>                  }
>
> -              /* Defer to the system regex library about the meaning
> -                 of range expressions.  */
> -              regex_t re;
> -              char pattern[6] = { '[', c1, '-', c2, ']', 0 };
> -              char subject[2] = { 0, 0 };
> +              pattern[1] = c1;
> +              pattern[3] = c2;
>                regcomp (&re, pattern, REG_NOSUB);
>                for (c = 0; c < NOTCHAR; ++c)
>                  {
> +                  if ((case_fold && isupper (c)) ||

Oops.  There should be no binary operator at the end
of a continued line.

> +                      (MB_CUR_MAX > 1 && btowc (c) == WEOF))
> +                    continue;
>                    subject[0] = c;
> -                  if (!(case_fold && isupper (c))
> -                      && regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)
> +                  if (regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)

Thanks.  That might even give a measurable performance improvement.

I've pushed this:

From bd569cce36fabf79a4c9d43ccc835867be72f6ba Mon Sep 17 00:00:00 2001
From: Jim Meyering <address@hidden>
Date: Mon, 30 Apr 2012 09:56:52 +0200
Subject: [PATCH] cosmetic: binary operator goes *after* the newline, when
 split

* src/dfa.c (match_mb_charset): Join split lines.
(parse_bracket_exp): Move "||" from end of first split line
to the beginning of the continued line.
---
 src/dfa.c |    7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/src/dfa.c b/src/dfa.c
index a78e760..96d462f 100644
--- a/src/dfa.c
+++ b/src/dfa.c
@@ -1121,8 +1121,8 @@ parse_bracket_exp (void)
               regcomp (&re, pattern, REG_NOSUB);
               for (c = 0; c < NOTCHAR; ++c)
                 {
-                  if ((case_fold && isupper (c)) ||
-                      (MB_CUR_MAX > 1 && btowc (c) == WEOF))
+                  if ((case_fold && isupper (c))
+                      || (MB_CUR_MAX > 1 && btowc (c) == WEOF))
                     continue;
                   subject[0] = c;
                   if (regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)
@@ -3036,8 +3036,7 @@ match_mb_charset (struct dfa *d, state_num s, position pos, size_t idx)
   /* match with a range?  */
   for (i = 0; i < work_mbc->nranges; i++)
     {
-      if (work_mbc->range_sts[i] <= wc &&
-          wc <= work_mbc->range_ends[i])
+      if (work_mbc->range_sts[i] <= wc && wc <= work_mbc->range_ends[i])
         goto charset_matched;
     }

--
1.7.10.382.g62bc8
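
For readers who want to try the single-byte range loop outside of dfa.c, here is a minimal standalone sketch of the same idea: compile "[lo-hi]" with the system regex library, then test each byte value against it, skipping bytes that can be rejected before regexec is called. The names range_members, NBYTES and in_range are hypothetical and exist only for this illustration (NBYTES stands in for dfa.c's NOTCHAR); the sketch also assumes lo and hi are ordinary characters, not NUL, ']' or '^'. It is not the code in dfa.c, only an approximation of the loop shown in the patch above.

```c
#include <assert.h>
#include <ctype.h>
#include <regex.h>
#include <stdbool.h>
#include <stdlib.h>
#include <wchar.h>

enum { NBYTES = 256 };  /* hypothetical stand-in for dfa.c's NOTCHAR */

/* Set in_range[b] for every byte B that the system regex library says
   matches the bracket expression [LO-HI].  Return the number of
   regexec calls made, or -1 if the pattern does not compile.  */
static int
range_members (unsigned char lo, unsigned char hi, bool case_fold,
               bool in_range[NBYTES])
{
  regex_t re;
  char pattern[6] = { '[', 0, '-', 0, ']', 0 };
  char subject[2] = { 0, 0 };
  int calls = 0;
  int c;

  /* A NUL endpoint would truncate the pattern string.  */
  assert (lo != 0 && hi != 0);

  pattern[1] = lo;
  pattern[3] = hi;
  if (regcomp (&re, pattern, REG_NOSUB) != 0)
    return -1;

  for (c = 0; c < NBYTES; ++c)
    {
      in_range[c] = false;

      /* Skip bytes that need no regexec call: uppercase letters when
         folding case, and bytes that are not a complete character in
         a multibyte locale.  */
      if ((case_fold && isupper (c))
          || (MB_CUR_MAX > 1 && btowc (c) == WEOF))
        continue;

      subject[0] = c;
      ++calls;
      if (regexec (&re, subject, 0, NULL, 0) != REG_NOMATCH)
        in_range[c] = true;
    }

  regfree (&re);
  return calls;
}
```

The return value makes the effect of the early "continue" visible: with case_fold set, or in a multibyte locale with invalid single bytes, fewer regexec calls are made than there are byte values.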
