From: Stanislav Brabec
Subject: Re: [PATCH 1/3] sed: Fix infinite loop on some false multi-byte matches
Date: Wed, 15 Feb 2012 17:29:28 +0100

Roland McGrath wrote:
> A subtle issue such as this warrants an addition to the test
> suite.

Aharon Robbins wrote:
> I have been looking at this and trying to see if I can reproduce
> it in gawk. I can't seem to. Would someone who understands the
> issue supply me with a test awk program that either shows that
> gawk has this bug, or doesn't?

PATCH 2/3 contains a sed test case that easily reproduces the bug in
sed. (The last line contains a test case for another bug that appeared in
older versions of glibc.)

Although I tried hard to minimize the test case, I failed to reproduce it
outside sed. Here is my best attempt at a C test case, but it does _not_
reproduce the problem. Probably some additional conditions are
fulfilled in sed but not here:

/* Test re_search with multi-byte characters in EUC-JP.
   Copyright (C) 2006 Free Software Foundation, Inc.
   This file is part of the GNU C Library.
   Contributed by Stanislav Brabec <address@hidden>, 2012.

   The GNU C Library is free software; you can redistribute it and/or
   modify it under the terms of the GNU Lesser General Public
   License as published by the Free Software Foundation; either
   version 2.1 of the License, or (at your option) any later version.

   The GNU C Library is distributed in the hope that it will be useful,
   but WITHOUT ANY WARRANTY; without even the implied warranty of
   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
   Lesser General Public License for more details.

   You should have received a copy of the GNU Lesser General Public
   License along with the GNU C Library; if not, write to the Free
   Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA
   02111-1307 USA.  */

#define _GNU_SOURCE 1
#include <locale.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

int
main (void)
{
  struct re_pattern_buffer r;
  struct re_registers s;
  const char *err;
  int e, rc = 0;

  if (setlocale (LC_CTYPE, "ja_JP.EUC-JP") == NULL)
    {
      puts ("setlocale failed");
      return 1;
    }

  memset (&r, 0, sizeof (r));
  memset (&s, 0, sizeof (s));
  re_set_syntax (RE_SYNTAX_POSIX_BASIC | RE_NO_POSIX_BACKTRACKING);

  /* 圭 */
  err = re_compile_pattern ("\xb7\xbd", 2, &r);
  if (err != NULL)
    {
      /* Without this check a failed compile would go unnoticed.  */
      printf ("re_compile_pattern failed: %s\n", err);
      return 1;
    }
  r.regs_allocated = REGS_REALLOCATE;

  /* aaaaa件a新処, \xb7\xbd constitutes a false match */
  e = re_search (&r, "\x61\x61\x61\x61\x61\xb7\xef\x61\xbf\xb7\xbd\xe8",
                 12, 0, 12, &s);
  if (e != -1)
    {
      printf ("bug-regex33.1: false match or error: re_search() returned %d\n",
              e);
      rc = 1;
    }

  /* aaaa件a新処, \xb7\xbd constitutes a false match */
  e = re_search (&r, "\x61\x61\x61\x61\xb7\xef\x61\xbf\xb7\xbd\xe8",
                 11, 0, 11, &s);
  if (e != -1)
    {
      printf ("bug-regex33.2: false match or error: re_search() returned %d\n",
              e);
      rc = 1;
    }

  /* aaa件a新処, \xb7\xbd constitutes a false match */
  e = re_search (&r, "\x61\x61\x61\xb7\xef\x61\xbf\xb7\xbd\xe8",
                 10, 0, 10, &s);
  if (e != -1)
    {
      printf ("bug-regex33.3: false match or error: re_search() returned %d\n",
              e);
      rc = 1;
    }

  /* aa件a新処, \xb7\xbd constitutes a false match */
  e = re_search (&r, "\x61\x61\xb7\xef\x61\xbf\xb7\xbd\xe8",
                 9, 0, 9, &s);
  if (e != -1)
    {
      printf ("bug-regex33.4: false match or error: re_search() returned %d\n",
              e);
      rc = 1;
    }

  /* a件a新処, \xb7\xbd constitutes a false match */
  e = re_search (&r, "\x61\xb7\xef\x61\xbf\xb7\xbd\xe8",
                 8, 0, 8, &s);
  if (e != -1)
    {
      printf ("bug-regex33.5: false match or error: re_search() returned %d\n",
              e);
      rc = 1;
    }

  /* 新処圭新処, \xb7\xbd here really matches 圭 */
  e = re_search (&r, "\xbf\xb7\xbd\xe8\xb7\xbd\xbf\xb7\xbd\xe8",
                 10, 0, 10, &s);
  if (e != 4)
    {
      printf ("bug-regex33.6: match not found: re_search() returned %d\n", e);
      rc = 1;
    }

  return rc;
}

--
Best Regards / S pozdravem,
Stanislav Brabec
software developer
---------------------------------------------------------------------
SUSE LINUX, s. r. o. e-mail: address@hidden
Lihovarská 1060/12 tel: +49 911 7405384547
190 00 Praha 9 fax: +420 284 028 951
Czech Republic http://www.suse.cz/