bug#21670: surprising bug in grep -e with anchors
From: Jim Meyering
Subject: bug#21670: surprising bug in grep -e with anchors
Date: Mon, 12 Oct 2015 15:17:42 -0700
On Sun, Oct 11, 2015 at 9:34 PM, Paul Eggert <address@hidden> wrote:
> greg boyd wrote:
>>
>> test case (single line)
>> abchelloabc
>>
>> grep does not find the line with grep -e '^hello' nor with grep -e
>> 'hello$'
>> however, the line is output with
>> grep -e '^hello' -e 'hello$'
>
>
> Oooo, that's a good one. Give your student extra credit! As it happens,
> the bug was recently fixed by this patch by Norihiro Tanaka:
>
> http://git.savannah.gnu.org/cgit/grep.git/commit/?id=256a4b494fe1c48083ba73b4f62607234e4fefd5
>
> and the fix should appear in the next grep release. However, since the
> patch was supposed to affect only performance, it appears that the bug fix
> was due to luck, and I'm taking the liberty of adding your student's test
> case by installing the attached further patch, to help prevent this bug from
> coming back in a future version.
Thanks for adding that test, Paul.
However, note that the bug does not require two uses of "-e" per se.
Multiple "-e"-specified regexps get translated internally to those regexps
separated by the ERE "|" alternation/"or" operator. A smaller, perhaps
more illustrative test case is to use an explicit "|", which afflicted
versions of grep wrongly match:
$ echo axa | grep -E '^x|x$'
axa
FYI, one can demonstrate that it was a problem in the DFA
matcher without resorting to gdb by inserting a "()" in the ERE,
since that construct cannot work in a DFA and grep resorts
to using glibc's full-blown regex matcher. With that, even the
afflicted versions of grep get the desired result (no match):
$ echo axa | grep -E '^x()|x$'; echo $?
1
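
The three forms discussed above can be checked side by side. The following is only a sketch, not the test that was installed in the grep repository: the variable names are made up for illustration, and it assumes a GNU grep on PATH (the "()" variant relies on grep accepting an empty group). Neither anchored pattern can match "axa", so a grep with the fix should exit with status 1 in every case, while an afflicted version would be expected to exit 0 for the first two.

```shell
#!/bin/sh
# Illustrative regression check; names here are hypothetical.
input=axa

# Two -e patterns; neither '^x' nor 'x$' can match "axa".
printf '%s\n' "$input" | grep -e '^x' -e 'x$' >/dev/null
status_multi_e=$?

# The same two patterns joined with the ERE alternation operator.
printf '%s\n' "$input" | grep -E '^x|x$' >/dev/null
status_alt=$?

# The "()" variant from the message, which steers grep away from the DFA.
printf '%s\n' "$input" | grep -E '^x()|x$' >/dev/null
status_group=$?

# Report the exit status of each form (1 means "no match").
echo "multi-e=$status_multi_e alt=$status_alt group=$status_group"
```

If the first two statuses differ from the third, that points at the DFA matcher rather than the regex fallback, which is the same distinction drawn above without gdb.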