From: Charles Levert
Subject: Re: [bug #13859] matching '([A]|[B]){2}' in different locales
Date: Wed, 20 Jul 2005 17:51:14 -0400
User-agent: Mutt/1.4.1i
* On Wednesday 2005-07-20 at 17:28:39 +0000, anonymous wrote:
>
> Follow-up Comment #2, bug #13859 (project grep):
>
> Thank you, that version works correctly regardless of locale for me too. The
> problem seems to be worked around by your dfa_optional patch. If I keep all
> other patches, but unapply that, it starts failing for me again. After reading
> http://www.mail-archive.com/address@hidden/msg00068.html
The referenced message states:
> The full story with this patch is that the grep built-in DFA actually
> pessimises rather than optimises when processing UTF-8. The idea of this
> patch is: if we are using UTF-8, disable grep's built-in DFA altogether.
>
> (The DFA in glibc handles UTF-8 correctly, and fast.)
This raises the question: what does grep's
built-in DFA actually provide (chiefly in terms
of performance) once the regexp code has been
updated to glibc/gnulib's current version?
Does the answer depend on the chosen locale?
Is there still a justification for grep's DFA?
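
For illustration only, the idea in the quoted message (skip the built-in
DFA when the locale's codeset is UTF-8) could be sketched as below. This
is not grep's code: the names is_utf8_codeset and choose_matcher are
hypothetical, and the "dfa"/"regex" labels merely stand in for grep's
built-in DFA and the glibc/gnulib regex matcher.

```python
import codecs
import locale


def is_utf8_codeset(codeset: str) -> bool:
    """Return True if the codeset name denotes UTF-8 (e.g. 'UTF-8', 'utf8')."""
    try:
        # codecs.lookup() normalizes aliases to a canonical codec name.
        return codecs.lookup(codeset).name == "utf-8"
    except LookupError:
        # Unknown codeset name: treat it as not UTF-8.
        return False


def choose_matcher(codeset: str) -> str:
    """Hypothetical policy: bypass the built-in DFA in UTF-8 locales."""
    return "regex" if is_utf8_codeset(codeset) else "dfa"


if __name__ == "__main__":
    try:
        # Adopt the locale from the environment (LC_ALL, LC_CTYPE, LANG).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        pass  # Unsupported locale in the environment; keep the default.
    # nl_langinfo is available on POSIX systems only.
    codeset = locale.nl_langinfo(locale.CODESET)
    print(codeset, "->", choose_matcher(codeset))
```

Such a check only selects a matcher. It does not answer whether the DFA
still pays off in non-UTF-8 locales, which would need timing both paths
on the same input.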