bug#18266: handling bytes not part of the charset, and other garbage
From: Jim Meyering
Subject: bug#18266: handling bytes not part of the charset, and other garbage
Date: Fri, 12 Sep 2014 15:23:08 -0700
On Fri, Sep 12, 2014 at 2:39 PM, Paul Eggert <address@hidden> wrote:
> On 09/12/2014 02:29 PM, Vincent Lefevre wrote:
>
>> an option to control what happens on encoding errors would be better and
>> sufficient.
>
>
> It might suffice for your use cases, but it's more complicated and less
> flexible than being able to match bytes within the regular expression.
> (Plus, someone would have to implement it, which is perhaps the biggest
> objection to either approach ....) But I take your point that \C is best
> avoided. This whole area is pretty hairy, I'm afraid.
>
> Speaking of hairy, why doesn't grep use PCRE_MULTILINE? Using
> PCRE_MULTILINE shouldn't be that hard, and should boost performance quite a
> bit in typical usage. Or am I being too optimistic here?
When I first saw that implementation, I assumed it was just a first-cut one.
I see no reason not to use PCRE_MULTILINE, but haven't tried it, either.
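
For readers unfamiliar with the idea being discussed, here is a minimal sketch of the difference between running a regex once per line and running it once over a whole buffer in multiline mode. It uses Python's re.MULTILINE as a rough analogue of PCRE_MULTILINE; it is not grep's code, and the function names match_per_line and match_whole_buffer are hypothetical. The two approaches agree only when the pattern cannot match a newline itself (a pattern using \s, for example, could span lines in whole-buffer mode).

```python
import re


def match_per_line(pattern, text):
    """Run the regex separately on each line and keep the matching lines."""
    rx = re.compile(pattern)
    return [line for line in text.split("\n") if rx.search(line)]


def match_whole_buffer(pattern, text):
    """Run the regex once over the whole buffer with multiline anchors,
    then map each match back to the line that contains its start."""
    rx = re.compile(pattern, re.MULTILINE)
    hits = []
    last_start = -1
    for m in rx.finditer(text):
        # Start of the line containing the match.
        start = text.rfind("\n", 0, m.start()) + 1
        if start == last_start:
            continue  # this line has already been reported
        end = text.find("\n", m.start())
        if end == -1:
            end = len(text)
        hits.append(text[start:end])
        last_start = start
    return hits
```

With the buffer "foo 1\nbar\nfoo 2 foo\n" and the pattern ^foo, both functions select the first and third lines. Whether the whole-buffer form is actually faster inside grep is the open question in this thread, and this sketch does not measure it.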