From: Paul Eggert
Subject: bug#20526: grep BUG: text file is detected as binary
Date: Thu, 31 Dec 2015 01:29:35 -0800
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.4.0
Jim Meyering wrote:
The combination of this and the grep -oP infloop fix make this look like a good time for a bug-fix release. If there are any other pending bug fixes or small+safe changes people would like to see included, please let us know.
I have one major qualm about this: since 'grep' no longer checks whether the input is correctly encoded, I expect this may hurt -P performance significantly (though it may help non -P performance). This is because PCRE is slow at checking whether input data are valid UTF-8. I just now did a brief check and found one major performance issue:
  grep -rP 'fed.*cba' .

On my machine the above command is 125x slower with the new grep than the old one, which suggests some tuning is in order before releasing. (It's bogged down inside libpcre somewhere.)
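To make concrete what "checking whether input data are valid UTF-8" involves, here is a minimal Python sketch. It is only an illustration, not grep's or libpcre's actual code, and the helper name `is_valid_utf8` is made up for this example. The point is that a validity check has to look at every byte of the buffer, so repeating it on each buffer adds cost proportional to the input size.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if `data` is well-formed UTF-8.

    Illustration only; this is not how grep or PCRE implement the check.
    A strict decode must scan the whole buffer, which is the cost at issue.
    """
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    # A plain ASCII line is valid UTF-8; a stray 0xFF byte is not.
    print(is_valid_utf8(b"fed ... cba"))
    print(is_valid_utf8(b"fed \xff cba"))
```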
Since you wrote your email I have triaged the outstanding bugs, except for those where patches are already available. Those are mostly performance-related, and I expect some of them will be relevant to the -P slowdown.