bug-gnu-emacs



From: Dmitry Gutov
Subject: bug#31796: 27.1; dired-do-find-regexp-and-replace fails to find multiline regexps
Date: Wed, 2 Dec 2020 19:43:52 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.10.0

On 02.12.2020 19:39, Eli Zaretskii wrote:
Cc: abela@chalmers.se, 31796@debbugs.gnu.org
From: Dmitry Gutov <dgutov@yandex.ru>
Date: Wed, 2 Dec 2020 19:17:06 +0200

On 02.12.2020 16:56, Eli Zaretskii wrote:
The point is that our heuristics for detecting encoding are not
perfect, so they could fail.

Do you imagine Grep could use a more reliable detection algorithm?

No, I don't.  But it could allow the user to specify a different
encoding for each file, as in

    grep --encoding=FOO FILES1* --encoding=BAR FILES2*

Not sure we can call it like that in an automated fashion (i.e. in project-find-regexp). But hey, somebody else could.

etc.  And even if it just did a job of the same quality as we do, it
would do it faster, which is why we use Grep in the first place, right?

That's true.

The important part of the "enhancement" I described is actually the
fact that the output gets encoded in a single encoding, no matter what
the encoding of the original files was.  This makes reading and
decoding the output simple and always correct.

Yes, OK.

Although... since it has to scan the full file anyway, it could first do
a quick detection, and then maybe rescan from the beginning if the
encoding turns out to be something else.

That'd be too late, as some matches were already output.

It could buffer them until the full file has been parsed. Encoding detection and conversion must add a certain overhead anyway, so I'm not sure how expensive the extra buffering would be in comparison.

As a bonus, per-file buffering like that would allow easier parallelization of searches.
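
To make the buffering idea concrete, here is a rough sketch in Python. It is not how Grep or Emacs works today; the function names, the candidate encoding list, and the "try the next encoding on a decode error" detection are all assumptions made for illustration. Each file's matches are held in a per-file buffer, the buffer is thrown away and the file rescanned if the encoding guess turns out to be wrong, and the final output is always emitted in one encoding (UTF-8 here). It matches line by line, so it does not address multiline regexps.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

# Hypothetical candidate list, tried in order. A real tool would use a
# proper detector; latin-1 never fails to decode, so it acts as a fallback.
CANDIDATE_ENCODINGS = ("utf-8", "latin-1")


def search_file(path, pattern, encodings=CANDIDATE_ENCODINGS):
    """Return the matches of one file as (path, line_number, text) tuples.

    Matches are buffered per file. If a line fails to decode under the
    current guess, the buffer is dropped and the file is rescanned from
    the beginning with the next candidate encoding.
    """
    regex = re.compile(pattern)
    raw_lines = Path(path).read_bytes().splitlines()
    for encoding in encodings:
        buffered = []  # nothing is output until the whole file is scanned
        try:
            for line_no, raw in enumerate(raw_lines, start=1):
                line = raw.decode(encoding)
                if regex.search(line):
                    buffered.append((str(path), line_no, line))
        except UnicodeDecodeError:
            continue  # wrong guess: discard the buffer and rescan
        return buffered
    return []  # no candidate encoding could decode the file


def search_files(paths, pattern, max_workers=4):
    """Search files in parallel; results keep the order of `paths`."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_file = pool.map(lambda p: search_file(p, pattern), paths)
        return [match for matches in per_file for match in matches]


def format_output(matches):
    """Encode all matches in a single output encoding (UTF-8)."""
    lines = (f"{path}:{line_no}:{text}\n" for path, line_no, text in matches)
    return "".join(lines).encode("utf-8")
```

Because every file's result is self-contained until it is returned, the files can be handed to independent workers, and the consumer only ever has to decode UTF-8 regardless of what the source files used.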




