bug-gnu-emacs
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

bug#31796: 27.1; dired-do-find-regexp-and-replace fails to find multilin


From: Eli Zaretskii
Subject: bug#31796: 27.1; dired-do-find-regexp-and-replace fails to find multiline regexps
Date: Tue, 01 Dec 2020 17:46:19 +0200

> From: Richard Stallman <rms@gnu.org>
> Cc: eliz@gnu.org, abela@chalmers.se, 31796@debbugs.gnu.org
> Date: Tue, 01 Dec 2020 00:20:12 -0500
> 
> Can people think of a new feature that would be easy to add to GNU grep
> that would make it easy for Dired to handle all cases correctly?

Yes: it should detect encoding of each input file (and have a way of
letting the user specify encoding for each file), convert the file's
contents to some internal encoding (probably UTF-8), then report the
hits encoded in UTF-8, regardless of the file's original encoding (and
regardless of the current locale's codeset).

> I don't know what the problem is, but if it has to do with parsing the
> grep output, here's an idea: an option to tell GNU grep to use quoting
> on file names and the match strings, Perhaps in the same way GNU ls
> does.

The problem is not with file names, it's with the matches.  But since
you mention it: Grep should, in this new mode, report file names also
recoded into UTF-8.  In a word, it should arrange for its output be in
a single encoding known in advance, so that front ends like Emacs
won't need to guess the encoding.

> Another idea is an option to output numerical byte positions in the
> file instead of the lines that are matched.  Emacs can feed those byte
> positions into byte-to-position to convert them into buffer positions.

AFAIU, there's already such an option: -b.  However, byte-to-position
works only with UTF-8 encoded files; we need filepos-to-bufferpos
(which requires to know the file's encoding, so we are back at the
same problem).





reply via email to

[Prev in Thread] Current Thread [Next in Thread]