bug-readline

Re: rl_filename_rewrite_hook called on wrong filename


From: Stefan H. Holek
Subject: Re: rl_filename_rewrite_hook called on wrong filename
Date: Mon, 25 Sep 2023 18:35:53 +0200

Hi Grisha,

Thank you for the detailed reply. I was not aware of the paste and glob-complete-word issues. OMG!

Background: I maintain Readline bindings for Python [1], which I started long ago because I was annoyed that the standard bindings do not allow for proper filename completion. My goal is also to provide good documentation [2] and examples [3].

The scenario I described is hypothetical insofar as I do not myself maintain an application with these requirements. I have, however, tested it in the past using a Latin-1 terminal on a UTF-8 filesystem, and completion worked just fine (given all three hooks are set).

I am, however, not sure that supporting different character sets is still useful in 2023. Everything is UTF-8 these days, and your code is fine if NFD/NFC is the only problem we have to solve. Yet another hook? I’d rather not...

I will have to run some experiments.

Thanks again for your feedback,
Stefan



On 25. Sep 2023, at 08:15, Grisha Levit <grishalevit@gmail.com> wrote:

Hi Stefan,

On Sun, Sep 24, 2023, 06:38 Stefan H. Holek <stefan@epy.co.at> wrote:
Hi All,

There appears to be an issue with a recent addition to rl_filename_completion_function. It now applies rl_filename_rewrite_hook to the filename part of "what the user has typed". This seems wrong. Let me explain.

This was my patch (submitted to bug-bash, the original thread is at [1]) so I'll defend the motivation for it -- though I think you're right that the implementation was too narrowly focused on addressing the issue described there and can violate assumptions in existing code.

The rl_filename_rewrite_hook exists to convert data read from the filesystem to a representation that works in the terminal. E.g. on macOS the filesystem returns decomposed UTF-8, which must be converted to fully composed UTF-8 before comparing it to a string the user has typed.
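To make the conversion described above concrete, here is a minimal Python sketch. It is not Readline code: `rewrite_filename` is a hypothetical stand-in for what an rl_filename_rewrite_hook might do, and the sample filename is made up for illustration.

```python
import unicodedata


def rewrite_filename(fs_name: str) -> str:
    """Hypothetical stand-in for a rewrite hook: filesystem form -> terminal form (NFC)."""
    return unicodedata.normalize("NFC", fs_name)


def matches(typed: str, fs_name: str) -> bool:
    """Compare the typed prefix against the rewritten directory entry."""
    return rewrite_filename(fs_name).startswith(typed)


# 'e' followed by COMBINING ACUTE ACCENT, i.e. a decomposed directory entry
decomposed = "cafe\u0301.txt"
# precomposed 'é', as text typed at the keyboard usually arrives
typed = "caf\u00e9"

print(decomposed.startswith(typed))  # False: raw comparison fails
print(matches(typed, decomposed))    # True: entry is composed before comparing
```

The comparison only succeeds because the directory entry is composed first; the typed text is assumed to be NFC already.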

Side note: APFS preserves normalization -- so we get both composed and decomposed entries to compare against. But that doesn't really affect this feature.

For background, with either filesystem, macOS filenames are not the usual opaque byte strings that they are on other platforms but rather normalization-insensitive UTF-8 text, i.e.:
* it's not possible to have two distinct directory entries that normalize equally
* a file can be accessed using any name that normalizes the same as the filename

Now, the section below (in complete.c) appears to apply rl_filename_rewrite_hook to a string in TERMINAL representation ('filename' is the rightmost part of the path the user has typed):

While text literally typed in will likely be NFC, any filenames pasted into the terminal (or placed there by glob-complete-word, etc) will retain the normalization stored on the filesystem -- which is usually _not_ NFC. See examples in the thread at [2].

I struggle to find this useful and in fact think it's dangerous and should be backed out.

So without normalizing the input text, it's not possible to reuse filenames read from the filesystem (`ls` output, etc.) as input to readline completion code. Or rather, it would be possible, but Readline normalizes the directory entries so it only makes sense to normalize the text to match against them as well.
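The pasted-input case can be sketched in Python as follows. This is only an illustration of the argument, assuming the directory entries are normalized to NFC before matching; `complete` and its `normalize_input` flag are hypothetical and do not correspond to any Readline function.

```python
import unicodedata


def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


def complete(text: str, entries: list, normalize_input: bool) -> list:
    """Match `text` against NFC-normalized directory entries."""
    needle = nfc(text) if normalize_input else text
    return [nfc(e) for e in entries if nfc(e).startswith(needle)]


# A decomposed (NFD) name as it might be stored on disk
entries = ["re\u0301sume\u0301.pdf"]
# Text pasted from `ls` output keeps the decomposed form
pasted = "re\u0301s"

print(complete(pasted, entries, normalize_input=False))  # no match
print(complete(pasted, entries, normalize_input=True))   # one match, in NFC
```

Without normalizing the input, the decomposed prefix never matches the composed entries, which is the situation the patch was meant to address.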

If I have an rl_filename_rewrite_hook that works in Readline 8.2, it may just fail in 8.3 because it is applied to a string that is not in the expected filesystem representation!

Readline has so far worked fine in scenarios where the terminal encoding differs from the filesystem encoding. I can use rl_directory_rewrite_hook and rl_filename_stat_hook to go from terminal -> filesystem encoding, and rl_filename_rewrite_hook to go from filesystem -> terminal encoding. It is my understanding that these hooks have been added to support this use-case in the first place.
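As a rough Python illustration of the two directions, assuming a Latin-1 terminal and a UTF-8 filesystem as in the scenario above: the function names below are hypothetical and only model the conversions that the hooks would perform, not the hooks themselves.

```python
def terminal_to_fs(data: bytes) -> bytes:
    """Terminal -> filesystem, the direction attributed to
    rl_directory_rewrite_hook and rl_filename_stat_hook."""
    return data.decode("latin-1").encode("utf-8")


def fs_to_terminal(data: bytes) -> bytes:
    """Filesystem -> terminal, the direction attributed to
    rl_filename_rewrite_hook. Unencodable characters become '?'."""
    return data.decode("utf-8").encode("latin-1", errors="replace")


typed = b"caf\xe9"                           # Latin-1 bytes from the terminal
on_disk = "caf\u00e9.txt".encode("utf-8")    # UTF-8 bytes from the filesystem

print(terminal_to_fs(typed))                      # b'caf\xc3\xa9'
print(fs_to_terminal(on_disk))                    # b'caf\xe9.txt'
print(fs_to_terminal(on_disk).startswith(typed))  # True
```

The last line is the key assumption in existing hooks: the typed text is compared in terminal encoding, so a rewrite hook that expects filesystem-encoded input would mis-handle it.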

Is this an existing application or a hypothetical one? I'm not sure how this can work as described -- rl_directory_rewrite_hook only modifies the directory portion of the text, not the part after the final slash, and rl_filename_stat_hook is applied only after completion matches have already been generated.

What was missing was a way to modify the filename portion of the text before generating completion matches. (Well, rl_filename_dequoting_function does that, but that only gets called if the name is quoted).

Maybe a better approach is a separate variable (e.g. rl_filename_completion_hook) to serve this purpose since an application may want to perform different transformations on generated filenames vs input text.
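The idea of separate transformations for input text and generated filenames might look roughly like this in Python. Everything here is a sketch under assumptions: `complete_filename`, `input_hook`, `entry_hook` and `list_dir` are hypothetical names, with `input_hook` modelling the proposed (not existing, as far as this thread states) rl_filename_completion_hook and `entry_hook` modelling rl_filename_rewrite_hook.

```python
import posixpath
import unicodedata
from typing import Callable, Iterable, List


def complete_filename(
    text: str,
    list_dir: Callable[[str], Iterable[str]],
    input_hook: Callable[[str], str],
    entry_hook: Callable[[str], str],
) -> List[str]:
    """Split the typed text, transform each side with its own hook, then match."""
    dirname, partial = posixpath.split(text)   # directory part vs. part after the final slash
    needle = input_hook(partial)               # applied to the user's input only
    results = []
    for entry in list_dir(dirname):
        shown = entry_hook(entry)              # applied to names read from the filesystem only
        if shown.startswith(needle):
            results.append(posixpath.join(dirname, shown))
    return results


def nfc(s: str) -> str:
    return unicodedata.normalize("NFC", s)


def fake_list_dir(dirname: str) -> List[str]:
    # Stand-in for reading a directory; returns one decomposed name.
    return ["re\u0301sume\u0301.pdf", "notes.txt"]


pasted = "docs/re\u0301s"
print(complete_filename(pasted, fake_list_dir, nfc, nfc))
print(complete_filename(pasted, fake_list_dir, lambda s: s, nfc))
```

Keeping the two callables separate means an application whose rewrite hook expects filesystem-encoded input is never handed the user's text, while the filename portion can still be transformed before matches are generated.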


-- 
Stefan H. Holek

