From: Stefan Monnier
Subject: Re: Why does dired go through extra efforts to avoid unibyte names
Date: Tue, 02 Jan 2018 23:14:20 -0500
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.0.50 (gnu/linux)
>> I bumped into the following code in dired-get-filename:
>>
>> ;; The above `read' will return a unibyte string if FILE
>> ;; contains eight-bit-control/graphic characters.
>> (if (and enable-multibyte-characters
>>          (not (multibyte-string-p file)))
>>     (setq file (string-to-multibyte file)))
>>
>> and I'm wondering why we don't want a unibyte string here.
>> `vc-region-history` told me this comes from the commit appended below,
>> which seems to indicate that we're worried about a subsequent encoding,
>> but AFAIK unibyte file names are not (re)encoded, and passing them
>> through string-to-multibyte would actually make things worse in this
>> respect (since it might cause the kind of (re)encoding this is
>> supposedly trying to avoid).
>>
>> What am I missing?
>
> Why does it matter whether eight-bit-* characters are encoded one more
> or one less time?
That's part of the question, indeed.
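For a concrete sense of what is at stake, here is a quick probe one can
run by hand (just a sketch; the two sample bytes are arbitrary):

    ;; With utf-8, `eight-bit' characters encode back to the very bytes
    ;; they came from, so the real question is whether an extra
    ;; encode/decode round trip can ever change them.
    (encode-coding-string (string-to-multibyte "\303\251") 'utf-8)
    ;; expected: "\303\251", i.e. the original bytes, now unibyte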
> As for the reason for using string-to-multibyte: maybe it's because we
> use concat further down in the function, which will determine whether
> the result will be unibyte or multibyte according to its own ideas of
> what's TRT?
But `concat` will do a string-to-multibyte for us, if needed, so
that doesn't seem like a good reason.
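One way to check that claim by hand (a sketch, assuming a reasonably
recent Emacs; the raw byte is arbitrary):

    ;; If `concat' promotes unibyte arguments via string-to-multibyte,
    ;; the raw byte survives as an `eight-bit' character and the two
    ;; results compare equal.
    (let ((raw "\303"))                    ; unibyte string, one raw byte
      (equal (concat raw "é")
             (concat (string-to-multibyte raw) "é")))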
This said, when that code was written, maybe `concat` used
string-make-multibyte internally instead, so this call to
string-to-multibyte might have been added to avoid using
string-make-multibyte inside `concat`?
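For reference, the difference between the two conversions shows up
directly on a small example (again just a sketch, with made-up bytes):

    ;; A unibyte string holding two raw bytes, as `read' might return it.
    (let ((file "\303\251"))
      (list (multibyte-string-p file)    ; nil: the literal is unibyte
            ;; Keeps the bytes as `eight-bit' characters, so encoding the
            ;; name later yields the same bytes back.
            (string-to-multibyte file)
            ;; Promotes each byte via `unibyte-char-to-multibyte', which
            ;; is where a different kind of conversion could sneak in.
            (string-make-multibyte file)))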
It would be good to have a concrete case that needed the above code, to
see if the problem still exists.
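For what it's worth, one way to try to construct such a case (an
untested sketch; the byte sequence is chosen arbitrarily so that it does
not decode as valid utf-8):

    ;; Create a file whose name is a raw byte sequence that the usual
    ;; file-name coding system cannot decode, then look at what
    ;; dired-get-filename returns with point on its line in the Dired
    ;; buffer.
    (let ((dir (make-temp-file "dired-raw-" t)))
      (write-region "" nil (expand-file-name "\340\340" dir))
      (dired dir))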
Stefan