Re: guile can't find a chinese named file


From: tomas
Subject: Re: guile can't find a chinese named file
Date: Wed, 15 Feb 2017 21:20:56 +0100
User-agent: Mutt/1.5.21 (2010-09-15)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, Feb 15, 2017 at 06:59:14PM +0200, Eli Zaretskii wrote:
> > Date: Wed, 15 Feb 2017 10:18:32 +0100
> > From: <address@hidden>

[...]

> > Most notably, the whole path might cross several mount points, thus
> > the whole path can well have fragments coming from several file systems.
> 
> A possible solution would be to decode each mount point's part as it
> is being resolved.

...which can only be based on guesswork: there's no reliable info on
the encoding used for that file system (if it's consistent at all).

What can we do? Try different encodings until one "works"? That amounts
to trying UTF-8 and then some Latin-x, which would fit for any x, so a
successful decode tells us nothing about which x was actually meant.
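
To make the guesswork concrete, here is a minimal sketch in Python (chosen only for illustration; it is not Guile code and not anything proposed in this thread). The function name `guess_decode` and the sample file name are hypothetical. It tries UTF-8 first and falls back to ISO-8859-1, which assigns a character to every byte value and so can never fail:

```python
def guess_decode(raw: bytes) -> tuple:
    """Try UTF-8 first, then fall back to Latin-1 (ISO-8859-1)."""
    try:
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps each of the 256 byte values to a code point,
        # so this branch succeeds for any input whatsoever.
        return raw.decode("latin-1"), "latin-1"


# The same (hypothetical) file name, written by two differently
# configured systems.
name_utf8 = "中文.txt".encode("utf-8")
name_gbk = "中文.txt".encode("gbk")

for raw in (name_utf8, name_gbk):
    text, codec = guess_decode(raw)
    # ascii() keeps the output printable whatever the terminal encoding is.
    print(codec, ascii(text))
```

The GBK-encoded name is not valid UTF-8, so it "successfully" decodes as Latin-1, but into mojibake rather than the Chinese name. The fallback fitting is no evidence that the guess was right.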

> > I think the only sane way to see a Linux file system path is the way
> > Linux sees it: as a byte string.
> 
> This would lose a lot in 99% of use cases.  You are, in effect,
> suggesting a "reverse optimization", whereby the majority of use cases
> is punished in favor of a small minority, based on theoretical
> intractability.

I feel queasy doing some voodoo without the application having
a word on it. In the Emacs context it's a bit easier, because in
the "normal" case things are pretty quickly deferred to the user
(usually).
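
One possible middle ground between "always bytes" and "always decoded", sketched in Python under stated assumptions: decode as UTF-8, but smuggle undecodable bytes through as lone surrogates (Python's `surrogateescape` error handler, which `os.fsdecode` and `os.fsencode` also use on POSIX systems), so the original byte string can always be recovered and the application can see which parts were not clean. The helper names and the sample path are hypothetical, and this is not a description of what Guile does.

```python
def path_to_text(raw: bytes) -> str:
    """Decode as UTF-8; invalid bytes become lone surrogates U+DC80..U+DCFF."""
    return raw.decode("utf-8", errors="surrogateescape")


def text_to_path(text: str) -> bytes:
    """Inverse of path_to_text: restores the exact original bytes."""
    return text.encode("utf-8", errors="surrogateescape")


def is_clean(text: str) -> bool:
    """True if no byte had to be escaped, i.e. the input was valid UTF-8."""
    return not any(0xDC80 <= ord(ch) <= 0xDCFF for ch in text)


# Hypothetical path crossing a mount point: the directory name is UTF-8,
# the file name below it was written with GBK.
raw_path = b"/mnt/" + "中文".encode("utf-8") + b"/" + "中文.txt".encode("gbk")

text_path = path_to_text(raw_path)
assert text_to_path(text_path) == raw_path  # lossless round trip

# Report per component whether it decoded cleanly, leaving the decision
# about what to do with the rest to the application.
for component in raw_path.split(b"/"):
    print(component, is_clean(path_to_text(component)))
```

The library does not have to guess here: clean components come out as ordinary text, and the others stay recoverable as bytes for the application (or the user) to deal with.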

> > Sure, some helper infrastructure to try to make characters of that
> > mess will be welcome, but that should be absolutely robust wrt.
> > unexpected input (e.g. bad UTF-8) and leave control to the application.
> 
> Most applications won't like this burden, because most application
> programmers don't know enough about the issues to solve them correctly,
> especially for users of other OSes and locales.
> 
> > > But if OpenBSD requires all _filenames_ to be in valid UTF-8, that
> > > is a bad decision in my view.
> > 
> > NT has done that too.
> 
> Windows can do that because it also transparently translates file
> names to the locale's encoding when files are accessed with ANSI APIs.
> Without such translation, this kind of decision is unwise, IMO.

I guess (I don't *know*) Windows stores information about the encoding
at file system level (and keeps that consistent). Linux doesn't have
that; it just stays out of it. It doesn't even have a place to state
the encoding used.

Thanks&regards
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlikuCgACgkQBcgs9XrR2kauCACfTpfRpHhL2iUJXET5zqokA6US
+pkAnjIc7Q+hBPj9Vi9Pk46AsmI3yA5m
=RXAn
-----END PGP SIGNATURE-----


