Re: [Bug-wget] bad filenames (again)

From: Eli Zaretskii
Subject: Re: [Bug-wget] bad filenames (again)
Date: Tue, 18 Aug 2015 22:31:31 +0300

> Date: Tue, 18 Aug 2015 21:11:25 +0200
> From: "Andries E. Brouwer" <address@hidden>
> Cc: "Andries E. Brouwer" <address@hidden>, address@hidden,
>         address@hidden
> On Tue, Aug 18, 2015 at 09:15:40PM +0300, Eli Zaretskii wrote:
> > > Otherwise? Leave it as it is?
> >
> > No, encode it as %XX hex escapes, thus making the file name pure
> > ASCII.  And have an option to leave it "as is", so people who want
> > that could have that.
> OK, I can live with that.

Great, I'm glad we've found an agreeable compromise.
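
For illustration, here is a minimal sketch of the %XX escaping agreed
on above.  The function name 'escape_non_ascii' is hypothetical and is
not wget's actual code.  The sketch also assumes that '%' itself and
ASCII control characters should be escaped, so that the result stays
unambiguous; the discussion above does not settle that.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not part of wget): return a malloc'ed copy of
   NAME in which every byte outside printable ASCII is replaced by a
   %XX hex escape.  '%' is escaped too (an assumption of this sketch),
   so the escaped name can be decoded unambiguously.
   Returns NULL if memory cannot be allocated.  */
char *escape_non_ascii(const char *name)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(name);
    /* Worst case: every byte becomes three characters.  */
    char *out = malloc(3 * len + 1);
    char *p = out;

    if (out == NULL)
        return NULL;

    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char)name[i];
        if (c < 0x20 || c > 0x7E || c == '%') {
            *p++ = '%';
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0F];
        } else {
            *p++ = (char)c;
        }
    }
    *p = '\0';
    return out;
}
```

The caller frees the result.  An option to leave names "as is" would
simply bypass this function.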

> So, I see that you want to use iconv to convert UTF-8 to the current
> codepage, so that Windows can convert that to UTF-16 again.


> As stated several times already I have zero experience on Windows,
> but is it possible to let wget change its current codepage to Unicode
> so that the Windows conversion is close to the identity map?

No, it's not possible.  Windows does have a UTF-8 codepage, but it
doesn't allow setting that as the system codepage.

What is needed for full Unicode support in wget on Windows is to
provide replacements for all the file-name related libc functions
('fopen', 'open', 'stat', 'access', etc.) that accept file names
encoded in UTF-8, convert them internally to UTF-16, and call the
wchar_t equivalents of those functions ('_wfopen', '_wopen', '_wstat',
'_waccess', etc.) with the converted file name.  Another thing that is
needed is similar replacements for 'printf', 'puts', 'fprintf',
etc. when they are used for writing file names to the console --
because we cannot write UTF-8 sequences to the Windows console.  Doing
this is not rocket science (I did something similar for Emacs last
year), but it is more work than the single call to iconv that's needed
on Unix.
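
To make the idea concrete, here is a rough sketch of one such
replacement, for 'fopen' only.  The names 'fopen_utf8' and
'utf8_to_utf16' are hypothetical; this is neither wget's nor Emacs's
actual code.  It assumes the Win32 'MultiByteToWideChar' function for
the UTF-8 to UTF-16 conversion, and on other systems it simply falls
through to plain 'fopen'.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#include <wchar.h>

/* Hypothetical helper: convert the NUL-terminated UTF-8 string S to a
   malloc'ed UTF-16 string.  Returns NULL on invalid input or when
   memory cannot be allocated.  */
static wchar_t *utf8_to_utf16(const char *s)
{
    /* First call: ask how many wchar_t units are needed, including
       the terminating NUL (because the input length is -1).  */
    int n = MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                                s, -1, NULL, 0);
    wchar_t *w;

    if (n <= 0)
        return NULL;
    w = malloc((size_t)n * sizeof *w);
    if (w == NULL)
        return NULL;
    /* Second call: do the actual conversion.  */
    if (MultiByteToWideChar(CP_UTF8, MB_ERR_INVALID_CHARS,
                            s, -1, w, n) <= 0) {
        free(w);
        return NULL;
    }
    return w;
}
#endif

/* Hypothetical 'fopen' replacement that takes a UTF-8 file name.  */
FILE *fopen_utf8(const char *name, const char *mode)
{
#ifdef _WIN32
    wchar_t *wname = utf8_to_utf16(name);
    wchar_t *wmode = utf8_to_utf16(mode);
    FILE *fp = NULL;

    if (wname != NULL && wmode != NULL)
        fp = _wfopen(wname, wmode);
    else
        errno = EINVAL;   /* conversion failed */
    free(wname);
    free(wmode);
    return fp;
#else
    /* Elsewhere the file name is passed to libc unchanged.  */
    return fopen(name, mode);
#endif
}
```

Replacements for 'open', 'stat', 'access', etc. would follow the same
pattern with '_wopen', '_wstat', '_waccess'.  Console output is not
covered by this sketch.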
