Re: [Bug-wget] bad filenames (again)
From: Eli Zaretskii
Subject: Re: [Bug-wget] bad filenames (again)
Date: Wed, 19 Aug 2015 17:12:08 +0300

> Date: Tue, 18 Aug 2015 22:28:21 +0200
> From: "Andries E. Brouwer" <address@hidden>
> Cc: "Andries E. Brouwer" <address@hidden>, address@hidden,
> address@hidden
>
> > What is needed to have full Unicode support in wget on Windows is to
> > provide replacements for all the file-name related libc functions
> > ('fopen', 'open', 'stat', 'access', etc.) which will accept file names
> > encoded in UTF-8, convert them internally into UTF-16, and call the
> > wchar_t equivalents of those functions ('_wfopen', '_wopen', '_wstat',
> > '_waccess', etc.) with the converted file name. Another thing that is
> > needed is similar replacements for 'printf', 'puts', 'fprintf',
> > etc. when they are used for writing file names to the console --
> > because we cannot write UTF-8 sequences to the Windows console.
>
> Aha. That reminds me of a patch by I think Aleksey Bykov.
> Yes - see http://lists.gnu.org/archive/html/bug-wget/2014-04/msg00080.html
>
> There we had a similar discussion, and he wrote mswindows.diff with
>
> +int
> +wc_utime (unsigned char *filename, struct _utimbuf *times)
> +{
> + wchar_t *w_filename;
> + int buffer_size;
> +
> + buffer_size = sizeof (wchar_t) * MultiByteToWideChar(65001, 0, filename,
> +                                                      -1, w_filename, 0);
> + w_filename = alloca (buffer_size);
> + MultiByteToWideChar(65001, 0, filename, -1, w_filename, buffer_size);
> + return _wutime (w_filename, times);
> +}
>
> and similar for stat, open, etc. Something similar is what would be needed on
> Windows?

Yes, thanks for pointing out those patches. Any reason they weren't
accepted back then?

> Is his patch usable?

It needs some minor polishing, but in general it should do the job,
yes.
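
To illustrate the kind of polishing that might be meant, here is a minimal
sketch of such a wrapper. It is not the code from mswindows.diff: the names
utf8_to_utf16 and utf8_fopen are hypothetical, and the choices shown
(malloc instead of alloca, a size counted in wchar_t units rather than
bytes, and a check for conversion failure) are assumptions about what a
cleaned-up version could look like. On non-Windows systems the wrapper
simply falls through to fopen.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>

/* Hypothetical helper: convert a NUL-terminated UTF-8 string into a
   malloc'ed UTF-16 string.  Returns NULL if the input is not valid
   UTF-8 or if memory cannot be allocated.  */
static wchar_t *
utf8_to_utf16 (const char *s)
{
  /* With a length of -1 the returned count includes the terminating NUL.  */
  int n = MultiByteToWideChar (CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, NULL, 0);
  wchar_t *w;

  if (n <= 0)
    return NULL;
  w = malloc (n * sizeof (wchar_t));
  if (w != NULL
      && MultiByteToWideChar (CP_UTF8, MB_ERR_INVALID_CHARS, s, -1, w, n) <= 0)
    {
      free (w);
      w = NULL;
    }
  return w;
}
#endif

/* Hypothetical replacement for fopen that accepts a UTF-8 file name.  */
FILE *
utf8_fopen (const char *name, const char *mode)
{
#ifdef _WIN32
  wchar_t *wname = utf8_to_utf16 (name);
  wchar_t *wmode = utf8_to_utf16 (mode);
  FILE *fp = NULL;

  if (wname != NULL && wmode != NULL)
    fp = _wfopen (wname, wmode);
  free (wname);
  free (wmode);
  return fp;
#else
  /* Elsewhere the file name is passed to the C library unchanged.  */
  return fopen (name, mode);
#endif
}
```

Wrappers for stat, open, access and utime could presumably follow the same
pattern, calling _wstat, _wopen, _waccess and _wutime with the converted
name.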

I admit that I don't understand the need for the url.c patch. Why do
we need to convert to wchar_t when the locale's codeset is already
UTF-8? (I could understand that for non-UTF-8 locales, but the patch
explicitly limits the conversion to wchar_t and back to UTF-8 locales,
where the normal string functions should do the job.) Is this only
for converting to upper/lower-case?

There's still the part with writing UTF-8 encoded file/URL names to
the Windows console; that will have to be added.
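
As a rough idea of what the console part could involve, the sketch below
converts a UTF-8 string to UTF-16 and hands it to WriteConsoleW when
standard output is a real console, and otherwise writes the bytes
unchanged. The function name utf8_puts_console is hypothetical, and using
GetConsoleMode to detect a console is an assumption of this sketch, not
something taken from the patch under discussion.

```c
#include <stdio.h>
#include <stdlib.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: write the UTF-8 string S to standard output.
   Returns 0 on success and -1 on failure.  */
int
utf8_puts_console (const char *s)
{
#ifdef _WIN32
  HANDLE h = GetStdHandle (STD_OUTPUT_HANDLE);
  DWORD mode;

  /* GetConsoleMode succeeds only for console handles, so redirected
     output (a file or a pipe) takes the plain fputs path below.  */
  if (h != NULL && h != INVALID_HANDLE_VALUE && GetConsoleMode (h, &mode))
    {
      int n = MultiByteToWideChar (CP_UTF8, 0, s, -1, NULL, 0);
      wchar_t *w;
      DWORD written;
      BOOL ok;

      if (n <= 0)
        return -1;
      w = malloc (n * sizeof (wchar_t));
      if (w == NULL)
        return -1;
      if (MultiByteToWideChar (CP_UTF8, 0, s, -1, w, n) <= 0)
        {
          free (w);
          return -1;
        }
      /* Flush pending stdio output so the text appears in order.  */
      fflush (stdout);
      /* N counts the terminating NUL, which should not be written.  */
      ok = WriteConsoleW (h, w, (DWORD) (n - 1), &written, NULL);
      free (w);
      return ok ? 0 : -1;
    }
#endif
  return fputs (s, stdout) == EOF ? -1 : 0;
}
```

Replacements for printf and fprintf would need to format into a buffer
first and then pass the result through a function like this one.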