Re: [Bug-wget] Problem with ÅÄÖ and wget

From: Ángel González
Subject: Re: [Bug-wget] Problem with ÅÄÖ and wget
Date: Sun, 15 Sep 2013 02:54:07 +0200
User-agent: Thunderbird

On 15/09/13 00:59, Bykov Aleksey wrote:

Many thanks for pushing me in the right direction.

With the attached patch, Wget on Windows can work with UTF-8 names, but still only with "--restrict-file-names=nocontrol"...
I think there are two issues:
- Make wget recognise utf-8 urls and accept them without nocontrol when the filesystem encoding is utf-8.
- Correctly store the filenames in Windows.

I would have started with the first one, and then treated Windows as a UTF-8-enabled filesystem, which is what this patch does. Also, isn't there a library that already does this?
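To make the first issue concrete: the wget manual notes that some byte values used in UTF-8 sequences fall in the range wget treats as "controls", which is why nocontrol is needed today. For example, Å, Ä and Ö encode as C3 85, C3 84 and C3 96, and the second byte of each lies in 0x80-0x9F. A rough sketch of how a name could be recognised as well-formed UTF-8 before deciding whether to escape those bytes follows. The helper is_valid_utf8() is hypothetical and not part of wget.

```c
/* Hypothetical helper, not part of wget: returns 1 if the NUL-terminated
   string s is well-formed UTF-8, 0 otherwise.  */
int
is_valid_utf8 (const char *s)
{
  const unsigned char *p = (const unsigned char *) s;

  while (*p)
    {
      int extra;          /* continuation bytes expected after the lead */
      unsigned long cp;   /* code point decoded so far */
      unsigned long min;  /* smallest code point legal for this length */

      if (*p < 0x80)
        {
          p++;            /* plain ASCII */
          continue;
        }
      else if ((*p & 0xE0) == 0xC0)
        { extra = 1; cp = *p & 0x1F; min = 0x80; }
      else if ((*p & 0xF0) == 0xE0)
        { extra = 2; cp = *p & 0x0F; min = 0x800; }
      else if ((*p & 0xF8) == 0xF0)
        { extra = 3; cp = *p & 0x07; min = 0x10000; }
      else
        return 0;         /* stray continuation byte or invalid lead byte */

      p++;
      while (extra-- > 0)
        {
          if ((*p & 0xC0) != 0x80)
            return 0;     /* truncated sequence (this also catches NUL) */
          cp = (cp << 6) | (*p & 0x3F);
          p++;
        }

      if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        return 0;         /* overlong form, out of range, or surrogate */
    }
  return 1;
}
```

With this, "\xC3\x85\xC3\x84\xC3\x96" (ÅÄÖ in UTF-8) is accepted, while the Latin-1 bytes "\xC5\xC4\xD6" are rejected, so only the former would be a candidate for skipping the control-character escaping.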

diff --git a/src/utils.c b/src/utils.c
index 2ec9601..6307c88 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -2544,3 +2544,42 @@ test_dir_matches_p()

  #endif /* TESTING */

+#ifdef WINDOWS
+/* For UTF-8 in Windows support. Replacement standart fopen() utime() stat() lstat() mkdir() with wide character
+analogs route. w_fopen() declared in utils.h, w_utime(), w_stat() and w_mkdir - in utils.c */

This code should be in mswindows.c.
What makes w_fopen() different, so that it lives in utils.h instead of the .c file?

Commenting on just one function, as they all follow the same template:

+w_stat (const char *filename, struct_stat *buffer )
+  wchar_t *w_filename;
+  int buffer_size = 1024; /* I cant push it to work with strlen() */
What happens if the filename has more than 1024 characters?
+  w_filename = malloc (buffer_size);
+  MultiByteToWideChar(65001, 0, filename, -1, w_filename, buffer_size);
Using CP_UTF8 instead of 65001 would be preferable IMHO.

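Big bug. The sixth argument is the space available for w_filename *in characters*, not bytes. malloc (buffer_size) reserves 1024 bytes, which on Windows is only 512 wchar_t, yet the call claims there is room for 1024. I would multiply buffer_size by sizeof(wchar_t) in the malloc (although you could instead divide here, too).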
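One way to avoid both the fixed 1024 limit and the bytes/characters mix-up is to ask for the required length first and allocate that many wchar_t. The following is only a sketch: utf8_wide_count() and utf8_to_wide() are hypothetical names, and the non-Windows branch is a stand-in (it assumes well-formed UTF-8) so that the counting logic can be compiled elsewhere.

```c
#include <stdlib.h>
#include <wchar.h>

#ifdef _WIN32
# include <windows.h>
#endif

/* Hypothetical helper: number of wchar_t elements, including the
   terminating NUL, needed for the UTF-16 form of the UTF-8 string s.
   Returns 0 on failure.  */
size_t
utf8_wide_count (const char *s)
{
#ifdef _WIN32
  /* With a zero-sized output buffer the call only reports the size it
     needs, measured in characters.  */
  int n = MultiByteToWideChar (CP_UTF8, 0, s, -1, NULL, 0);
  return n > 0 ? (size_t) n : 0;
#else
  /* Portable stand-in; assumes s is well-formed UTF-8.  */
  size_t n = 1;                      /* terminating NUL */
  const unsigned char *p;

  for (p = (const unsigned char *) s; *p; p++)
    {
      if ((*p & 0xC0) == 0x80)
        continue;                    /* continuation byte, no new unit */
      n += (*p >= 0xF0) ? 2 : 1;     /* 4-byte sequences need a surrogate pair */
    }
  return n;
#endif
}

#ifdef _WIN32
/* Hypothetical helper: returns a malloc'ed wide copy of s, or NULL.
   The caller frees the result.  */
wchar_t *
utf8_to_wide (const char *s)
{
  size_t count = utf8_wide_count (s);
  wchar_t *w;

  if (count == 0)
    return NULL;
  w = malloc (count * sizeof (wchar_t));   /* bytes = characters * size */
  if (w && MultiByteToWideChar (CP_UTF8, 0, s, -1, w, (int) count) == 0)
    {
      free (w);
      w = NULL;
    }
  return w;
}
#endif
```

w_stat() could then call utf8_to_wide (filename), pass the result to _wstati64() and free it, with no hard-coded length.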

+  int res = _wstati64 (w_filename, buffer);
It would be better to declare res at the beginning of the function.

+  free (w_filename);
+  return res;
Why bother allocating memory, if you are using a fixed size? Another option would be to use alloca().

I guess rename() would also need a wrapper.
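A rename() wrapper following the same template might look roughly like the sketch below. The name w_rename() is hypothetical, chosen to match the patch's naming. It uses fixed-size stack buffers of MAX_PATH characters instead of malloc, passes the size in characters, and checks the conversion result. The non-Windows branch just forwards to rename() so the sketch compiles elsewhere.

```c
#include <stdio.h>

#ifdef _WIN32
# include <windows.h>
# include <wchar.h>
#endif

/* Hypothetical wrapper in the style of the patch's w_stat().
   Returns 0 on success, nonzero on failure.  */
int
w_rename (const char *oldname, const char *newname)
{
#ifdef _WIN32
  wchar_t w_old[MAX_PATH];
  wchar_t w_new[MAX_PATH];

  /* The last argument is the buffer size in characters, not bytes.
     A return value of 0 means the conversion failed, for example
     because the name does not fit.  errno is not set in that case.  */
  if (!MultiByteToWideChar (CP_UTF8, 0, oldname, -1, w_old, MAX_PATH)
      || !MultiByteToWideChar (CP_UTF8, 0, newname, -1, w_new, MAX_PATH))
    return -1;

  return _wrename (w_old, w_new);
#else
  return rename (oldname, newname);
#endif
}
```

Names longer than MAX_PATH are rejected here rather than truncated. Supporting them would need a dynamically sized buffer instead.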
