Re: [Bug-wget] Problem with ÅÄÖ and wget
From: Bykov Aleksey
Subject: Re: [Bug-wget] Problem with ÅÄÖ and wget
Date: Mon, 16 Sep 2013 00:36:03 +0300
User-agent: Opera Mail/12.14 (Win32)
Greetings
Thanks for the corrections.
Sorry for the unclean code and for the trouble.
- Make wget recognise utf-8 urls and accept them without nocontrol when
the filesystem encoding is utf-8.
Are you sure? A UTF-8 name can contain a colon (I remember seeing such
files), and at least on Windows the colon is still a restricted character.
I think it is possible to keep the current --restrict-file-names logic:
convert to wide characters (and vice versa), check only the characters
with a code lower than 256, and in a couple of places change the type from
"char" to "wchar_t". I need to check this. Sorry, it will take some time.
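The idea above (work on wide characters, but apply the restriction check only to codes below 256) could look roughly like the sketch below. It is not wget's code: `w_is_restricted` and `w_restrict_name` are hypothetical helpers, the restricted set is assumed to be the one documented for `--restrict-file-names=windows`, and replacing with `_` is a simplification (wget itself percent-escapes such characters).

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of wget.  Returns nonzero when C should
   not appear in a single Windows file name component.  */
int
w_is_restricted (wchar_t c)
{
  if (c >= 256)
    return 0;                   /* characters above Latin-1 are left alone */
  if (c < 32 || (c >= 128 && c <= 159))
    return 1;                   /* control characters */
  return wcschr (L"\\|/:?\"*<>", c) != NULL;
}

/* Replace every restricted character in NAME, in place, with L'_'.
   Returns the number of replacements made.  */
size_t
w_restrict_name (wchar_t *name)
{
  size_t replaced = 0;

  assert (name != NULL);
  for (; *name != L'\0'; name++)
    if (w_is_restricted (*name))
      {
        *name = L'_';
        replaced++;
      }
  return replaced;
}
```

With this sketch a colon or question mark is replaced, while a letter such as Å (code 197) passes through unchanged.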
What happens if the filename has more than 1024 characters?
The filename is simply truncated. The buffer size is now determined by
MultiByteToWideChar. I am not sure whether it still needs to be multiplied
by sizeof(wchar_t).
Big bug. The sixth argument is the space available for w_filename *in
characters*, not bytes.
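To make the two units concrete: the last argument of `MultiByteToWideChar` counts wide-character elements, while `malloc` counts bytes, and calling the function with an output size of 0 returns the number of elements required. The portable sketch below imitates only that sizing step so the arithmetic can be checked on any platform. `utf16_units_needed` and `alloc_wide_buffer` are hypothetical helpers, `uint16_t` stands in for the 16-bit Windows `wchar_t`, and the input is assumed to be well-formed UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the size query
   MultiByteToWideChar (CP_UTF8, 0, s, -1, NULL, 0):
   number of UTF-16 code units needed for S, terminator included.
   Assumes S is well-formed UTF-8.  */
size_t
utf16_units_needed (const char *s)
{
  size_t units = 1;                     /* the terminating zero element */

  assert (s != NULL);
  for (; *s != '\0'; s++)
    {
      unsigned char b = (unsigned char) *s;
      if ((b & 0xC0) == 0x80)
        continue;                       /* continuation byte, already counted */
      units += (b >= 0xF0) ? 2 : 1;     /* 4-byte sequence needs a surrogate pair */
    }
  return units;
}

/* Allocate room for the converted name.  The element count is what the
   conversion call expects; the byte count (elements times element size)
   is what malloc expects.  */
uint16_t *
alloc_wide_buffer (const char *s, size_t *n_elements)
{
  *n_elements = utf16_units_needed (s);
  return malloc (*n_elements * sizeof (uint16_t));
}
```

On Windows the element count would come from the size query itself and would be passed unchanged as the last argument of the second, converting call.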
Why bother allocating memory, if you are using a fixed size? Another
option would be to use alloca()
I guess rename() would also need a wrapper.
Thanks.
This code should be on mswindows.c
I simply forgot about mswindows.*. Yes, that is a much more suitable place.
What makes w_fopen() different, so that it is in utils.h instead of the .c?
Sorry, I don't know. I have too little experience to understand it.
Could you please take a look and tell me what I am doing wrong?
As I remember (or believe), NAME.h must contain the function declaration
and NAME.c the function body, and other files can access the function body
only if the declaration exists in a NAME.h that they include directly or
indirectly. But with that structure my code does not work.
Currently all the functions in utils.h (except w_fopen()) work in other
files without a declaration, and w_fopen() works only when its body is in
utils.h. The attachments contain diffs for the working and non-working
variants (sorry, they are based on utils.* because in mswindows.h it did
not work at all; moving it there should just be a matter of appending the
code to the end of the file).
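For reference, the usual layout described above is: a prototype in the header, exactly one non-`static` definition in a `.c` file, and an `#include` of the header in every file that calls the function. The single-file sketch below marks where each part would live. `w_name_units` and `caller_demo` are hypothetical names used only for illustration, and the sketch makes no claim about what is wrong in the attached non-working diff.

```c
#include <assert.h>
#include <stddef.h>

/* ---- would live in mswindows.h ---- */
#ifndef MSWINDOWS_DEMO_H
#define MSWINDOWS_DEMO_H
/* Declaration only: tells callers the name, arguments and return type.  */
size_t w_name_units (const char *name);
#endif

/* ---- would live in mswindows.c, after #include "mswindows.h" ---- */
/* Definition: must not be `static', or other files cannot link to it.  */
size_t
w_name_units (const char *name)
{
  size_t n = 0;

  assert (name != NULL);
  while (name[n] != '\0')
    n++;
  return n + 1;                 /* bytes including the terminating NUL */
}

/* ---- would live in a caller such as utils.c, after the same #include ---- */
size_t
caller_demo (void)
{
  return w_name_units ("index.html");
}
```

One point that may be relevant here: when a call compiles without a visible prototype, a C89 compiler assumes the function returns `int`. That can appear to work for functions that really do return `int`, but it is unsafe for a function returning a pointer such as `FILE *`.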
--
Best regards, Alex
On Sun, 15 Sep 2013 03:54:07 +0300, Ángel González <address@hidden>
wrote:
On 15/09/13 00:59, Bykov Aleksey wrote:
Greetings
Many thanks for pushing me in the right direction.
With the attached patch, Wget on Windows can work with UTF-8 names, but
again only with "--restrict-file-names=nocontrol"...
I think there are two issues:
- Make wget recognise utf-8 urls and accept them without nocontrol when
the filesystem encoding is utf-8.
- Correctly store the filenames in Windows.
I would have started with the first one and then treated Windows as a
UTF-8-enabled filesystem, which is what this patch does. Also, isn't there
already a library that does this?
diff --git a/src/utils.c b/src/utils.c
index 2ec9601..6307c88 100644
--- a/src/utils.c
+++ b/src/utils.c
@@ -2544,3 +2544,42 @@ test_dir_matches_p()
#endif /* TESTING */
+#ifdef WINDOWS
+/* For UTF-8 support on Windows. Replaces the standard fopen(), utime(),
+   stat(), lstat() and mkdir() with their wide-character analogs.
+   w_fopen() is declared in utils.h; w_utime(), w_stat() and w_mkdir()
+   are in utils.c. */
This code should be on mswindows.c
What makes w_fopen() different, so that it is in utils.h instead of the .c?
Commenting on just one function, as they all follow the same template:
+int
+w_stat (const char *filename, struct_stat *buffer )
+{
+ wchar_t *w_filename;
+ int buffer_size = 1024; /* I cant push it to work with strlen() */
What happens if the filename has more than 1024 characters?
+ w_filename = malloc (buffer_size);
+ MultiByteToWideChar(65001, 0, filename, -1, w_filename, buffer_size);
Using CP_UTF8 instead of 65001 would be preferable IMHO.
Big bug. The sixth argument is the space available for w_filename *in
characters*, not bytes.
I would multiply buffer_size by sizeof(wchar_t) in the malloc (although
you could instead divide here, too).
+ int res = _wstati64 (w_filename, buffer);
It would be better to declare res at the beginning of the function.
+ free (w_filename);
+ return res;
+}
Why bother allocating memory, if you are using a fixed size? Another
option would be to use alloca()
I guess rename() would also need a wrapper.
non_work.diff
Description: Binary data
non_work_err.txt
Description: Text document
work.diff
Description: Binary data