
Re: [Bug-wget] wget converts escape sequences into non-ASCII characters


From: Micah Cowan
Subject: Re: [Bug-wget] wget converts escape sequences into non-ASCII characters in filenames
Date: Sat, 01 Nov 2008 09:27:43 -0700
User-agent: Thunderbird 2.0.0.17 (X11/20080925)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Vincent Lefevre wrote:
> GNU Wget 1.11.4 converts escape sequences such as %E9 into non-ASCII
> characters in filenames. Under GNU/Linux, this can make filenames
> unreadable (because there's no standard for non-ASCII characters in
> filenames). Worse, when the filesystem doesn't support such non-ASCII
> data (e.g. HFS+ under Mac OS X, which expects UTF-8 only), wget fails.

Yeah, we're aware of this problem (which has pretty much always been
present); we plan to address it for the Wget 1.12 release.
https://savannah.gnu.org/bugs/?21793
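
To make the failure concrete, here is a minimal Python sketch (not Wget's actual C code) of what happens when the escape %E9 is decoded literally. The segment `R%E9glementation` is an assumed reconstruction of the failing path, and `is_valid_utf8` is a hypothetical helper introduced only for illustration.

```python
from urllib.parse import unquote_to_bytes

# Assumed example segment: "%E9" is the Latin-1 code for "é".
segment = "R%E9glementation"

# Percent-decoding yields the raw byte 0xE9, with no character encoding attached.
raw = unquote_to_bytes(segment)


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: report whether data is well-formed UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


print(raw)                    # the decoded bytes, including the lone 0xE9
print(is_valid_utf8(raw))     # a lone 0xE9 followed by ASCII is not valid UTF-8
print(raw.decode("latin-1"))  # the same bytes are meaningful as Latin-1
```

A filesystem that insists on UTF-8 names, such as HFS+ in the report above, has no valid interpretation for the lone 0xE9 byte, which is consistent with the write error quoted below.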

> Cannot write to `www.bruit.fr/FR/print/Réglementation/01030100' (No such file 
> or directory).

I didn't realize that it had more drastic effects than producing
unreadable/untypeable filenames; there is a patch you might be able to
use attached to the bug report above. It's not what we're going to use
for Wget 1.12, as it's not sufficiently flexible (always assumes UTF-8);
but it sounds like it might do for you.
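
As an illustration of why a fixed UTF-8 assumption is less flexible than configurable encodings, the following Python sketch converts a percent-encoded path segment from an assumed remote encoding to an assumed local filename encoding. This is neither the patch attached to the bug report nor what Wget 1.12 does; `local_name` and its parameters are hypothetical, and the input is assumed to be an ASCII, percent-encoded segment.

```python
from urllib.parse import unquote_to_bytes


def local_name(segment: str,
               remote_encoding: str = "latin-1",
               local_encoding: str = "utf-8") -> bytes:
    """Hypothetical helper: map a percent-encoded segment to filename bytes.

    The escapes are decoded to raw bytes, interpreted in remote_encoding,
    and re-encoded in local_encoding. If either step fails, the original
    escapes are kept so the name stays plain ASCII.
    """
    raw = unquote_to_bytes(segment)
    try:
        return raw.decode(remote_encoding).encode(local_encoding)
    except (UnicodeDecodeError, UnicodeEncodeError):
        # Fallback (an assumption of this sketch): leave the escapes untouched.
        return segment.encode("ascii", "backslashreplace")


# Latin-1 source, UTF-8 filesystem: 0xE9 becomes the two-byte UTF-8 form of "é".
print(local_name("R%E9glementation"))

# Wrongly assuming the remote side is UTF-8: decoding fails, escapes are kept.
print(local_name("R%E9glementation", remote_encoding="utf-8"))
```

With both encodings exposed as parameters, the same routine covers a Latin-1 site saved to a UTF-8 filesystem as well as the case where both sides already agree.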

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
GNU Maintainer: wget, screen, teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFJDIN/7M8hyUobTrERAs2MAJ9f5YKcORLLayR8NUcHEhSeNQMUhACghp4x
+/HsJ8iGdM0+hZNnQywT1rc=
=Hgdm
-----END PGP SIGNATURE-----



