bug-wget
Re: [Bug-wget] non-ascii characters - url conversion


From: Micah Cowan
Subject: Re: [Bug-wget] non-ascii characters - url conversion
Date: Fri, 10 Apr 2009 10:51:59 -0700
User-agent: Thunderbird 2.0.0.21 (X11/20090318)

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Miroslav Oujeský wrote:
> Hello,
> 
> I am using wget to recursively download a whole website (creating an
> offline version) with the --convert-links option.
> 
> I have a URL-encoded link that contains non-ASCII characters. After the
> conversion, the link is no longer URL-encoded, and the resulting file
> name contains some "rubbish" characters (I suspect a UTF-8 =>
> ISO-8859-1 mix-up), so the converted link does not work.
> 
> The original URL is "/tags/V%C3%BDprodej/"
> The converted URL is "../tags/Výprodej/index.html" (in UTF-8)
> The name of the file (or directory, in this case) is shown as
> "V??prodej" by the ls command.
> 
> Is there any way to keep the URL and the corresponding file name
> %XX-encoded after conversion?
> 
> (Not sure if it is relevant, but I am using the wget 1.12 development
> version because of its CSS parsing.)

This is currently being tracked here: https://savannah.gnu.org/bugs/?21793
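
For illustration only, the encoding behaviour described in the report can be reproduced outside wget. The sketch below uses Python's standard urllib.parse module, not wget's own code, and the variable names are made up for the example. It assumes the server's path is UTF-8, which the %C3%BD sequence suggests.

```python
from urllib.parse import quote, unquote

# The link as it appears in the original page.
original = "/tags/V%C3%BDprodej/"

# Percent-decoding as UTF-8 turns the two bytes C3 BD into the single
# character "ý", which matches the converted link from the report.
decoded = unquote(original, encoding="utf-8")

# Reading the same two bytes as ISO-8859-1 yields two characters instead,
# the kind of mix-up the reporter suspects.
mojibake = unquote(original, encoding="iso-8859-1")

# Re-quoting the decoded path restores the %XX form the reporter asks for.
reencoded = quote(decoded, safe="/")

print(decoded)    # /tags/Výprodej/
print(mojibake)   # /tags/VÃ½prodej/
print(reencoded)  # /tags/V%C3%BDprodej/
```

The two bytes behind "ý" would also be consistent with ls printing two "?" characters in a non-UTF-8 locale, though that depends on the reporter's terminal settings.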

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAknfhz8ACgkQ7M8hyUobTrEdbACfcvjuNKV8IvJnItPI1aNQTRlR
0dUAnRZ3grv0bKlHP7Li4AyLVoV47VgX
=7X5w
-----END PGP SIGNATURE-----



