Re: [Bug-wget] non-ascii characters - url conversion
From: Micah Cowan
Subject: Re: [Bug-wget] non-ascii characters - url conversion
Date: Fri, 10 Apr 2009 10:51:59 -0700
User-agent: Thunderbird 2.0.0.21 (X11/20090318)
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Miroslav Oujeský wrote:
> Hello,
>
> I am using wget to recursively download a whole website (creating an
> offline version), with the --convert-links option.
>
> I have a URL-encoded link that contains non-ASCII characters. After the
> conversion, the link is no longer URL-encoded, and the resulting file
> name contains some "rubbish" characters (I suspect a UTF-8 =>
> ISO-8859-1 mix-up), so the converted link does not work.
>
> The original URL is "/tags/V%C3%BDprodej/"
> The converted URL is "../tags/Výprodej/index.html" (in UTF-8)
> The name of file (or directory in this case) is shown as "V??prodej"
> with ls command
>
> Is there any way to keep the URL and the corresponding file name
> %XX-encoded after conversion?
>
> (Not sure if it is relevant, but I am using the wget 1.12 development
> version, because of its CSS parsing.)
This is currently being tracked here: https://savannah.gnu.org/bugs/?21793
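
For readers following the report, here is a minimal Python sketch of the
encoding round trip described in the quoted message. It only illustrates
percent-encoding and the suspected UTF-8 versus ISO-8859-1 confusion; it
is not Wget's code, and the variable names are made up for this example.
Whether this particular misreading is what actually produces the broken
name is an assumption taken from the reporter's guess.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# The link exactly as it appears in the original page (from the report).
original = "/tags/V%C3%BDprodej/"

# Percent-decoding and interpreting the bytes as UTF-8 gives the readable
# path that ended up in the converted link.
decoded = unquote(original, encoding="utf-8")

# Percent-encoding the decoded path again restores the original link,
# which is the form the reporter would like to keep after conversion.
reencoded = quote(decoded, safe="/")

# The same bytes read as ISO-8859-1 turn the single character "ý" into
# two unrelated characters. This is the assumed source of the "rubbish".
raw = unquote_to_bytes(original)
mojibake = raw.decode("iso-8859-1")

print(decoded)
print(reencoded)
print(mojibake)
```

The "ý" occupies two bytes in UTF-8 (0xC3 0xBD), which is consistent with
ls printing two "?" placeholders when the terminal locale cannot display
those bytes.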
- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer.
Maintainer of GNU Wget and GNU Teseq
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAknfhz8ACgkQ7M8hyUobTrEdbACfcvjuNKV8IvJnItPI1aNQTRlR
0dUAnRZ3grv0bKlHP7Li4AyLVoV47VgX
=7X5w
-----END PGP SIGNATURE-----