Re: [Bug-wget] incorrect urldecoding
From: Micah Cowan
Subject: Re: [Bug-wget] incorrect urldecoding
Date: Tue, 24 May 2011 13:00:45 -0700
User-agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.14) Gecko/20110223 Thunderbird/3.1.8

As you've discovered, the IRI support doesn't change anything about how
filenames are saved; it only translates between IRIs and URIs (and since
there are no IRIs involved here, it has no effect).
As a workaround until filename transcoding is supported in wget, you may
find that --restrict-file-names=nocontrol does what you need it to -
provided the encoding of the characters in the URL and the encoding for
your system match.
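
To illustrate why nocontrol helps here, the following is a minimal Python
sketch, not wget's actual code. It assumes that the default filename
restriction escapes bytes in the ranges 0x00-0x1F and 0x80-0x9F as "%XX";
the helper names (is_control, escape_controls) are made up for this example.
Several of the UTF-8 continuation bytes in the reported URL fall in the
second range, which would produce the mixed name seen in the log.

```python
from urllib.parse import unquote_to_bytes

# File name component of the URL from the report (percent-encoded UTF-8).
ENCODED = "%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg"


def is_control(byte):
    # Assumption: bytes treated as "control" are 0x00-0x1F and 0x80-0x9F.
    return byte < 0x20 or 0x80 <= byte <= 0x9F


def escape_controls(raw):
    # Re-escape every "control" byte as %XX and copy all other bytes as-is.
    out = bytearray()
    for b in raw:
        if is_control(b):
            out += "%{:02X}".format(b).encode("ascii")
        else:
            out.append(b)
    return bytes(out)


raw = unquote_to_bytes(ENCODED)           # percent-decoded bytes
intended = raw.decode("utf-8")            # the name the reporter expected
mangled = escape_controls(raw)            # bytes after control escaping
escaped_bytes = sorted({b for b in raw if is_control(b)})

print(intended)
print(mangled)
print([hex(b) for b in escaped_bytes])
```

Under that assumption, only the second byte of some two-byte sequences is
escaped, so the remaining lead bytes (0xD0, 0xD1) are left stranded and show
up as \320 and \321 in the log. With control escaping turned off, the decoded
bytes would be written unchanged, which is only useful if the system's
filename encoding is also UTF-8.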
-mjc
(05/24/2011 01:23 AM), kns wrote:
> Hello.
>
> We have:
>
> utf-8 urlencoded link:
> http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
>
> wget on cygwin:
> $ wget --version
> GNU Wget 1.12 built on cygwin.
>
> +digest +ipv6 +nls +ntlm +opie +md5/openssl +https -gnutls +openssl
> +iri
>
> ---------
>
> $ wget -o ./w.log --local-encoding=utf-8 --remote-encoding=utf-8
> http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
>
> $ cat w.log
> --2011-05-24 12:19:39--  http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> Resolving lurkmore.ru (lurkmore.ru)... 174.122.234.203
> Connecting to lurkmore.ru (lurkmore.ru)|174.122.234.203|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 39532 (39K) [image/jpeg]
> Saving to: `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg'
>
> 0K .......... .......... .......... ........ 100% 45.1K=0.9s
>
> 2011-05-24 12:19:41 (45.1 KB/s) - `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg' saved [39532/39532]
>
> --------
> Wget writes "Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg"
> (Э%82%8Cен_%94%8Eмон.jpeg) instead of "Этьен_Дюмон.jpeg"
>
>
> Debian version without iri support does the same.
--
Micah J. Cowan
http://micah.cowan.name/