Re: [Bug-wget] incorrect urldecoding


From: Micah Cowan
Subject: Re: [Bug-wget] incorrect urldecoding
Date: Tue, 24 May 2011 13:00:45 -0700

As you've discovered, IRI support doesn't change how filenames are
saved; it only translates between IRIs and URIs (and since there are
no IRIs involved here, it has no effect).

As a workaround until filename transcoding is supported in wget, you may
find that --restrict-file-names=nocontrol does what you need, provided
the encoding of the characters in the URL matches your system's
encoding.
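
To illustrate why the name gets mangled, here is a rough Python sketch.
It is not wget's code. It assumes, going by the manual's description of
--restrict-file-names, that the default mode escapes bytes in the range
128-159 as %XX after percent-decoding the URL (the 0-31 range and "/"
are left out for brevity). The helper name local_name and its
escape_control parameter are made up for this example.

```python
from urllib.parse import unquote_to_bytes

# Last path component of the URL from the report.
URL_NAME = "%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg"


def local_name(url_name: str, escape_control: bool = True) -> bytes:
    """Percent-decode url_name, then optionally re-escape bytes 0x80-0x9F.

    escape_control=True approximates the default behaviour (assumption);
    escape_control=False approximates --restrict-file-names=nocontrol.
    """
    raw = unquote_to_bytes(url_name)
    if not escape_control:
        return raw
    out = bytearray()
    for b in raw:
        if 0x80 <= b <= 0x9F:
            # Treated as a control character: written back as %XX.
            out += ("%%%02X" % b).encode("ascii")
        else:
            out.append(b)
    return bytes(out)


mangled = local_name(URL_NAME)
clean = local_name(URL_NAME, escape_control=False)

print(mangled)                 # second byte of some UTF-8 pairs is escaped
print(clean.decode("utf-8"))   # the name the reporter expected
```

Several Cyrillic letters have a second UTF-8 byte in that range (0x82,
0x8C, 0x94, 0x8E here), so they get split into a stray lead byte plus a
%XX escape, which matches the name shown in the log. With the escaping
turned off the bytes pass through untouched, and the result is readable
only if the local system also expects UTF-8.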

-mjc

(05/24/2011 01:23 AM), kns wrote:
> Hello.
> 
> We have:
> 
> utf-8 urlencoded link: 
> http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> 
> wget on cygwin:
> $ wget --version
> GNU Wget 1.12 built on cygwin.
> 
> +digest +ipv6 +nls +ntlm +opie +md5/openssl +https -gnutls +openssl
> +iri
> 
> ---------
> 
> $ wget -o ./w.log --local-encoding=utf-8 --remote-encoding=utf-8 http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> 
> $ cat w.log
> --2011-05-24 12:19:39--  http://lurkmore.ru/images/8/89/%D0%AD%D1%82%D1%8C%D0%B5%D0%BD_%D0%94%D1%8E%D0%BC%D0%BE%D0%BD.jpeg
> Resolving lurkmore.ru (lurkmore.ru)... 174.122.234.203
> Connecting to lurkmore.ru (lurkmore.ru)|174.122.234.203|:80... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 39532 (39K) [image/jpeg]
> Saving to: `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg'
> 
>      0K .......... .......... .......... ........             100% 45.1K=0.9s
> 
> 2011-05-24 12:19:41 (45.1 KB/s) - `Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg' saved [39532/39532]
> 
> --------
> Wget writes "Э\321%82\321%8Cен_\320%94\321%8Eмон.jpeg" 
> (Э%82%8Cен_%94%8Eмон.jpeg) instead of "Этьен_Дюмон.jpeg"
> 
> 
> Debian version without iri support does the same.


-- 
Micah J. Cowan
http://micah.cowan.name/


