Re: [Bug-wget] wget fails to encode spaces in URLs

From: Giuseppe Scrivano
Subject: Re: [Bug-wget] wget fails to encode spaces in URLs
Date: Sun, 05 Jun 2011 14:28:31 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.0.50 (gnu/linux)

Hi Volker,

thanks for reporting this bug; it was already fixed in the development
version of wget, and the fix will be included in the next release.

Can you please confirm if it works for you?

You can fetch a source tarball here:


Volker Kuhlmann <address@hidden> writes:

>  > wget --version
> GNU Wget 1.12 built on linux-gnu.
> To reproduce:
> Go to any SourceForge project and download a file whose URL contains a
> space. Copy the "direct link" from the download page into wget -i-.
> Run wireshark and press ^D in the wget input stream.
> If the upstream strips spaces (e.g. squid, default setting in pfsense)
> the download goes round in circles.
> The bug does not exist in wget when passing the URL on the command line.
> I always use -i- because of all the shell crud in URLs.
> I am using the openSUSE 11.4 version, but the only source code change is
> additional support for libproxy.
> Problem:
> Looking at the source: in main.c, url_parse() is called for each URL
> given on the command line. For -i, it calls retrieve_from_file() instead.
> retrieve_from_file() (in retr.c) reads a list of URLs from the given
> file. It then calls url_parse() only if IRI is enabled (which in my
> version of wget is not even compiled in).
> Hence the URL is never parsed and never encoded before being downloaded
> with retrieve_url().
> That's a bug.
> The fix is probably to always call url_parse() in retrieve_from_file(),
> and not only when IRI is turned on.
> If this goes to a mailing list, please cc me on replies, I am not
> subscribed.
> Thanks,
> Volker
