[Bug-wget] [PATCH] Re: --convert-links and filenames with colons


From: Tim Ruehsen
Subject: [Bug-wget] [PATCH] Re: --convert-links and filenames with colons
Date: Tue, 27 Oct 2015 13:19:22 +0100
User-agent: KMail/4.14.10 (Linux/4.2.0-1-amd64; KDE/4.14.13; x86_64; ; )

Hi Joachim,

could you please test whether the attached patch works for you?

Could anyone else review it?

Tim

On Monday 26 October 2015 13:42:41 Joachim Breitner wrote:
> Dear wget developers,
> 
> it seems that "wget -r -k" is a bit careless when creating relative
> URLs that start with “something:”, which is then misinterpreted
> as the protocol specification of a URL.
> 
> For example, downloading these two files:
> 
> /tmp/wget/input $ head *
> ==> file:with:colon.html <==
> <html>
> <body>
> <a href="./file:with:colon.html">Foo</a>
> <a href="./file_without_colon.html">Bar</a>
> </body>
> </html>
> 
> ==> file_without_colon.html <==
> <html>
> <body>
> <a href="./file:with:colon.html">Foo</a>
> <a href="./file_without_colon.html">Bar</a>
> </body>
> </html>
> 
> with "wget -k -r" produces this output:
> 
> ==> localhost:8000/file:with:colon.html <==
> <html>
> <body>
> <a href="file:with:colon.html">Foo</a>
> <a href="file_without_colon.html">Bar</a>
> </body>
> </html>
> 
> ==> localhost:8000/file_without_colon.html <==
> <html>
> <body>
> <a href="file:with:colon.html">Foo</a>
> <a href="file_without_colon.html">Bar</a>
> </body>
> </html>
> 
> and the browser will not be able to follow the link to Foo.
> 
> This is a practical problem when trying to mirror a MediaWiki
> installation.
> I suggest avoiding the issue by prefixing relative links with "./",
> either always (why not?), or when the relative file name starts with
> something that looks like “foo:”.
> 
> 
> Thanks,
> Joachim

Attachment: 0001-Fix-URL-conversion-for-colons-in-filenames.patch
Description: Text Data
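
For anyone reviewing, here is a rough illustration of the second variant Joachim suggests: prefix "./" only when the first path segment of a relative link contains a colon, which is the case a browser would misread as a URL scheme. This is a minimal sketch in Python, not the attached patch and not wget's code; the function name protect_relative_link is made up for the example.

```python
def protect_relative_link(link: str) -> str:
    """Prefix "./" when the first path segment contains a colon.

    Hypothetical helper for illustration only; it is not part of wget.
    """
    # Only the text before the first "/", "?" or "#" can be mistaken
    # for a scheme, so isolate that first segment.
    first_segment = link
    for separator in "/?#":
        first_segment = first_segment.split(separator, 1)[0]

    # A colon in the first segment makes the link look like "scheme:rest",
    # so a leading dot-segment is added to keep it relative.
    if ":" in first_segment:
        return "./" + link
    return link


if __name__ == "__main__":
    for link in ("file:with:colon.html",
                 "file_without_colon.html",
                 "dir/file:with:colon.html"):
        print(link, "->", protect_relative_link(link))
```

With the links from the example above, only "file:with:colon.html" gets the "./" prefix. Links whose colon appears after the first "/" are left alone, since they cannot be parsed as a scheme.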