bug-wget

Re: --convert-links should not convert # hrefs


From: Christopher Gait-Smith
Subject: Re: --convert-links should not convert # hrefs
Date: Sat, 26 Mar 2022 22:25:39 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.7.0

I originally found this bug on https://extravm.com, using the command:

`wget --page-requisites --convert-links https://extravm.com/`

The issue appeared in the first carousel, around lines 120-121 of index.html. When the href was changed to "index.html#carouselHero", the slider stopped working. Changing it back to just "#carouselHero" made the slider work again.

I also noticed that Wget2 does not convert href="#example" to href="site#example".

Besides, there is no point in explicitly making the href point to the page it is on, since a fragment-only href already does that by default. The only case where an href containing a # should be converted is when it has the form href="newurl#something".


TL;DR: href="#example" should not be converted to href="file.extension#example"; it should remain in its original form. href="file.html#example" would still need to be converted, though.
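
To make the proposed rule concrete, here is a minimal Python sketch. It is not Wget's actual code; the function name `convert_href`, the `local_names` mapping, and the example.com URLs are all hypothetical and only illustrate the behaviour I am asking for: fragment-only hrefs are left alone, while hrefs that name another downloaded file are rewritten.

```python
from urllib.parse import urljoin, urldefrag


def convert_href(href, page_url, local_names):
    """Return the href to write into the saved copy of page_url.

    local_names is a hypothetical mapping from absolute URLs (without
    fragment) to the local file names they were saved under.
    """
    # A fragment-only reference already points into the current document,
    # so it is returned exactly as written.
    if href.startswith("#"):
        return href

    # Resolve the href against the page URL and split off the fragment.
    absolute, fragment = urldefrag(urljoin(page_url, href))

    local = local_names.get(absolute)
    if local is None:
        # Target was not downloaded: fall back to the absolute URL.
        return urljoin(page_url, href)

    # Target was downloaded: point at the local file, keeping the fragment.
    return f"{local}#{fragment}" if fragment else local


# Hypothetical download result for illustration only.
local_names = {
    "https://example.com/": "index.html",
    "https://example.com/file.html": "file.html",
}
page = "https://example.com/"

print(convert_href("#carouselHero", page, local_names))
print(convert_href("/file.html#example", page, local_names))
```

With this rule, the carousel link stays "#carouselHero" in the saved page, while a link to another downloaded page still gets its local file name.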


On 2022/03/26 20:50, Tim Rühsen wrote:
On 26.03.22 11:20, Christopher Gait-Smith wrote:
When using --convert-links in Wget, sites often break if they contain fragment-only links. For example:

<a href="#example"></a>

gets converted to:

<a href="index.html#example"></a>


I believe that hrefs consisting only of #whatever should not be converted to file.html#whatever.

I can see how this breaks. Without a file name, wget assumes `index.html`. But a) this file doesn't need to exist on the server, or b) the default file may be something else (e.g. index.php) while index.html also exists on the server.

Do you have one or two examples of pages or domains where this happens?

Regards, Tim

