bug-wget

Re: [bug #58354] Wget doesn't parse URIs starting with http:/


From: Jeffrey Walton
Subject: Re: [bug #58354] Wget doesn't parse URIs starting with http:/
Date: Tue, 12 May 2020 06:49:46 -0400

On Tue, May 12, 2020 at 6:45 AM Luca Bernardi <address@hidden> wrote:
>
> URL:
>   <https://savannah.gnu.org/bugs/?58354>
>
>                  Summary: Wget doesn't parse URIs starting with http:/
>                  Project: GNU Wget
>             Submitted by: f0ff
>             Submitted on: Tue 12 May 2020 10:45:17 AM UTC
>                 Category: None
>                 Severity: 3 - Normal
>                 Priority: 5 - Normal
>                   Status: None
>                  Privacy: Public
>              Assigned to: None
>          Originator Name:
>         Originator Email:
>              Open/Closed: Open
>                  Release: 1.14
>          Discussion Lock: Any
>         Operating System: GNU/Linux
>          Reproducibility: Every Time
>            Fixed Release: None
>          Planned Release: None
>               Regression: None
>            Work Required: None
>           Patch Included: No
>
>     _______________________________________________________
>
> Details:
>
> Hi,
> Wget refuses to parse URIs that start with http:/ (note single slash), e.g.
> http:/wp-includes/css/dist/block-library/style.min.css?ver=5.4.1. These are
> widely accepted by browsers.
>
> Command that I've used: `wget --user-agent=Mozilla --content-disposition
> --page-requisites --adjust-extension --restrict-file-names=windows -d -e
> robots=off -m -k -E -r -l 10 -p -N -F -P crawl  -nH $IP`

You may as well make the slashes optional in the protocol string.
Berners-Lee does not like them anyway:
<https://www.mentalfloss.com/uk/history/27802/10-inventors-who-came-to-regret-their-creations>

Jeff
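
For illustration, here is a minimal Python sketch of the lenient handling discussed above. It is not Wget's code and not a proposed patch. The helper name `resolve_lenient` and the base URL are hypothetical, and `192.0.2.10` is a documentation-only address standing in for `$IP`. The sketch assumes that a reference repeating the base URL's scheme without an authority (such as `http:/path`) is meant to be resolved against the page it was found on, which is roughly how browsers treat it.

```python
from urllib.parse import urljoin, urlsplit


def resolve_lenient(base: str, ref: str) -> str:
    """Resolve ref against base, tolerating 'scheme:/path' references.

    Hypothetical helper: if ref repeats the base's scheme but carries no
    authority (no '//host'), drop the 'scheme:' prefix and resolve the
    remainder as an ordinary relative reference.
    """
    base_parts = urlsplit(base)
    ref_parts = urlsplit(ref)
    if ref_parts.scheme and ref_parts.scheme == base_parts.scheme and not ref_parts.netloc:
        # Strip "http:" so that "/wp-includes/..." is resolved against the base.
        ref = ref[len(ref_parts.scheme) + 1:]
    return urljoin(base, ref)


if __name__ == "__main__":
    # Assumed page URL; the link is the one quoted in the bug report.
    page = "http://192.0.2.10/blog/index.html"
    link = "http:/wp-includes/css/dist/block-library/style.min.css?ver=5.4.1"
    print(resolve_lenient(page, link))
```

Under these assumptions the link from the report resolves to a path on the host that served the page. References with a different scheme, or with an explicit `//host`, are passed through to `urljoin` unchanged.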


