bug-wget

[Bug-wget] [bug #56244] Bug with different quotation marks


From: anonymous
Subject: [Bug-wget] [bug #56244] Bug with different quotation marks
Date: Tue, 30 Apr 2019 06:22:44 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:66.0) Gecko/20100101 Firefox/66.0

URL:
  <https://savannah.gnu.org/bugs/?56244>

                 Summary: Bug with different quotation marks
                 Project: GNU Wget
            Submitted by: None
            Submitted on: Tue 30 Apr 2019 10:22:42 AM UTC
                Category: Program Logic
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: Stefan Kittel
        Originator Email: address@hidden
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 1.20
        Operating System: Microsoft Windows
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: No

    _______________________________________________________

Details:

Hello,

I tried to use wget for a project to download a specific website.
The problem is that some URLs are interpreted incorrectly.

%E2%80%9C is being put into the URL, and hex characters are used for the file name,
so I can't access these files.

I called wget with these parameters:
z:\temp\wget\wget.exe -e robots=off
--output-file="Z:\temp\webshield\test1\wget.log" --save-headers
--user-agent="WP-Shield Transfer" --tries=5 --timeout=30 --recursive
--restrict-file-names=nocontrol --local-encoding=utf-8 --remote-encoding=utf-8
--level=0 --page-requisites --content-on-error --no-parent --no-cookies
https://www.netzplan.de/

This is what appears in the log:
--2019-04-30 11:53:10-- 
https://www.netzplan.de/%E2%80%9C/wp-content/plugins/revslider/public/assets/fonts/revicons/revicons.woff?5510888%E2%80%9D
Reusing existing connection to www.netzplan.de:443.
HTTP request sent, awaiting response... 404 Not Found
Saving to:
'www.netzplan.de/â\200œ/wp-content/plugins/revslider/public/assets/fonts/revicons/address@hidden'
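The request URL in the log is consistent with the curly quotes being treated as part of the link value and the result being resolved relative to the page URL. A minimal Python sketch of that effect follows. It assumes the page contains a link value of the form shown (the exact markup is an assumption), and it uses Python's urllib rather than wget's own code, so it only illustrates the mechanism:

```python
from urllib.parse import quote, urljoin

base = "https://www.netzplan.de/"

# Assumed link value: the curly quotes are not HTML quote delimiters,
# so a parser can end up keeping them inside the value itself.
raw_value = (
    "\u201c/wp-content/plugins/revslider/public/assets/fonts/"
    "revicons/revicons.woff?5510888\u201d"
)

# Percent-encode the non-ASCII characters as UTF-8, leaving URL punctuation intact.
encoded = quote(raw_value, safe="/?=&:")

# The value no longer starts with "/", so it resolves as a relative path.
resolved = urljoin(base, encoded)
print(resolved)
```

Because the value now begins with %E2%80%9C instead of a slash, it is resolved as a path segment below the site root, which matches the 404 request shown in the log.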

Seen on this website:
https://www.netzplan.de/

An href attribute should be written as href="
But here href=“ is used.

So wget should handle “ and ” the same as "
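To illustrate the requested behaviour, here is a small Python sketch that strips typographic quotes from a link value before resolving it. The helper name strip_curly_quotes is hypothetical and this is not how wget is implemented; it only shows the intended result under the assumption that the value is wrapped in “ and ”:

```python
from urllib.parse import urljoin

CURLY_QUOTES = "\u201c\u201d"  # the characters “ and ”


def strip_curly_quotes(value: str) -> str:
    """Hypothetical helper: drop typographic quotes wrapping a link value."""
    return value.strip(CURLY_QUOTES)


base = "https://www.netzplan.de/"
# Assumed link value as it would be read from the page, quotes included.
raw_value = (
    "\u201c/wp-content/plugins/revslider/public/assets/fonts/"
    "revicons/revicons.woff?5510888\u201d"
)

cleaned = strip_curly_quotes(raw_value)
# With the quotes removed the value starts with "/", so it resolves from the site root.
fixed_url = urljoin(base, cleaned)
print(fixed_url)
```

With the quotes removed, the link resolves to the path under /wp-content/ instead of a directory named after the quote character.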

Thanks

Stefan




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?56244>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



