bug-wget

[bug #60287] Windows recursive download escapes utf8 URLs twice


From: Cameron Tacklind
Subject: [bug #60287] Windows recursive download escapes utf8 URLs twice
Date: Thu, 25 Mar 2021 05:09:45 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36

URL:
  <https://savannah.gnu.org/bugs/?60287>

                 Summary: Windows recursive download escapes utf8 URLs twice
                 Project: GNU Wget
            Submitted by: cinderblock
            Submitted on: Thu 25 Mar 2021 09:09:42 AM UTC
                Category: None
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: 
        Originator Email: 
             Open/Closed: Open
                 Release: 1.20
         Discussion Lock: Any
        Operating System: Microsoft Windows
         Reproducibility: Every Time
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: No

    _______________________________________________________

Details:

Steps to reproduce:
1. On a web server, create an HTML file (here, wget-test.html) with the contents:

<a href="space-ok%20cyrillic-not%D0%B3.txt">target-with-other-char</a>

2. Download that file recursively:
`wget -r http://example.com/wget-test.html`

On Linux, we get the expected (truncated) result:

...
2021-03-25 02:01:59 (4.51 MB/s) - ‘example.com/wget-test.html’ saved [71]

--2021-03-25 02:01:59--  http://example.com/space-ok%20cyrillic-not%D0%B3.txt
...

However, on Windows the percent-encoded UTF-8 character is escaped a second
time, and the download fails:

...
2021-03-25 02:02:29 (4.51 MB/s) - ‘example.com/wget-test.html’ saved [71]

--2021-03-25 02:02:29--  http://example.com/space-ok%20cyrillic-not%C3%90%C2%B3.txt
...

Note that the space (%20) is not mangled.
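
The mangled URL is consistent with the two UTF-8 bytes of the Cyrillic letter
(0xD0 0xB3) being read as two single-byte characters and then converted to
UTF-8 again before escaping. The Python sketch below reproduces that
transformation. It is only an illustration, not wget's actual code path: the
helper name `double_encode` is hypothetical, and the choice of Latin-1 as the
wrongly assumed charset is an assumption (Windows-1252 maps these two bytes to
the same characters).

```python
from urllib.parse import quote, unquote_to_bytes


def double_encode(escaped: str, wrong_charset: str = "latin-1") -> str:
    """Hypothetical helper: simulate escaping a UTF-8 URL twice.

    `wrong_charset` is an assumption about the charset that the bytes
    are mistakenly interpreted in; it is not taken from wget's source.
    """
    # Undo the percent-escapes to recover the raw bytes of the path.
    raw = unquote_to_bytes(escaped)
    # Misread each byte as one character of a single-byte charset.
    misread = raw.decode(wrong_charset)
    # Convert those characters to UTF-8 and percent-escape them again.
    return quote(misread.encode("utf-8"), safe="/-._~")


original = "space-ok%20cyrillic-not%D0%B3.txt"
mangled = double_encode(original)
print(mangled)  # space-ok%20cyrillic-not%C3%90%C2%B3.txt
```

The space survives because 0x20 is plain ASCII and is unchanged by the
charset round trip, which matches the observation above.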




    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60287>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



