From: anonymous
Subject: [Bug-wget] [bug #47701] wget 1.17.1 fails to convert from percent encoding to unicode correctly (mingw32)
Date: Fri, 15 Apr 2016 04:31:09 +0000
User-agent: Mozilla/5.0 (Windows NT 10.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.112 Safari/537.36

URL:
  <http://savannah.gnu.org/bugs/?47701>

                 Summary: wget 1.17.1 fails to convert from percent encoding
to unicode correctly (mingw32)
                 Project: GNU Wget
            Submitted by: None
            Submitted on: Fri 15 Apr 2016 04:31:08 AM UTC
                Category: Program Logic
                Severity: 3 - Normal
                Priority: 5 - Normal
                  Status: None
                 Privacy: Public
             Assigned to: None
         Originator Name: Anonymous Coward
        Originator Email: 
             Open/Closed: Open
         Discussion Lock: Any
                 Release: 1.17.1
        Operating System: Microsoft Windows
         Reproducibility: None
           Fixed Release: None
         Planned Release: None
              Regression: None
           Work Required: None
          Patch Included: None

    _______________________________________________________

Details:

The version of GNU Wget you were using

1.17.1 (mingw32)

How you invoked wget


wget -d -r -np -nH --cut-dirs=4 "https://leoandpeto.com/Music/Peto/Non/o-z/R%C3%B6yksopp%20-%20The%20Understanding/"


In case this is some sort of mingw32 compiler bug: I obtained my
mingw32 wget binary from https://eternallybored.org/misc/wget/ as
recommended by
http://wget.addictivecode.org/FrequentlyAskedQuestions#download

What you expected wget to do

Recursively download the requested open directory.

What wget did (include output messages).

It did not download the directory.

I ran wget with debug output enabled (-d). Judging by the
output, wget seems to have converted the percent-encoded URL
to w32's native Unicode encoding incorrectly.

A full copy (input and output) of my wget run is
attached as "wget-output.txt".

Specifically, the output includes this line, verbatim:


converted
'https://leoandpeto.com/Music/Peto/Non/o-z/R%C3%B6yksopp%20-%20The%20Understanding/'
(ASCII) -> 'https://leoandpeto.com/Music/Peto/Non/o-z/RC6yksopp - The
Understanding/' (UTF-8)


wget changed the word from "R%C3%B6yksopp" to "RC6yksopp"
with no percent signs.
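
For comparison, here is a minimal Python sketch of the decoding I
expected. It uses only Python's standard library and is not wget's
actual code path; the variable names are my own.

```python
from urllib.parse import unquote_to_bytes

# Path segment taken from the URL in this report.
segment = "R%C3%B6yksopp%20-%20The%20Understanding"

# Percent-decoding yields raw bytes; %C3 %B6 is the UTF-8 encoding of U+00F6.
raw = unquote_to_bytes(segment)
decoded = raw.decode("utf-8")

print(decoded)  # Röyksopp - The Understanding
```

The two bytes C3 B6 should end up as the single character "ö",
not as the two ASCII characters "C6".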

It seems to be clearing the high (most significant) bit of
each of the bytes %C3 (11000011) and %B6 (10110110), which
turns them into 7-bit ASCII characters:


,-----.----------.-------.
| HEX | BIN      | ASCII |
+-----+----------+-------+
| C3  | 11000011 | --    |
| 43  |  1000011 | C     |
+-----+----------+-------+
| B6  | 10110110 | --    |
| 36  |  0110110 | 6     |
`-----'----------'-------'
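
The table can be reproduced with a short Python sketch. It only
tests my guess (clearing the high bit of each byte); I have not
checked wget's source, so the real cause may differ. The helper
name strip_high_bit is made up for this illustration.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the most significant bit of every byte (hypothetical helper)."""
    return bytes(b & 0x7F for b in data)


utf8_bytes = bytes([0xC3, 0xB6])  # the bytes behind %C3%B6
masked = strip_high_bit(utf8_bytes)

# Show each byte before and after masking, in hex and binary.
for before, after in zip(utf8_bytes, masked):
    print(f"{before:02X} {before:08b} -> {after:02X} {after:07b} {chr(after)!r}")

print(masked.decode("ascii"))  # C6
```

Applied to the whole word, the same masking turns the bytes of
"R%C3%B6yksopp" into "RC6yksopp", which matches the debug line.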


Also, I'm not 100% sure that this isn't a duplicate of
http://savannah.gnu.org/bugs/index.php?47689
but I figured it's best to let you developers decide
rather than failing to file a bug-report.

Thank you for making/working on wget, and
please CONTINUE BEING AWESOME! :-D



    _______________________________________________________

File Attachments:


-------------------------------------------------------
Date: Fri 15 Apr 2016 04:31:08 AM UTC  Name: wget-output.txt  Size: 5kB  By: None

<http://savannah.gnu.org/bugs/download.php?file_id=36931>

    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?47701>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



