bug-wget

[bug #60287] Windows recursive download escapes utf8 URLs twice


From: Cameron Tacklind
Subject: [bug #60287] Windows recursive download escapes utf8 URLs twice
Date: Fri, 26 Mar 2021 03:23:06 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36

Follow-up Comment #3, bug #60287 (project wget):

Thank you. I had not tried those options.

Curiously, the only option I needed was *--local-encoding=utf8*. The
*--remote-encoding* option did not change the detected URI encoding, which
stayed at CP1252.

*Without --local-encoding=utf8*


Loaded example.com/wget-test.html (size 71).
URI encoding = 'CP1252'
example.com/wget-test.html: merge('http://example.com/wget-test.html',
'space-ok%20cyrillic-not%D0%B3.txt') ->
http://example.com/space-ok%20cyrillic-not%D0%B3.txt
converted 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' (CP1252) ->
'http://example.com/space-ok cyrillic-notг.txt' (UTF-8)
appending 'http://example.com/space-ok%20cyrillic-not%C3%90%C2%B3.txt' to
urlpos.


*With --local-encoding=utf8*


Loaded example.com/wget-test.html (size 71).
URI encoding = 'utf8'
example.com/wget-test.html: merge('http://example.com/wget-test.html',
'space-ok%20cyrillic-not%D0%B3.txt') ->
http://example.com/space-ok%20cyrillic-not%D0%B3.txt
converted 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' (utf8) ->
'http://example.com/space-ok cyrillic-notг.txt' (UTF-8)
appending 'http://example.com/space-ok%20cyrillic-not%D0%B3.txt' to urlpos.
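
The difference between the two logs is consistent with the percent-decoded UTF-8 bytes being read as CP1252 and then re-encoded as UTF-8 before escaping. Below is a minimal Python sketch of that hypothesis. It is not wget's actual code path, and the helper name `reencode` is made up for illustration.

```python
from urllib.parse import quote, unquote_to_bytes


def reencode(escaped: str, assumed_encoding: str) -> str:
    """Hypothetical helper: unescape, decode with the assumed local
    encoding, then re-encode as UTF-8 and percent-escape again."""
    raw = unquote_to_bytes(escaped)        # %D0%B3 -> bytes 0xD0 0xB3
    text = raw.decode(assumed_encoding)    # interpret bytes in the assumed charset
    return quote(text.encode("utf-8"), safe="-._~")


name = "space-ok%20cyrillic-not%D0%B3.txt"

# Bytes misread as CP1252: 0xD0 0xB3 become two characters, each of which
# takes two bytes in UTF-8, so the escaped form doubles in length.
print(reencode(name, "cp1252"))

# Bytes read as UTF-8: the round trip leaves the escaped name unchanged.
print(reencode(name, "utf-8"))
```

With `cp1252` the sketch yields the same `%C3%90%C2%B3` sequence that appears in the first log, and with `utf-8` it yields the original `%D0%B3`, matching the second log.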


Regardless, this still feels like a bug to me. But maybe the issue is just how
wget implements the recursive download and isn't really fixable?

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60287>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



