bug-wget

[bug #60287] Windows recursive download escapes utf8 URLs twice


From: Cameron Tacklind
Subject: [bug #60287] Windows recursive download escapes utf8 URLs twice
Date: Sat, 27 Mar 2021 16:19:36 -0400 (EDT)
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.90 Safari/537.36

Follow-up Comment #9, bug #60287 (project wget):

So is this a fundamental limitation of HTML? Is there no way to declare in
the HTML which charset the percent-encoded bytes of a URL in a.href should be
interpreted in?

Note that the problem I'm seeing causes a 404 error *before* wget even gets
the file with the non-ASCII file name.

Wget is reading a file from disk that contains only ASCII. The ASCII string
in the a.href of the downloaded file needs to be sent unchanged in the
subsequent HTTP request line.

It seems to me that converting to the local character set should not happen at
all.
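
To illustrate the point, here is a minimal Python sketch (not wget's actual
code). The href value, the function names, and the choice of cp1252 as the
"local character set" are all assumptions made up for the example. It compares
passing an already percent-encoded ASCII href through untouched with two ways
it could plausibly get mangled: escaping it a second time, and round-tripping
it through a local charset before re-encoding.

```python
from urllib.parse import quote, unquote

# Hypothetical href as found in the downloaded HTML: pure ASCII, with the
# UTF-8 bytes of "é" (0xC3 0xA9) already percent-encoded.
HREF = "/files/caf%C3%A9.txt"


def passthrough(href: str) -> str:
    """Send the ASCII href unchanged in the request line."""
    return href


def escaped_twice(href: str) -> str:
    """Escape again, treating '%' as an unsafe character."""
    return quote(href, safe="/")


def via_local_charset(href: str, local_charset: str = "cp1252") -> str:
    """Decode the escaped bytes as the local charset, then re-encode as UTF-8.

    This is only one assumed way a charset conversion could go wrong.
    """
    decoded = unquote(href, encoding=local_charset)
    return quote(decoded, safe="/")


if __name__ == "__main__":
    print(passthrough(HREF))        # /files/caf%C3%A9.txt
    print(escaped_twice(HREF))      # /files/caf%25C3%25A9.txt
    print(via_local_charset(HREF))  # /files/caf%C3%83%C2%A9.txt
```

Only the first form names the same resource the server linked to; a server
would most likely answer either of the other two with a 404.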

    _______________________________________________________

Reply to this item at:

  <https://savannah.gnu.org/bugs/?60287>

_______________________________________________
  Message sent via Savannah
  https://savannah.gnu.org/



