
From: Tim Rühsen
Subject: Re: [Bug-wget] Problem downloading with RIGHT SINGLE QUOTATION MARK (U+2019) in filename
Date: Fri, 11 Oct 2019 18:22:35 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0

On 11.10.19 11:07, Eli Zaretskii wrote:
>> From: Cameron Tacklind <address@hidden>
>> Date: Thu, 10 Oct 2019 20:31:02 -0700
>>
>> The error is pretty clearly an encoding conversion issue, going from UTF-8,
>> assumed to be CP1252, converting into UTF-8, which becomes wrong.
> 
> I think you need to tell Wget that the page encoding is UTF-8, by
> using the --remote-encoding switch.  Did you try that?
> 

Cameron's HTML file contains a 'meta' tag with 'charset=utf-8' in its
'content' attribute. So wget should detect it and convert the URL correctly.

And I can confirm that wget is working properly here. My version is
1.20.3 and I am working on Linux.

I put this file on my local Apache web server and named it quote.html:

<!DOCTYPE html><html><head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>RIGHT SINGLE QUOTE TEST</title>
</head><body>
<a href="%E2%80%99">test</a>
</body></html>

My command line is
  wget -d -r http://localhost/quote.html

Output is
...
Decided to load it.
URI encoding = »utf-8«
Enqueuing http://localhost/%E2%80%99 at depth 1
Queue count 1, maxcount 1.
[IRI Enqueuing »http://localhost/%E2%80%99« with »utf-8«
Dequeuing http://localhost/%E2%80%99 at depth 1
Queue count 0, maxcount 1.
Converted file name 'localhost/’' (UTF-8) -> 'localhost/’' (UTF-8)
--2019-10-11 18:06:21--  http://localhost/%E2%80%99
...
---request begin---
GET /%E2%80%99 HTTP/1.1
Referer: http://localhost/quote.html
User-Agent: Wget/1.20.3 (linux-gnu)
Accept: */*
Accept-Encoding: identity
Host: localhost
Connection: Keep-Alive

---request end---
...
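
For comparison, here is a rough Python sketch of the faulty conversion
Cameron describes (UTF-8 bytes misread as CP1252, then re-encoded as
UTF-8). It is only an illustration and not wget's actual code; the
variable names are made up for this example.

```python
from urllib.parse import quote, unquote_to_bytes

# The href from quote.html: U+2019 percent-encoded as UTF-8.
href = "%E2%80%99"
raw = unquote_to_bytes(href)          # b'\xe2\x80\x99'

# Correct path: the bytes are interpreted as UTF-8.
correct = raw.decode("utf-8")         # a single character, U+2019
correct_url = quote(correct)          # '%E2%80%99' (unchanged)

# Faulty path: the same bytes are assumed to be CP1252,
# then the resulting three characters are re-encoded as UTF-8.
mojibake = raw.decode("cp1252")       # three characters: U+00E2 U+20AC U+2122
wrong_url = quote(mojibake)           # '%C3%A2%E2%82%AC%E2%84%A2'

print("correct:", correct_url)
print("wrong:  ", wrong_url)
```

If the request line in the debug output shows the longer form instead of
GET /%E2%80%99, the link was converted with the wrong source encoding.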


@Cameron: Your wget version seems ok, so I am a bit clueless right now...

Could you give me the output of 'wget --version'?
Could you run the same test as I did above to see whether the problem
is reproducible for you?

Regards, Tim
