Re: [Bug-wget] bad filenames (again)
From: Tim Ruehsen
Subject: Re: [Bug-wget] bad filenames (again)
Date: Tue, 18 Aug 2015 11:58:54 +0200
User-agent: KMail/4.14.2 (Linux/4.1.0-1-amd64; KDE/4.14.2; x86_64; ; )
On Tuesday 18 August 2015 10:55:46 Andries E. Brouwer wrote:
> On Tue, Aug 18, 2015 at 10:29:40AM +0200, Tim Ruehsen wrote:
> > I am going with Eli that we should use iconv.
> > We know the remote encoding and the local encoding
>
> Do we?
> How do you guess the remote encoding?
> Is there any particular encoding?
Yes we do.
Starting with 'wget URL', URL has the local encoding (can be overridden by
--local-encoding).
Using wget -r will download documents (HTML and CSS right now) and parse them
for more URLs. These documents have a known encoding (either the default, or
one set explicitly via the HTTP header or within the document itself). For
broken servers, we still have --remote-encoding.
> Unix filenames are sequences of bytes, they do not have a character set.
The character encoding determines which symbols these bytes (or byte
sequences, i.e. multibyte characters / code points) are displayed as. I gave
an example in my last email.
Change your locale to iso-8859-1 and run 'touch äöü'. 'ls' will show it
correctly. Then change your locale to UTF-8 and now 'ls' will show garbage,
though your file name did not change.
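The same effect can be reproduced without touching the locale. The following is a small Python sketch (variable names are mine): the bytes of the name stay identical, only the encoding used to interpret them changes.

```python
# The name 'äöü' as an ISO-8859-1 locale would store it on disk.
name_bytes = "\u00e4\u00f6\u00fc".encode("iso-8859-1")
print(name_bytes)  # b'\xe4\xf6\xfc'

# Interpreted as ISO-8859-1, the bytes map back to the original characters.
as_latin1 = name_bytes.decode("iso-8859-1")

# Interpreted as UTF-8, the same bytes are not a valid sequence;
# errors="replace" substitutes U+FFFD, i.e. the "garbage" a UTF-8 terminal shows.
as_utf8 = name_bytes.decode("utf-8", errors="replace")

# ascii() keeps the output printable regardless of the terminal's own encoding.
print(ascii(as_latin1))
print(ascii(as_utf8))
```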
Tim