Re: [Bug-wget] bad filenames (again)
From: Andries E. Brouwer
Subject: Re: [Bug-wget] bad filenames (again)
Date: Tue, 18 Aug 2015 17:28:34 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Aug 18, 2015 at 05:45:13PM +0300, Eli Zaretskii wrote:
> > All this is about the local situation. One cannot know "the character set"
> > of a filename because that concept does not exist in Unix.
>
> Of course, it exists. The _filesystem_ doesn't know it, but users do.
Usually, yes.
> > About the remote situation even less is known.
>
> Assuming UTF-8 will go a long way towards resolving this. When this
> is not so, we have the --remote-encoding switch.
This is wget. The user is recursively downloading a file hierarchy.
Only after downloading does it become clear what one has got.
I download a collection of East Asian texts on some topic.
Upon examination, part is in SJIS, part in Big5, part in EUC-JP,
part in UTF-8. Since the downloaded stuff does not have a uniform
character set, and surely the server is not going to specify
character sets, any invocation of iconv will corrupt my data.
When I get the unmodified data, I use a browser, an editor,
or xterm+luit to see which character set setting gives readable text.
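
The manual inspection described above can be sketched in a few lines of
Python. This is only an illustration, not anything wget does: the function
name try_decodings and the candidate list are assumptions made for the
example, and a successful decode does not prove the guess is right, it only
narrows down what a human still has to check by eye.

```python
# Candidate encodings taken from the example in the message above.
CANDIDATES = ["utf-8", "shift_jis", "big5", "euc_jp"]


def try_decodings(raw: bytes, encodings=CANDIDATES):
    """Decode the same untouched bytes under each candidate encoding.

    Returns a dict mapping encoding name to the decoded text, or None
    when the bytes are not valid in that encoding.
    """
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Hypothetical sample: bytes as they might arrive from a remote server.
sample = "日本語".encode("euc_jp")
decoded = try_decodings(sample)

for enc, text in decoded.items():
    print(enc, "->", text if text is not None else "(invalid)")
```

This only works because the original bytes are still available; each
candidate is tried against the same unmodified input.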
> > It would be terrible if wget decided to use obscure heuristics to
> > invent a remote character set and then invoke iconv.
>
> But what you suggest instead -- create a file name whose bytes are an
> exact copy of the remote -- is just another heuristic.
No. An exact copy allows me to decide what I have.
Conversion leads to data loss.
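
A small Python sketch of that data loss, under stated assumptions: the
bytes are really EUC-JP, the converter wrongly assumes UTF-8, and it is
configured to substitute invalid bytes rather than stop with an error.
The function name lossy_convert is hypothetical and this is not wget's or
iconv's actual code path.

```python
def lossy_convert(raw: bytes, assumed: str = "utf-8") -> bytes:
    """Convert to UTF-8 assuming `assumed`, replacing undecodable bytes.

    Every byte sequence that is invalid under the assumed encoding
    becomes U+FFFD, so the original bytes cannot be reconstructed.
    """
    return raw.decode(assumed, errors="replace").encode("utf-8")


# Hypothetical remote file name, actually encoded in EUC-JP.
original = "日本語.txt".encode("euc_jp")

converted = lossy_convert(original)   # wrong guess: UTF-8
exact_copy = bytes(original)          # what an exact copy keeps

print(converted.decode("utf-8"))      # replacement characters plus ".txt"
print(exact_copy.decode("euc_jp"))    # still recoverable once the charset is known
```

With the exact copy, the right charset can be worked out later; after the
substituting conversion, no choice of charset brings the name back.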
Andries