Re: [Bug-wget] bad filenames (again)

From: Andries E. Brouwer
Subject: Re: [Bug-wget] bad filenames (again)
Date: Tue, 18 Aug 2015 19:51:58 +0200
User-agent: Mutt/1.5.21 (2010-09-15)

On Tue, Aug 18, 2015 at 07:43:05PM +0300, Eli Zaretskii wrote:

> > > If we convert the file names using iconv, Windows users will also be
> > > happier, at least when the remote URL can be encoded in their system
> > > codepage.
> > 
> > Windows does not differ from Unix - since the remote character set
> > is unknown and not necessarily constant, a conversion is impossible.
> Windows does differ from Unix, in that arbitrary byte sequences cannot
> be used in file names.

Of course. The code already tries to take care of that.

>  See
> https://msdn.microsoft.com/en-us/library/windows/desktop/aa365247%28v=vs.85%29.aspx
> for the gory details.

Thanks for the reference!

> > I already indicated the 1-line change that fixes the Windows problems.
> It doesn't, unfortunately.

You are too brief. What is wrong with the change that changes
    /* insert some test for Windows */
    return true;

That change only changes what wget does with bytes in the 128-159 range,
and reading the gory details I fail to see any problem. Almost the opposite:
"Use any character in the current code page for a name, including Unicode 
 and characters in the extended character set (128–255)"
At first sight, if there were a problem it would be because of the clause
"Any other character that the target file system does not allow".

Thanks to your reference I now feel confident enough to make that 1-line
change, so that Windows users are happy too.


(There are restrictions involving filenames that wget perhaps does not enforce:
no LPT3, no final space or period, ... It might be useful to teach wget about
such details.)
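
To make the parenthetical above concrete, here is a minimal sketch of such a check, assuming the rules given on the MSDN page cited earlier in this thread: the reserved device names CON, PRN, AUX, NUL, COM1-COM9 and LPT1-LPT9 (with or without an extension), and no trailing space or period. The function name windows_name_is_problematic is hypothetical and not part of wget, and the sketch looks at a single path component only.

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not part of wget.  Returns true if NAME (one path
   component, not "." or "..") is empty, ends in a space or a period, or
   has a reserved DOS device name as its base, e.g. "LPT3" or "nul.txt".  */
static bool
windows_name_is_problematic (const char *name)
{
  static const char *const reserved[] = { "CON", "PRN", "AUX", "NUL" };
  size_t len = strlen (name);
  size_t base_len, i;
  char base[5];

  if (len == 0)
    return true;

  /* No final space or period.  */
  if (name[len - 1] == ' ' || name[len - 1] == '.')
    return true;

  /* The base is everything before the first period; reserved names
     are 3 or 4 characters long, so anything else cannot match.  */
  base_len = strcspn (name, ".");
  if (base_len < 3 || base_len > 4)
    return false;

  /* Device names are matched case-insensitively.  */
  for (i = 0; i < base_len; i++)
    base[i] = (char) toupper ((unsigned char) name[i]);
  base[base_len] = '\0';

  if (base_len == 3)
    {
      for (i = 0; i < sizeof reserved / sizeof reserved[0]; i++)
        if (strcmp (base, reserved[i]) == 0)
          return true;
      return false;
    }

  /* COM1..COM9 and LPT1..LPT9.  */
  return (strncmp (base, "COM", 3) == 0 || strncmp (base, "LPT", 3) == 0)
         && base[3] >= '1' && base[3] <= '9';
}
```

A caller could run this on each component of the local file name and, on a match, alter the name (for instance by appending a character) rather than fail. What wget should do in that case is a separate design question.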
