
Re: [Bug-wget] bad filenames (again)


From: Tim Ruehsen
Subject: Re: [Bug-wget] bad filenames (again)
Date: Fri, 07 Aug 2015 17:13:19 +0200
User-agent: KMail/4.14.2 (Linux/4.0.0-2-amd64; KDE/4.14.2; x86_64; ; )

On Friday 07 August 2015 16:38:01 Andries E. Brouwer wrote:
> On Fri, Aug 07, 2015 at 04:14:45PM +0200, Tim Ruehsen wrote:
> > Hi Andries,
> > 
> > as I already mentioned, changing the default behavior of wget is not a
> > good idea.
> > 
> > But I started a wget2 branch that produces wget and wget2 executables.
> > wget2's default behavior is to keep filenames as they are.
> > 
> > I am not sure how it compiles and works on Windows (Cygwin could work).
> > If you dare to check it out: any feedback is highly welcome.
> > 
> > Regards, Tim
> 
> Hi Tim,
> 
> I disagree. This is just a bug.
> Nobody wants illegal filenames.
> Even removing them is not entirely trivial since the filenames
> produced by wget are not legal character sequences, so cannot be typed.

Hi Andries,

obviously I got it wrong.

If it's a bug, let's just fix it (without breaking compatibility).

I don't have the time to read *all* the old emails right now.
But as far as I understand, escaping occurs within legal UTF-8 sequences, and 
you are right that this is a bug when we have a UTF-8 locale.

The solution would be something like

if locale is UTF-8
  do not escape valid UTF-8 sequences
else
  keep wget's current behavior

If URLs (and thus filenames) are not in UTF-8, Wget would convert them to UTF-8 
before the above procedure (I guess that is what wget does anyway, though I am 
not 100% sure).
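
To make the idea above concrete, here is a rough sketch in Python (not wget's C code). All names in it (locale_is_utf8, legacy_escape, escape_filename) are made up for illustration, and legacy_escape is only a stand-in for "wget's current behavior", which I have not checked in detail. It assumes the filename bytes have already been converted to UTF-8 where that was possible.

```python
import codecs
import locale


def locale_is_utf8() -> bool:
    """Best-effort check whether the current locale uses UTF-8."""
    try:
        return codecs.lookup(locale.getpreferredencoding(False)).name == "utf-8"
    except LookupError:
        return False


def _escape_byte(b: int) -> str:
    return "%{:02X}".format(b)


def legacy_escape(raw: bytes) -> str:
    """Stand-in for the existing behavior (an assumption, not wget's code):
    percent-escape every byte outside printable ASCII."""
    return "".join(chr(b) if 0x20 <= b < 0x7F else _escape_byte(b) for b in raw)


def escape_filename(raw: bytes, utf8_locale: bool) -> str:
    """Keep valid UTF-8 sequences in a UTF-8 locale, otherwise fall back."""
    if not utf8_locale:
        return legacy_escape(raw)

    out = []
    # surrogateescape maps each undecodable byte 0xNN to U+DC00 + 0xNN,
    # so invalid bytes stay identifiable after decoding.
    for ch in raw.decode("utf-8", errors="surrogateescape"):
        cp = ord(ch)
        if 0xDC80 <= cp <= 0xDCFF:
            out.append(_escape_byte(cp - 0xDC00))  # byte was not valid UTF-8
        elif cp < 0x20 or cp == 0x7F:
            out.append(_escape_byte(cp))           # ASCII control character
        else:
            out.append(ch)                         # valid character, keep as is
    return "".join(out)
```

One would call it as escape_filename(name_bytes, locale_is_utf8()). The sketch only covers the escaping decision. Characters that are illegal on a given filesystem (for example '/' or the Windows-reserved ones) would still need the existing handling.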

Would you agree ?

If you provide a patch for this, we would appreciate it.

> I am a Linux man, no Windows computers here. So, I am happy to do
> stuff on Linux, but cannot test on Windows.

Sorry, won't bother you again regarding Windows ;-)

Tim



