From: Andrew Cady
Subject: Re: [Bug-wget] [PATCH] New option: --rename-output: modify output filename with perl
Date: Tue, 30 Jul 2013 16:19:52 -0400
User-agent: Mutt/1.5.20 (2009-06-14)

On Tue, Jul 30, 2013 at 11:24:56AM +0200, Tim Ruehsen wrote:
> On Friday 26 July 2013 14:30:00 Andrew Cady wrote:
>
> > This would be useful and would not be hard to do.  For example, instead
> > of printing just hstat.local_file to the pipe, you could serialize the
> > whole hstat struct (say as JSON).
> > 
> > However, the big issue is that you would need to define an interface
> > for the communication, and then you need to keep that interface
> > stable.  Maintaining such an interface would inevitably restrain future
> > development.  So, that requires some decision-making and policy.  Or
> > else the interface could be unstable, but then you have the same issue
> > as parsing the debugging output.
> > 
> > Niwt apparently uses "an HTTP-based protocol" to communicate between
> > plugins.
> 
> Any protocol has it's pros and cons. So why not doing it the same/similar way
> as Micah does ? That seems to be intuitive - dumping the original HTTP
> headers and add your extension (e.g. 'X-Wget-Filename: directory/filename').

Oh, but I don't think this is any kind of _alternative_ to the
--rename-output option.  It would have to be a separate, _additional_
option.

The reason is that the flexibility of a protocol here just makes it much
harder to write the hook.  I want to be able to write a command line
like:

  --rename-output='s{^[^/]*/}{}'

Or at worst:

  --renamer-program='perl -lpe "BEGIN{\$|++} s{^[^/]*/}{}"'

...and not have to do something like:

  --transform-request='use HTTP_Header_Parser; $parser = new
    HTTP_Header_Parser; $|++; while (!eof(STDIN)) { %headers =
    $parser->parse(\*STDIN); $headers{local_file} =~ s{^[^/]*/}{};
    $parser->dump(\%headers); }'

...and of course that's not even a real module.  If I wanted to do it
for real, I would have to look up the manual page for some HTTP parsing
module (and it probably wouldn't make it that easy to grab one header
from STDIN).  I don't want to have to read a man page just to specify
output filenames.

In reality, the latter style is just not suited to one-liners.  You
have to write a script in an editor with syntax highlighting, save
it, and then specify its name.  And realistically you'll have to test it
before using it, because it's just long enough to get something wrong.

Furthermore, it would be impossible to parse HTTP headers with a
standard tool like GNU 'sed'.  You need a real parser for that.  So, we
don't want to do away with the simpler, more specific case, just because
a more general mechanism is possible.
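
For contrast, here is a minimal sketch of how little the simple case
asks of the hook.  It assumes the one-filename-per-line interface (wget
prints hstat.local_file to the pipe and reads the new name back); the
function name rename_filter is made up for illustration.  The
substitution is the same one as in the --rename-output example above,
written for plain sed.

```shell
# Hypothetical filter: reads one proposed filename per line on stdin and
# writes the renamed filename on stdout.
rename_filter() {
  # Drop everything up to and including the first slash, i.e. the
  # leading path component.  Names without a slash pass through as-is.
  sed 's|^[^/]*/||'
}

printf '%s\n' 'example.com/pub/index.html' 'example.com/a.png' | rename_filter
# prints:
#   pub/index.html
#   a.png
```

If the filter were kept running on a pipe, it would also need to flush
after every line; GNU sed's -u (--unbuffered) option plays the role that
$|++ plays in the perl one-liner.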

> An additional Version: header as the first line to interpret makes even a 
> radical protocol change possible (instead the program could be called with a 
> --protocol-version command-line param).

That doesn't solve any problem for developers though.  They're still
forced to maintain the old interface.  It only gets harder when multiple
interface versions are supported.  Furthermore, you're pushing even more
complexity into the hook here.  Realistically, you don't use protocol
versions for something like this; you just keep the original headers
you have forever, and only ever change things by adding new ones.  So,
the developer, having once added support for a header, must never remove
it.  (NB: version numbers don't change that.)  That makes committing to a
header a decision which constrains future development.
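
To illustrate the add-only policy from the hook's side, here is a rough
sketch of a hook that touches only the one header it knows about and
passes every other line through untouched, so headers added later do
not break it.  The header name X-Wget-Filename comes from the quoted
suggestion; X-Wget-Future and the function name are invented for the
example.  It deliberately assumes one header per line with exact-case
names, which is the kind of simplification a real HTTP header parser
would not be allowed to make.

```shell
# Hypothetical hook: rewrites only the X-Wget-Filename header and leaves
# every line it does not recognize unchanged.
rewrite_filename_header() {
  # Keep the header name, then drop the first path component of its value.
  sed 's|^\(X-Wget-Filename: \)[^/]*/|\1|'
}

printf '%s\n' \
  'Content-Type: text/html' \
  'X-Wget-Filename: example.com/dir/index.html' \
  'X-Wget-Future: 1' | rewrite_filename_header
# prints:
#   Content-Type: text/html
#   X-Wget-Filename: dir/index.html
#   X-Wget-Future: 1
```

The hook survives the addition of X-Wget-Future, but it would break the
moment X-Wget-Filename were removed or renamed, which is why each header
is a commitment.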

> Maybe you wait for a GO from Guiseppe...

For the record, I don't intend to implement the more general option.  I
affirm that it would be useful, but that is as far as it goes from me!
