Re: [Bug-wget] Re: How to ignore errors with time stamping


From: Andre Majorel
Subject: Re: [Bug-wget] Re: How to ignore errors with time stamping
Date: Fri, 12 Dec 2008 14:21:11 +0100
User-agent: Mutt/1.5.17+20080114 (2008-01-14)

On 2008-12-12 12:21 +0100, Morten Lemvigh wrote:
> Andre Majorel wrote:
>
>> To work around that kind of brokenness, Wget would have to ignore
>> the 500 error and fall back on parsing the local file. That should
>> probably not be made the default behaviour, though.
>
> Ah, I see! Thank you for your answer. I guess I'll just have to
> script my way around it then...

Well, Micah may decide to add an option for that, but apparently
Wget is feature-frozen pending release 1.12.

You could try scripting something around the output of

  http://www.teaser.fr/~amajorel/misc/htmlhref

I wouldn't swear it's bug-free, but it seems to work for me. Since
it derives the base URL from the pathname of the local file, you
want to call it from the same directory you ran wget -r from:

  wget -x 'http://eur-lex.europa.eu/JOHtml.do?uri=OJ:L:2008:321:SOM:DA:HTML' &&
  htmlhref 'eur-lex.europa.eu/JOHtml.do?uri=OJ:L:2008:321:SOM:DA:HTML' |
    xargs wget -x -p -N
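
To illustrate why the working directory matters, here is a minimal
sketch of the path-to-URL mapping described above. It is not taken
from htmlhref itself: the function name path_to_url is made up for
this example, and the http:// scheme is an assumption, since a path
saved by wget -x does not record which scheme was used.

```shell
# Hypothetical helper, not part of htmlhref: map a path created by
# `wget -x` (host/dir/file, relative to the directory wget ran in)
# back to the URL it was fetched from.
path_to_url() {
    # Drop a leading "./" if there is one, then prepend the scheme.
    # Assumption: the page was fetched over plain http.
    printf 'http://%s\n' "${1#./}"
}
```

Called with a path that is not relative to the directory wget was
run from (an absolute path, say), a mapping like this would produce
a bogus host name, hence the advice about the directory.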

In your case, however, hacking Wget to ignore 500 after HEAD could
be the simplest solution. Have you looked at curl? Maybe it does
what you want.
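
If you want to try curl, something along these lines might be a
starting point. It is an untested sketch: the function names
url_to_path and fetch_if_newer are made up for this example. With -z,
curl sends a conditional GET (If-Modified-Since, taken from the local
file's modification time) instead of a separate HEAD request. Whether
that server answers a conditional GET sanely is something you would
have to check.

```shell
# Hypothetical helper: local path that `wget -x` would use for a URL
# (the URL with its scheme stripped).
url_to_path() {
    printf '%s\n' "${1#*://}"
}

# Hypothetical helper: download a URL only if it is newer than the
# local copy, using curl's time condition rather than a HEAD request.
fetch_if_newer() {
    url=$1
    file=$(url_to_path "$url")
    if [ -e "$file" ]; then
        # -z FILE : only fetch if modified after FILE's mtime
        # -R      : give the saved file the remote modification time
        # -f      : exit non-zero on HTTP errors such as 500
        curl -f -s -S -R --create-dirs -z "$file" -o "$file" "$url"
    else
        # No local copy yet: plain download.
        curl -f -s -S -R --create-dirs -o "$file" "$url"
    fi
}
```

You would then feed it the URLs printed by htmlhref, one call per
URL, and decide in your script what to do when it returns non-zero.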

-- 
André Majorel <URL:http://www.teaser.fr/~amajorel/>



