bug-wget

Re: [Bug-wget] action on "not able to connect to proxy"


From: Mohan gupta
Subject: Re: [Bug-wget] action on "not able to connect to proxy"
Date: Thu, 15 Oct 2009 13:00:51 +0530

Sorry for writing again, but I just want to bring the problem below to
your attention; I really need to get it sorted out as soon as possible.
I haven't received any reply to my question in the previous post in this
thread. If possible, please look into it and help me out.
Thanks and regards
Mohan Gupta

On Tue, Oct 13, 2009 at 12:39 PM, Mohan gupta <address@hidden> wrote:
> Sorry for sending a confusing message.
>
> What I meant was: what do the timeout retries actually do? Say I give
> wget the URL of a remote site and that URL is no longer online,
> yielding something like a "page not found" error. In that case, I
> believe wget will retry up to "max tries" times, as set in the config
> file, and after that it will blacklist the URL and move on to the next
> one in the queue. That is absolutely fine.
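>
> For reference, the retry knob I mean is sketched below (the option
> names come from the wget manual; the URL is just a placeholder):
>
>     # retry a URL at most 5 times, backing off up to 10s between tries
>     wget --tries=5 --waitretry=10 http://example.com/page.html
>
>     # the equivalent setting in ~/.wgetrc
>     tries = 5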
>
> But consider a case where I have a URL queue (with lots of URLs). Now,
> if the proxy server is unreachable, will wget exit saying "unable to
> connect to proxy server", or will it, as above, make its max tries and
> blacklist the URLs one after the other?
>
> Note the difference between the two cases: in one, the URL is
> inactive; in the other, we are unable to connect to the Internet in
> the first place.
>
> So what is the action in the second case?
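>
> For illustration: would a pre-flight check along these lines be the
> right way to guard against it? (A sketch, assuming the proxy is set
> via http_proxy and probing a known-good URL:)
>
>     # bail out before touching the queue if the proxy is unreachable
>     wget -q --spider --tries=1 --timeout=10 http://www.gnu.org/ ||
>         { echo "proxy unreachable, not starting the crawl" >&2; exit 1; }
>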
> Mohan
>
> On 10/13/09, Micah Cowan <address@hidden> wrote:
>>
>> Mohan gupta wrote:
>>> Hello everyone,
>>> Greetings on my first mail to the list!
>>>
>>> I am using wget as a full web crawler. I am behind my university
>>> proxy, and I want to know what action wget takes when the proxy
>>> server is down. I have configured wget to use a proxy server and to
>>> make at most 5 attempts when retrieving a URL.
>>> In my system, wget takes URLs from a database and retrieves them.
>>> When the proxy server is down, I expect wget to exit() by itself
>>> rather than keep picking URLs from the database and timing out on
>>> each of them.
>>>
>>> Is that what it really does?
>>>
>>> I do not want wget to erroneously destroy my database of URLs!
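>>>
>>> Roughly, my setup is the sketch below; fetch_urls_from_db is a
>>> hypothetical stand-in for the database step:
>>>
>>>     for url in $(fetch_urls_from_db); do
>>>         wget --tries=5 "$url"
>>>     done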
>>
>> How in the world can wget destroy anything? Wget doesn't even know about
>> your database of URLs. If you're just throwing them out after a download
>> attempt, without trying to actually verify the downloads, then how is
>> that Wget's fault?
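>>
>> (To illustrate the point: a driver can check wget's exit status before
>> dropping a URL; a sketch, with mark_done standing in for a database
>> step:)
>>
>>     if wget --tries=5 "$url"; then
>>         mark_done "$url"    # remove from the queue only on success
>>     fi
>>     # wget 1.12 and later also distinguish failure classes in the exit
>>     # status (e.g. 4 indicates a network failure)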
>>
>> And I'm unsure how you expect wget to behave one way when the proxy is
>> "on" (time out and try again), and another way when it's "off". How is wget
>> supposed to know the difference? If the proxy is down, then ideally the
>> network should be configured to send back "No route to host" packets,
>> etc. But if wget doesn't receive any packets at all in response to
>> connection attempts, then how is it supposed to tell the difference
>> between a temporary network failure and a switched-off machine? You can
>> of course adjust wget's setting for how long it is willing to wait for a
>> timeout (which is currently quite liberal), but there is obviously no
>> way for it to understand that the machine is switched off, and that it
>> shouldn't bother continuing to try.
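>>
>> For example, the relevant knobs (a sketch; pick values to taste):
>>
>>     # fail fast rather than waiting out the generous defaults
>>     wget --connect-timeout=10 --read-timeout=30 --tries=2 URL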
>>
>> --
>> Micah J. Cowan
>> Programmer, musician, typesetting enthusiast, gamer.
>> Maintainer of GNU Wget and GNU Teseq
>> http://micah.cowan.name/
>>
>



