From: Micah Cowan
Subject: Re: [Bug-wget] How to crawl multiple URLs listed in a file using a single wget connection?
Date: Tue, 29 Jun 2010 11:25:28 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4

On 06/29/2010 03:36 AM, Avinash wrote:
> Hi All,
> 
> I have a file with hundreds of URLs in it, one per line.
> I am using the -i option of wget to crawl all of these URLs.
> 
> But it seems that wget is creating a new connection per URL.

Are these primarily HTTP URLs, or FTP URLs? If HTTP, do the host names
differ in any way?
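
(For reference, I'm assuming an invocation along these lines, with
"urls.txt" standing in for whatever file actually holds the list, one
URL per line:

    wget -i urls.txt

A persistent (keep-alive) connection can only be reused when
consecutive URLs point at the same host and port, which is why the
host names matter here.)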

I believe this is a known issue for FTP URLs, and may be due to the way
the FTP module was designed (I may be wrong, but IIRC the FTP module
doesn't know how to reuse connections except in a straight recursion).
This could be fixed, but just hasn't been a top priority.

The HTTP module certainly knows how to reuse connections, so further
investigation might be required if that's the case.
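
(One way to confirm what's happening is to watch wget's output for
connection handling; assuming the same hypothetical urls.txt as above:

    wget -i urls.txt 2>&1 | grep -i connect

When an HTTP connection is reused, the output should show a line like
"Reusing existing connection to host:port." instead of a fresh
"Connecting to ..." line for every URL.)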

-- 
Micah J. Cowan
http://micah.cowan.name/


