Re: [Bug-wget] How to crawl multiple URLs enlisted in a file using single wget connection ?

From: Avinash
Subject: Re: [Bug-wget] How to crawl multiple URLs enlisted in a file using single wget connection ?
Date: Wed, 30 Jun 2010 07:51:40 +0530
On Tue, Jun 29, 2010 at 11:55 PM, Micah Cowan <address@hidden> wrote:
> On 06/29/2010 03:36 AM, Avinash wrote:
> > Hi All,
> >
> > I have a file with hundreds of URLs present in it. One per line.
> > I am using -i option of wget to crawl all these URLs.
> >
> > But it seems that wget is creating a new connection per URL.
>
> Are these primarily HTTP URLs, or FTP URLs? If HTTP, do the host names
> differ in any way?
>
All these are HTTP/HTTPS URLs and hostnames may differ from URL to URL.
>
> I believe this is a known issue for FTP URLs, and may be due to the way
> the FTP module was designed (I may be wrong, but IIRC the FTP module
> doesn't know how to reuse connections except in a straight recursion).
> This could be fixed, but just hasn't been a top priority.
>
> The HTTP module certainly knows how to reuse links, so further
> investigation might be required if that's the case.
>
Thanks, I will investigate further. Now at least I know that the HTTP
module is capable of reusing connections.
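
For reference, a minimal way to check this (the file name and URLs below
are placeholders, not my actual list):

```shell
# Build a sample input file: one URL per line, as -i expects.
cat > urls.txt <<'EOF'
http://example.com/a.html
http://example.com/b.html
EOF

# -i reads the URLs from the file; -d enables debug output, which
# logs "Reusing existing connection" when a keep-alive HTTP
# connection is reused for the next URL on the same host.
if command -v wget >/dev/null 2>&1; then
    wget -d -i urls.txt -O /dev/null 2>&1 \
        | grep -i 'reusing existing connection' \
        || echo "no connection reuse observed"
fi
```

Note that reuse can only happen between consecutive URLs on the same
host; URLs on different hosts will always need a new connection.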
>
> --
> Micah J. Cowan
> http://micah.cowan.name/
>
--
-Avinash