Re: [Bug-wget] wget with the -i option.

From: Micah Cowan
Subject: Re: [Bug-wget] wget with the -i option.
Date: Wed, 28 Apr 2010 06:28:22 -0700
User-agent: Thunderbird (X11/20100317)

Comments below.

Ray Sterner wrote:
>   Problem using wget with the -i option
>   -------------------------------------
>   I don't think this is a bug but I have been trying to find a solution
>   for some time and have not been able to.  I'm hoping there will be an
>   option that I am overlooking or misunderstanding.
>   The problem is that I am trying to download a set of files from an ftp
>   site.  The files are Ocean Color related data from Goddard.  I have a
>   small test area set up to generate test files, but the real case will
>   have a lot more files.  New files appear on the ftp server as satellites
>   collect the data.  When the processing system is working most of the files
>   will have been downloaded and only new ones will be needed.
>   I can get all the files on the site with a command like
>         wget -rnd ftp://xyz 
>   in about 1.5 minutes.
>   If I put the URLs of all the files in the text file download.txt and try
>         wget -rnd -i download.txt
>   it gets the first two files and hangs on the third.
>   This site only allows two connections at a time from a host, so that must
>   be why two files are no problem.
>   I can get all the files to download using a command like
>         wget -rnd --timeout 2 -w 5 -i download.txt
>   but that takes about 1.5 hours.
>   Since getting all the files is fast I know it is possible to do.
>   What am I missing?
>   Can I tell wget that only two connections at a time are allowed
>   for the ftp site?

Actually, wget only ever opens one connection to a given host at a
time; it doesn't support "accelerated downloads" or that sort of thing.

However, I think it may currently close and reopen the connection for
each individual line in the file. It may be that the server disallows
this sort of behavior as well, or that it hasn't finished shutting down
the first couple of connections before wget starts the third.
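
To make the distinction concrete, here is a rough Python sketch of the
alternative to reconnecting per line: log in once per host, then fetch
every listed file over that same session. This is not wget, and it is
not a tested fix for this particular server. The names group_by_host
and fetch_all are made up for illustration. Anonymous login, the
30-second timeout, and treating URL paths as relative to the login
directory are all assumptions.

```python
import ftplib
import posixpath
from collections import defaultdict
from urllib.parse import unquote, urlsplit


def group_by_host(urls):
    """Group FTP URLs by (host, port) so each server needs one connection."""
    groups = defaultdict(list)
    for url in urls:
        parts = urlsplit(url.strip())
        if parts.scheme != "ftp" or not parts.hostname:
            continue  # skip blank lines and non-FTP entries
        # Assumption: the URL path is relative to the login directory.
        path = unquote(parts.path).lstrip("/")
        if path and not path.endswith("/"):
            groups[(parts.hostname, parts.port or 21)].append(path)
    return dict(groups)


def fetch_all(list_file):
    """Download every file named in list_file, one FTP session per host."""
    with open(list_file) as fh:
        groups = group_by_host(fh)
    for (host, port), paths in groups.items():
        with ftplib.FTP() as ftp:  # the session is closed on exit
            ftp.connect(host, port, timeout=30)
            ftp.login()  # assumption: anonymous access
            for path in paths:
                # Save under the bare file name, similar in spirit to -nd.
                with open(posixpath.basename(path), "wb") as out:
                    ftp.retrbinary("RETR " + path, out.write)
```

Calling fetch_all("download.txt") would then make a single login for
the whole list, so the server's per-host connection limit should not
come into play, assuming the server keeps the session open that long.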

I'm afraid I don't know of an obvious workaround for your problem.

Micah J. Cowan

