bug-wget



From: Ángel González
Subject: Re: [Bug-wget] wget downloads index.html unnecessarily and halts batch script (Windows)
Date: Wed, 23 Sep 2015 01:04:18 +0200
User-agent: Thunderbird

On 22/09/15 19:10, El Gato wrote:
Hi, everyone.

I am having trouble with wget64 on Windows. I am using a batch script to download files from a host:

@echo OFF
FOR /L %%i in (1, 1, 9999) DO (
    cls
    echo Downloading file %%i
    wget64.exe -A pdf,chm -e robots=off --progress=bar --show-progress -r -np -nd -nc -HDit-ebooks.info,filepi.com --content-disposition -a wget.log it-ebooks.info/book/%%i/
)

wget downloads index.html (which seems unnecessary to me), then proceeds to the hosted file and downloads it if it does not already exist at the destination. However, it then fails to retrieve the index.html of the next book, so the next download never starts.

Is it really necessary to download index.html? If so, how can I tell wget to erase it and download the new one every time?

wget should be downloading index.html and then deleting it, since you are only accepting pdf and chm files (it needs index.html to find the links to those files). That is what it does here.

As a bit of unsolicited advice, I would recommend printing the URLs (replace the contents of the for loop with an echo) and feeding the list to wget with -i - (which reads the URLs from standard input). That way wget can reuse the open connection instead of running nearly 10000 separate instances (and connecting to the server nearly 10000 times).
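
A minimal sketch of that idea, reusing the options from the original script. It is a slight variation on the suggestion: the URLs are written to a file and read with -i urls.txt rather than piped to -i -, which sidesteps any quirks of piping a FOR block inside a batch file. The file name urls.txt and the explicit http:// scheme are assumptions added for illustration, and this has not been tested against the site.

```bat
@echo OFF
REM Build the complete URL list once, one URL per line.
REM urls.txt is a hypothetical file name; http:// is added explicitly as an assumption.
(
    FOR /L %%i in (1, 1, 9999) DO echo http://it-ebooks.info/book/%%i/
) > urls.txt

REM A single wget process works through the whole list, so it can reuse connections.
wget64.exe -A pdf,chm -e robots=off --progress=bar --show-progress -r -np -nd -nc -HDit-ebooks.info,filepi.com --content-disposition -a wget.log -i urls.txt
```

The per-book cls and echo lines are dropped here because wget is no longer started once per book; progress for each file still goes to the console through --show-progress, and the rest of the output goes to wget.log.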




