|Subject:||[Bug-wget] Timeout switch not working|
|Date:||Sun, 30 Sep 2012 18:44:53 +0100|
|User-agent:||Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/20120907 Thunderbird/15.0.1|
Hi, I am having some trouble with a wget task that performs an HTTPS download of a large file (a 2.9GB 7Zip archive) every night. It is scheduled to run as part of an SSIS package, using the SSIS Execute Process task. The wget arguments passed are as follows (I've replaced sensitive information with the "x" character):
--http-user=SMP\xxxxxxxx --http-passwd=xxxxxxxx --output-document=C:\DB_Downloads\xxxxxxxx.7z --output-file=C:\DB_Downloads\log.txt --continue --timeout=300 --tries=20 --no-check-certificate https://download.xxxxxxxx.net/xxxxxxxx/xxxxxxxx.7z
In a typical run, the SSIS package starts and wget begins logging the progress of the download. After some random interval (there is no discernible pattern in terms of time), the download stops progressing. For example, the download may start at 01:00 and stop at 01:07, as indicated by the modified date/time of the log file. Although the timeout switch should make wget give up and retry after 5 minutes without receiving data, this never happens. The wget process has been observed to run for hours (e.g. until after 9am) without progressing the download by a single byte, and without timing out or reporting any kind of error. The only way I can force a restart is to put a timeout setting on the SSIS Execute Process task and use SQL Server Agent to perform a TASKKILL on any active wget processes and restart the package. Invariably, when a new instance of the SSIS package starts, the download continues from where it left off and carries on, perhaps to completion, perhaps not.
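For illustration only, the external kill-and-restart workaround described above could be approximated by a small watchdog that watches the output file instead of relying on a fixed task timeout. This is a minimal sketch, not part of the original setup: the function names `file_size` and `run_with_stall_watchdog` and the parameters `stall_seconds` and `poll_seconds` are hypothetical, and it assumes that a stalled download shows up as an output file that stops growing.

```python
import os
import subprocess
import time


def file_size(path):
    """Return the size of path in bytes, or 0 if it does not exist yet."""
    try:
        return os.path.getsize(path)
    except OSError:
        return 0


def run_with_stall_watchdog(cmd, output_path, stall_seconds=300, poll_seconds=5):
    """Run cmd and kill it if output_path stops growing for stall_seconds.

    Returns the process exit code, or None if the process was killed
    because no growth was observed (hypothetical convention for this sketch).
    """
    proc = subprocess.Popen(cmd)
    last_size = file_size(output_path)
    last_progress = time.monotonic()
    while True:
        try:
            # Returns as soon as the process exits on its own.
            return proc.wait(timeout=poll_seconds)
        except subprocess.TimeoutExpired:
            pass  # still running; check for progress below
        size = file_size(output_path)
        now = time.monotonic()
        if size != last_size:
            # The file grew (or changed), so reset the stall clock.
            last_size, last_progress = size, now
        elif now - last_progress >= stall_seconds:
            # No growth for too long: terminate the process.
            proc.kill()
            proc.wait()
            return None
```

A caller could invoke this in a loop with the same wget arguments (including `--continue`) and stop once the return value is 0, which mirrors the repeated SQL Server Agent job steps without a fixed 90-minute window.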
I have run a number of ping tests over recent nights, which reveal that there is no significant loss of connectivity with the HTTPS server.
I am able to achieve a successful download of the file if I configure enough SQL Server Agent job steps to kill wget and launch another instance of the SSIS package. Sometimes the file will download successfully with one or two runs of the SSIS package, sometimes it needs up to five attempts. SSIS is currently configured to allow 90 minutes for the download and the client site has a 10Mbps leased line to the Internet. There's no way to predict how much of the 90 minutes will be spent actively progressing the download and how much will be spent idle. For example, download attempt 1 may last for 30 minutes then become idle, download attempt 2 may last for the full 90 minutes, download attempt 3 may last for just 1 minute, then download attempt 4 may successfully complete the download with a further 20 minutes.
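To put the 90-minute window in context, a rough lower bound on the transfer time can be worked out from the figures above. This is a back-of-the-envelope sketch: it assumes "2.9GB" means 2.9 × 10^9 bytes and that the full 10Mbps is available with no protocol overhead, and the function name `min_transfer_minutes` is made up for the example.

```python
def min_transfer_minutes(size_bytes, link_bits_per_second):
    """Best-case transfer time in minutes, ignoring all overhead."""
    return size_bytes * 8 / link_bits_per_second / 60


# 2.9 GB over a 10 Mbps line (decimal units assumed).
minutes = min_transfer_minutes(2.9e9, 10e6)
print(f"{minutes:.1f} minutes")  # about 38.7 minutes
```

Under those assumptions an uninterrupted download needs a little under 40 minutes, so a single 90-minute run leaves room for roughly 50 minutes of idle time before it can no longer finish.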
I am using the Win32 wget.exe from SourceForge and it's been working fine for the best part of two years. I think something has changed on the side of the download server, but this is the domain of a third party. I've wondered if they have recently put in some kind of IPS that is randomly blocking the data packets, but this is just speculation and the third party involved claims nothing has changed on their side. My question is: why does the timeout switch not help in this scenario, and is this therefore a bug?
I have log files (with debug output) available, as well as the wgetrc file if this is useful. Any help or advice will be much appreciated.
Thanks, Ian