
[Bug-wget] Help: 'wget --page-requisites' is slow


From: David Bodin
Subject: [Bug-wget] Help: 'wget --page-requisites' is slow
Date: Sat, 15 Jun 2019 15:49:03 -0700

Hello wget community,

*Goal*
My goal is to download a single webpage to be fully functional offline in
the same time it takes a browser to request and show the page.

*Problem*
The following command downloads a page and makes it fully functional
offline, but it takes approximately 35 seconds, whereas the browser requests
and shows the page in about 5 seconds. Can someone please help me
understand why my *wget* command is taking *so much longer* and how I can
make it faster? Or are there any forums or chat groups where I can seek
help? Sincere thanks in advance for any help anyone can provide.

*wget --page-requisites --span-hosts --convert-links --adjust-extension
--execute robots=off --user-agent Mozilla --random-wait
https://www.invisionapp.com/inside-design/essential-steps-designing-empathy/*
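To compare variants of the command on equal terms, the run time can be
measured with a small wrapper. This is only a sketch: the function name
elapsed_seconds is made up for illustration, and it assumes a 'date' that
supports '+%s' (GNU and BSD both do).

```shell
# Hypothetical helper: run a command and print how many whole seconds it took.
# The command's own output is discarded so that only the number is printed.
elapsed_seconds() {
    start=$(date +%s)        # seconds since the epoch, before the run
    "$@" >/dev/null 2>&1     # run the command exactly as given
    end=$(date +%s)          # seconds since the epoch, after the run
    echo $((end - start))
}
```

For example, 'elapsed_seconds wget --page-requisites ... URL' prints the
duration of one run, which makes it easy to check whether adding or removing
a single option changes anything.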

*More info & attempted solutions*

   1. I removed '*--random-wait*' because I thought it might be adding time
   for each file request, but this did nothing.
   2. I thought HTTPS might slow it down with extra round trips for each
   file, so I added '*--no-check-certificate*', but this did nothing.
   3. I read there could be an issue with IPv6 so I added '*--inet4-only*',
   but this did nothing.
   4. I read that DNS could slow things down, so I added '*--no-dns-cache*',
   but this did nothing.
   5. I thought perhaps *wget* was downloading the assets sequentially, one
   at a time, so I tried running the command in between 3 and 16
   concurrent processes, removing '*--convert-links*' and adding
   '*--no-clobber*', in the hope that multiple files would be downloaded
   at the same time. After all files were downloaded, I could then run the
   command again, removing '*--no-clobber*' and '*--page-requisites*' and
   adding '*--convert-links*', to make the page fully functional offline.
   But this did nothing. I also thought that multiple processes would speed
   things up by hiding the latency of the HTTPS checks, doing several at a
   time, but I didn't observe this.
   6. I read an article about running the command as the root user in case
   there were any limits on a given user, but this did nothing.
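
The concurrency idea in item 5 can also be sketched a different way: instead
of starting the same recursive command several times, fetch a known list of
asset URLs with several wget processes at once. This is a rough sketch under
assumptions, not a tested fix: it assumes the asset URLs are already listed
one per line in a file, that xargs supports '-P' (GNU and BSD do), and the
names fetch_parallel and FETCH are made up for illustration.

```shell
# Hypothetical helper: fetch every URL listed in a file, several at a time.
# Usage: fetch_parallel URL_FILE [JOBS]
# FETCH is the command run once per URL. It defaults to wget and can be
# overridden (for example with 'echo') for a dry run.
fetch_parallel() {
    url_file=$1
    jobs=${2:-8}   # number of simultaneous downloads, 8 if not given
    # -n 1: pass one URL per invocation; -P: run up to $jobs invocations at once.
    # --no-clobber skips files that are already on disk;
    # --force-directories saves each file under a host/path directory tree.
    # FETCH is deliberately unquoted so that its options split into words.
    xargs -n 1 -P "$jobs" ${FETCH:-wget --quiet --no-clobber --force-directories} < "$url_file"
}
```

A dry run such as 'FETCH=echo; fetch_parallel urls.txt 4' only prints the
URLs, which is a quick way to check the list before downloading anything.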

Sincere thanks in advance, again,
Dave

