[Bug-wget] Miscellaneous thoughts & concerns


From: Jeffrey Fetterman
Subject: [Bug-wget] Miscellaneous thoughts & concerns
Date: Fri, 6 Apr 2018 16:30:57 -0500

Thanks to the fix that Tim posted on gitlab, I've got wget2 running just
fine in WSL. Unfortunately it means I don't have TCP Fast Open, but given
how fast it's downloading a ton of files at once, it seems like it must've
been only a small gain.


I've come across a few annoyances, however.

1. There doesn't seem to be any way to control the size of the download
queue, which I dislike because I want to download a lot of large files at
once and I wish it'd just focus on a few at a time, rather than over a
dozen.

2. Doing a TLS resume causes a 'Failed to write 305 bytes (32: Broken
pipe)' error to be thrown. It seems to be related to how certificate
verification is handled upon resume, but I was worried at first that the
WSL problems were rearing their ugly head again.

3. --no-check-certificate causes significantly more errors to be thrown
about the certificate issuer not being trusted (even though it's not
supposed to be doing anything related to certificates).

4. --force-progress doesn't seem to do anything despite being recognized as
a valid parameter, so using it in conjunction with -nv is no longer
beneficial.

5. The documentation is unclear as to how to disable things that are
enabled by default. Am I to assume that --robots=off is equivalent to -e
robots=off?

6. The documentation doesn't document being able to use 'M' for chunk-size,
e.g. --chunk-size=2M

7. The documentation's instructions regarding --progress are all wrong.

8. The http/https proxy options return as unknown options despite being in
the documentation.


Lastly I'd like someone to look at the command I've come up with and offer
me critiques (and perhaps help me address some of the remarks above if
possible).

#!/bin/bash

wget2 \
      `#WSL compatibility` \
      --restrict-file-names=windows --no-tcp-fastopen \
      \
      `#No certificate checking` \
      --no-check-certificate \
      \
      `#Scrape the whole site` \
      --continue --mirror --adjust-extension \
      \
      `#Local viewing` \
      --convert-links --backup-converted \
      \
      `#Efficient resuming` \
      --tls-resume --tls-session-file=./tls.session \
      \
      `#Chunk-based downloading` \
      --chunk-size=2M \
      \
      `#Swiper no swiping` \
      --robots=off --random-wait \
      \
      `#Target` \
      --domains=example.com example.com
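
For anyone unfamiliar with the `#...` lines in the script above: each one is a backtick command substitution whose body is only a comment, so it expands to nothing and the trailing backslash carries on to the next line. Here is a minimal sketch of the same trick without wget2. The words alpha, beta and gamma are placeholders I made up for illustration, not real options.

```shell
#!/bin/bash

# Build an argument list with inline "comments", the same way the
# wget2 command above does. Each `#...` substitution yields an empty
# string, which the shell drops because it is unquoted.
set -- \
      `#first group` \
      alpha beta \
      \
      `#second group` \
      gamma

# Only the three placeholder words remain as positional parameters.
args="$*"
echo "$#: $args"
```

The cost is one extra substitution per comment, which is negligible for a script like this. Putting the options in an array with ordinary # comments between the elements would be another way to get the same grouping.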

