[Bug-wget] Some possible Inconsistencies in WGET 1.15
From: Halliday, Andrew
Subject: [Bug-wget] Some possible Inconsistencies in WGET 1.15
Date: Wed, 23 Jul 2014 03:10:27 +0000
Hi,
I've recently undertaken an exercise to map the command line switches of WGET
1.15 alongside the commands available in the config file I specify with
--config=FILE. See attached file for detail.
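A rough sketch of how the names can be lined up for such a comparison. It assumes a switch and a config command correspond when their names match after case, dashes and underscores are ignored, which will not hold for every pair. The function name normalize is made up for illustration, and only the long-option form is handled.

```shell
#!/bin/sh
# Reduce a long option (as printed by --help) or a wgetrc line to a bare name.
# Assumption: names correspond once case, dashes and underscores are ignored.
normalize() {
  printf '%s\n' "$1" |
    sed -e 's/^[#-]*//' -e 's/[ =].*$//' |  # drop leading "--" or "#", then any value/description
    tr 'A-Z' 'a-z' | tr -d '_-'             # lowercase, remove underscores and dashes
}

normalize "--accept-regex=REGEX"   # prints: acceptregex
normalize "#dot_bytes = n"         # prints: dotbytes
```

Names that appear on one side only after this step are the candidates for a missing counterpart.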
In doing this, I've noticed that some command line switches have no matching
command for the config file:
  -a, --append-output=FILE    append messages to FILE.
      --report-speed=TYPE     Output bandwidth as TYPE. TYPE can be bits.
      --unlink                remove file before clobber.
      --method=HTTPMethod     use method "HTTPMethod" in the header.
      --body-data=STRING      Send STRING as data. --method MUST be set.
      --body-file=FILE        Send contents of FILE. --method MUST be set.
      --content-on-error      output the received content on server errors.
      --https-only            only follow secure HTTPS links
      --preserve-permissions  preserve remote file permissions.
      --accept-regex=REGEX    regex matching accepted URLs.
      --reject-regex=REGEX    regex matching rejected URLs.
      --regex-type=TYPE       regex type (posix).
      --warc-file=FILENAME    save request/response data to a .warc.gz file.
      --warc-header=STRING    insert STRING into the warcinfo record.
      --warc-max-size=NUMBER  set maximum size of WARC files to NUMBER.
      --warc-cdx              write CDX index files.
      --warc-dedup=FILENAME   do not store records listed in this CDX file.
      --no-warc-compression   do not compress WARC files with GZIP.
      --no-warc-digests       do not calculate SHA1 digests.
      --no-warc-keep-log      do not store the log file in a WARC record.
      --warc-tempdir=DIRECTORY  location for temporary files created by the
                              WARC writer.
I've also noticed that some config file commands are not available as
command line switches:
#dot_bytes = n
#dot_spacing = n
#dots_in_line = n
#netrc = on/off
#robots = on/off
#show_all_dns_entries = on/off
I thought it might help to point out these apparent inconsistencies.
In particular, it would be nice to be able to:
* Ignore robots as a command line switch
* Apply regex to URLs in the config file
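On the first point, the manual describes -e (--execute), which runs a command as if it were a line in the config file, so something like the sketch below may already cover it. The URL is a placeholder, the function name robots_off_cmd is made up, and I have not checked that every command listed above is accepted this way in 1.15.

```shell
#!/bin/sh
# Sketch: pass config-file commands on the command line with -e.
# robots_off_cmd is a hypothetical helper; it only prints the command line,
# so nothing is fetched when this script runs.
robots_off_cmd() {
  echo wget -e robots=off -e netrc=off --recursive "$1"
}

robots_off_cmd "http://example.com/"
# prints: wget -e robots=off -e netrc=off --recursive http://example.com/
```

A dedicated switch would still be easier to find in --help than the -e form.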
Hope this helps!
Andrew
Attachment: WGetSampleSettings