Re: [Bug-wget] [PATCH] Add option to write URL rejections to a tab-delim
From: Giuseppe Scrivano
Subject: Re: [Bug-wget] [PATCH] Add option to write URL rejections to a tab-delimited CSV log.
Date: Thu, 06 Aug 2015 08:16:26 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
Jookia <address@hidden> writes:
> * main.c: Add "--rejected-log" option.
> * init.c: Add "rejectedlog" command.
> * options.h: Add "rejected_log" parameter string.
> * wget.texi: Add brief documentation on new --rejected-log option.
> * recur.c: Optionally log details of URLs not traversed.
> Add reject_reason enum.
> (download_child_p -> download_child): Return a reject_reason.
> (descend_redirect_p -> descend_redirect): Return a reject_reason.
> (retrieve_tree): Support logging reasons for rejection.
> Add write_reject_log_header that writes a CSV format header to a file.
> Add write_reject_log_url that writes a url struct to a file in CSV format.
> Add write_reject_log_reason that writes the URL and parent URL as well as the
> rejection reason to a CSV file.
> * Test--rejected-log.px: Add a basic test for the --rejected-log command.
> * tests/Makefile.am: Run Test--rejected-log.px.
>
> This allows you to figure out why URLs are being rejected and some context
> around it. CSV is used as the output format since it can be easily parsed.
> It's delimited by tabs instead of commas to allow using all (quoted) URL
> characters and includes column names which may be used for compatibility.
> ---
> doc/wget.texi | 5 ++
> src/init.c | 2 +
> src/main.c | 3 +
> src/options.h | 2 +
> src/recur.c | 189 ++++++++++++++++++++++++++++++++++++--------
> tests/Makefile.am | 1 +
> tests/Test--rejected-log.px | 138 ++++++++++++++++++++++++++++++++
> 7 files changed, 308 insertions(+), 32 deletions(-)
> create mode 100755 tests/Test--rejected-log.px
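As an illustration of why a header row plus tab delimiters is convenient for consumers, here is a minimal Python sketch of reading such a log by column name. The column names (REASON, URL, PARENT_URL), the reason values and the sample rows are hypothetical and are not taken from the patch; only the general shape (tab-delimited, with a header line) follows the description quoted above.

```python
import csv
import io
from collections import Counter

# Hypothetical sample log. The column names and reason values below are
# assumptions made for this sketch, not the names the patch actually writes.
SAMPLE_LOG = (
    "REASON\tURL\tPARENT_URL\n"
    "DOMAIN\thttp://other.example/a,b\thttp://site.example/index.html\n"
    "ROBOTS\thttp://site.example/private\thttp://site.example/index.html\n"
    "DOMAIN\thttp://other.example/c\thttp://site.example/page.html\n"
)


def read_rejections(stream):
    """Return one dict per log row, keyed by the header row's column names."""
    # Tabs are the only delimiter; commas inside URLs are ordinary characters.
    reader = csv.DictReader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)


def count_reasons(rows):
    """Count how many URLs were rejected for each reason."""
    return Counter(row["REASON"] for row in rows)


rows = read_rejections(io.StringIO(SAMPLE_LOG))
counts = count_reasons(rows)
for reason, n in counts.most_common():
    print(f"{reason}\t{n}")
```

Because rows are looked up by header name rather than by position, a reader written this way would keep working if columns were later added or reordered, which is presumably what the "column names which may be used for compatibility" remark is aiming at.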
Thanks for the patch. I have nothing against it; if nobody complains
before then, I am going to push it later today.
I forgot to ask you about writing a Python test under testenv/ instead of
tests/Test--rejected-log.px, as our plan is to get rid of the old Perl
test suite under tests/. If you have some extra time, would you please do
that as a separate patch?
Regards,
Giuseppe