bug-wget

Re: [Bug-wget] [PATCH] Add option to write URL rejections to a tab-delim


From: Giuseppe Scrivano
Subject: Re: [Bug-wget] [PATCH] Add option to write URL rejections to a tab-delimited CSV log.
Date: Thu, 06 Aug 2015 08:16:26 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)

Jookia <address@hidden> writes:

>  * main.c: Add "--rejected-log" option.
>  * init.c: Add "rejectedlog" command.
>  * options.h: Add "rejected_log" parameter string.
>  * wget.texi: Add brief documentation on new --rejected-log option.
>  * recur.c: Optionally log details of URLs not traversed.
>    Add reject_reason enum.
>    (download_child_p -> download_child): Return a reject_reason.
>    (descend_redirect_p -> descend_redirect): Return a reject_reason.
>    (retrieve_tree): Support logging reasons for rejection.
>    Add write_reject_log_header that writes a CSV format header to a file.
>    Add write_reject_log_url that writes a url struct to a file in CSV format.
>    Add write_reject_log_reason that writes the URL and parent URL as well as
>    the rejection reason to a CSV file.
>  * Test--rejected-log.px: Add a basic test for the --rejected-log command.
>  * tests/Makefile.am: Run Test--rejected-log.px.
>
> This allows you to figure out why URLs are being rejected and some context
> around it. CSV is used as the output format since it can be easily parsed.
> It is delimited by tabs instead of commas to allow using all (quoted) URL
> characters, and it includes column names which may be used for compatibility.
> ---
>  doc/wget.texi               |   5 ++
>  src/init.c                  |   2 +
>  src/main.c                  |   3 +
>  src/options.h               |   2 +
>  src/recur.c                 | 189 ++++++++++++++++++++++++++++++++++++--------
>  tests/Makefile.am           |   1 +
>  tests/Test--rejected-log.px | 138 ++++++++++++++++++++++++++++++++
>  7 files changed, 308 insertions(+), 32 deletions(-)
>  create mode 100755 tests/Test--rejected-log.px
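
For readers who want to consume such a log, here is a minimal sketch of reading a tab-delimited file with a header row using Python's standard csv module. The column names (REASON, URL, PARENT_URL), the reason strings, and the sample rows are assumptions made up for illustration; the header actually written by the patch may differ, so check the real log first.

```python
import csv
import io

# Hypothetical column names and reason strings, for illustration only;
# the header actually written by --rejected-log may differ.
SAMPLE_LOG = (
    "REASON\tURL\tPARENT_URL\n"
    "ROBOTS\thttp://example.com/private/\thttp://example.com/index.html\n"
    "DOMAIN\thttp://other.example/a,b.html\thttp://example.com/index.html\n"
)


def read_rejections(stream):
    """Return one dict per rejected URL, keyed by the header's column names."""
    return list(csv.DictReader(stream, delimiter="\t"))


def count_by_reason(rows, reason_column="REASON"):
    """Count how many URLs were rejected for each reason."""
    counts = {}
    for row in rows:
        reason = row[reason_column]
        counts[reason] = counts.get(reason, 0) + 1
    return counts


rows = read_rejections(io.StringIO(SAMPLE_LOG))
print(count_by_reason(rows))
```

Because the delimiter is a tab, a comma inside a URL (as in the second sample row) stays part of the URL field. Looking fields up by column name rather than by position keeps a reader like this working if columns are later added or reordered.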

Thanks for the patch. I have nothing against it; if nobody complains
before then, I am going to push it later today.

I forgot to ask you about writing a Python test under testenv/ instead of
tests/Test--rejected-log.px, as our plan is to get rid of the old Perl
test suite under tests/.  If you have some extra time, would you please
do that as a separate patch?

Regards,
Giuseppe


