Re: [Bug-wget] [Bulk] [PATCH] Add option to write URL rejections to a CSV log.
Tue, 28 Jul 2015 19:41:54 +1000
Mutt/1.5.23+102 (2ca89bed6448) (2014-03-12)
On Tue, Jul 28, 2015 at 10:24:29AM +0200, Gisle Vanem wrote:
> Jookia wrote:
> I've not tried your patch. But by reading it,
> >+static void write_url_csv (FILE* f, struct url *url)
> >+ if (!f)
> >+ return;
> Isn't this test superfluous? Already done by caller (?).
Yes, I'm unsure where the check belongs. Leaving it out of write_url_csv seems
unsafe, but leaving it out of the caller gives the reader the impression that
the row is written whether or not a log file is open.
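
One way to read the trade-off is to keep the check in both places: the callee
guards against a NULL stream, and the caller's check documents that logging is
optional. This is only a rough sketch, not the code from the patch. The struct
url_stub type, the *_sketch function names and the int return values are
assumptions made for illustration (the patch's write_url_csv returns void and
takes Wget's struct url).

```c
#include <stdio.h>

/* Hypothetical stand-in for Wget's struct url; only the field used here. */
struct url_stub
{
  const char *url;
};

/* Callee: tolerate a NULL stream so a forgotten check upstream is harmless.
   Returns 1 if a row was written, 0 otherwise (assumption: the patch's
   function returns void). */
static int
write_url_csv_sketch (FILE *f, const struct url_stub *u)
{
  if (!f || !u || !u->url)
    return 0;
  return fprintf (f, "%s\n", u->url) > 0;
}

/* Caller: the explicit check tells the reader that logging is optional. */
static int
log_rejection_sketch (FILE *rejectedlog, const struct url_stub *u)
{
  if (rejectedlog)
    return write_url_csv_sketch (rejectedlog, u);
  return 0;
}
```

The duplicate test costs one pointer comparison per rejected URL, which seems
negligible next to the fprintf call itself.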
> I'd suggest the reject-log starts with a comment:
> FILE *rejectedlog = 0;
> if (opt.rejected_log)
> rejectedlog = fopen (opt.rejected_log, "w");
> if (!rejectedlog)
> logprintf (LOG_NOTQUIET, "%s: %s\n", opt.rejected_log, strerror
> fprintf (rejectedlog,"# Wget reject-log %s generated at --%s-- for
> datetime_str (time (NULL),
> <base-url, where is this stored?> ));
> But according to http://tools.ietf.org/html/rfc4180,
> it doesn't specify if a "# comment" is legal before the
> CSV header. I think most DB-apps do allow it though.
Overall I'm not sure what information would be gained from that.
I don't know whether it would help with compatibility, since people could still
be running old versions of Wget in the future.
I'm also unsure about adding the base URL, since it's possible to recursively
download from subdirectories of a large site without going up the hierarchy.
Perhaps I'm missing the intention for this comment? :)
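
For reference, this is roughly how I picture the suggestion if the comment line
were made optional, so that strict RFC 4180 consumers can still get a file that
starts with the header row. It is only a sketch under assumptions: the function
name, the with_comment switch, the timestamp format and the "REASON,URL" column
names are all hypothetical and not taken from the patch.

```c
#include <stdio.h>
#include <time.h>

/* Write the start of a reject log: an optional "# ..." comment line followed
   by the CSV header row.  Returns 0 on success, -1 on failure.
   All names and the column list here are hypothetical. */
static int
write_reject_log_preamble (FILE *f, int with_comment, time_t now)
{
  char stamp[32];
  struct tm *tm;

  if (!f)
    return -1;

  if (with_comment)
    {
      /* Format the generation time in UTC. */
      tm = gmtime (&now);
      if (!tm
          || strftime (stamp, sizeof stamp, "%Y-%m-%d %H:%M:%S", tm) == 0)
        return -1;
      if (fprintf (f, "# Wget reject-log generated at %s UTC\n", stamp) < 0)
        return -1;
    }

  /* The header row comes first whenever the comment is disabled. */
  if (fprintf (f, "REASON,URL\n") < 0)
    return -1;
  return 0;
}
```

With with_comment set to 0 the output begins directly with the header row, so
the comment question does not have to affect the default file format.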
> Nice work.
Thanks, and thanks for your input!