
[Bug-wget] [bug #20398] Save a list of the links that were not followed


From: Giuseppe Scrivano
Subject: [Bug-wget] [bug #20398] Save a list of the links that were not followed
Date: Mon, 10 Aug 2015 11:46:11 +0000
User-agent: Mozilla/5.0 (X11; Fedora; Linux x86_64; rv:39.0) Gecko/20100101 Firefox/39.0

Update of bug #20398 (project wget):

             Assigned to:               tdlewis77 => gscrivano              
             Open/Closed:                    Open => Closed                 
         Planned Release:                  1.12.x => None                   

    _______________________________________________________

Follow-up Comment #7:

Fixed upstream with:

commit e4db00d74d7c8ade43e57f39344d8505d607308a
Author: Jookia <address@hidden>
Date:   Fri Jul 31 23:41:36 2015 +1000

    Add option to write URL rejections to a tab-delimited CSV log.
    
     * main.c: Add "--rejected-log" option.
     * init.c: Add "rejectedlog" command.
     * options.h: Add "rejected_log" parameter string.
     * wget.texi: Add brief documentation on new --rejected-log option.
     * recur.c: Optionally log details of URLs not traversed.
       Add reject_reason enum.
       (download_child_p -> download_child): Return a reject_reason.
       (descend_redirect_p -> descend_redirect): Return a reject_reason.
       (retrieve_tree): Support logging reasons for rejection.
       Add write_reject_log_header that writes a CSV format header to a file.
       Add write_reject_log_url that writes a url struct to a file in CSV
       format.
       Add write_reject_log_reason that writes the URL and parent URL as well
       as the rejection reason to a CSV file.
     * Test--rejected-log.px: Add a basic test for the --rejected-log
       command.
     * tests/Makefile.am: Run Test--rejected-log.px.
    
    This allows you to figure out why URLs are being rejected and gives some
    context around each rejection. CSV is used as the output format since it
    can be easily parsed. It is delimited by tabs instead of commas to allow
    using all (quoted) URL characters, and it includes column names, which
    may be used for compatibility.
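
As a rough illustration of why a tab-delimited log with a header row is easy to consume, here is a minimal Python sketch that reads such a file by column name and counts rejections per reason. The sample rows and the column names (REASON, U_URL, P_URL) are assumptions made up for this example and are not taken from the commit; check the header line of a real log for the actual names.

```python
import csv
import io
from collections import Counter

# Hypothetical sample of a tab-delimited log with a header row.
# Column names and values are illustrative assumptions only.
SAMPLE_LOG = (
    "REASON\tU_URL\tP_URL\n"
    "DOMAIN\thttp://other.example/a,b\thttp://www.example/index.html\n"
    "ROBOTS\thttp://www.example/private/\thttp://www.example/index.html\n"
    "DOMAIN\thttp://other.example/c\thttp://www.example/page.html\n"
)


def read_rejections(stream):
    """Parse a tab-delimited log into a list of dicts keyed by column name."""
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)


def count_by_reason(rows, reason_column="REASON"):
    """Count rows per rejection reason; the column name is an assumption."""
    return Counter(row[reason_column] for row in rows)


rows = read_rejections(io.StringIO(SAMPLE_LOG))
counts = count_by_reason(rows)
for reason, n in counts.most_common():
    print(f"{reason}\t{n}")
```

Because the fields are split on tabs, the comma in the first sample URL stays inside a single field. Looking fields up by header name rather than by position means a script like this should keep working if columns are added or reordered later. To read a real log, pass an open file object instead of the io.StringIO sample.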


    _______________________________________________________

Reply to this item at:

  <http://savannah.gnu.org/bugs/?20398>

_______________________________________________
  Message sent via/by Savannah
  http://savannah.gnu.org/



