bug-wget

Re: [Bug-wget] Segfault on Converting Links


From: Ángel González
Subject: Re: [Bug-wget] Segfault on Converting Links
Date: Thu, 31 May 2012 23:02:44 +0200
User-agent: Thunderbird

On 31/05/12 20:54, Preston Maness wrote:
> Greetings,
> 
> I first ran into this with my locally installed version of wget
> (1.13.4) while attempting to archive a wordpress website. I then
> compiled the latest development version

Thank you very much for your detailed report.

> $ gdb ./wget
> ...
> (gdb) set args -d -o debug.log --html-extension --page-requisites -k
> -e robots=off --exclude-directories=wiki,forums --reject
> "*action=print" -w 1 --random-wait --warc-file=cpr-wp-debug
> http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/
> (gdb) run

An even easier test-case:
 wget --convert-links "http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/"

> However, I have no idea where to go from here. I've filed a bug as
> well with the log file and some gdb commands that I believe show a
> null pointer dereference. The pointer "u" in convert.c is set to a
> value of "0x0" at the time the program crashes:
> 
> convert.c:
> 
> (126)          u = url_parse (cur_url->url->url, NULL, pi, true);
> (127)          local_name = hash_table_get (dl_url_file_map, u->url);
> 
> The bug is located here: http://savannah.gnu.org/bugs/index.php?36570

The page contains http://[http://mlmlead.iphorum.com/]/, which is an invalid
URL. url_parse can return NULL when it fails to parse a URL, but convert.c
assumes it always succeeds, so it dereferences the NULL pointer and segfaults.

See fix below.


From 9f3182017c16769b56a17bf70878fd566c1c6f79 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81ngel=20Gonz=C3=A1lez?= <address@hidden>
Date: Thu, 31 May 2012 22:57:41 +0200
Subject: [PATCH] fix segfault on wrong urls (bug 36570)

---
 ChangeLog     |    4 ++++
 src/convert.c |    5 +++++
 2 files changed, 9 insertions(+)

diff --git a/ChangeLog b/ChangeLog
index aa249b0..2f0f965 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2012-05-31  Ángel González <address@hidden>
+
+       * convert.c: fix segfault on wrong urls (bug 36570)
+
 2012-05-13  Giuseppe Scrivano  <address@hidden>
 
        * bootstrap.conf (gnulib_modules): Add `git-version-gen'.
diff --git a/src/convert.c b/src/convert.c
index e1c58e9..3e10710 100644
--- a/src/convert.c
+++ b/src/convert.c
@@ -124,6 +124,11 @@ convert_links_in_hashtable (struct hash_table *downloaded_set,
           set_uri_encoding (pi, opt.locale, true);
 
           u = url_parse (cur_url->url->url, NULL, pi, true);
+          if (!u)
+            {
+              continue;
+            }
+
           local_name = hash_table_get (dl_url_file_map, u->url);
 
           /* Decide on the conversion type.  */
-- 
1.7.10.2


