Re: [Bug-wget] Segfault on Converting Links
From: Ángel González
Subject: Re: [Bug-wget] Segfault on Converting Links
Date: Thu, 31 May 2012 23:02:44 +0200
User-agent: Thunderbird
On 31/05/12 20:54, Preston Maness wrote:
> Greetings,
>
> I first ran into this with my locally installed version of wget
> (1.13.4) while attempting to archive a wordpress website. I then
> compiled the latest development version
Thank you very much for your detailed report.
> $ gdb ./wget
> ...
> (gdb) set args -d -o debug.log --html-extension --page-requisites -k
> -e robots=off --exclude-directories=wiki,forums --reject
> "*action=print" -w 1 --random-wait --warc-file=cpr-wp-debug
> http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/
> (gdb) run
An even easier test-case:
wget --convert-links "http://www.cyberpunkreview.com/movie/upcoming-movies/initial-impressions-review-of-solid-state-society/"
> However, I have no idea where to go from here. I've filed a bug as
> well with the log file and some gdb commands that I believe show a
> null pointer dereference. The pointer "u" in convert.c is set to a
> value of "0x0" at the time the program crashes:
>
> convert.c:
>
> (126) u = url_parse (cur_url->url->url, NULL, pi, true);
> (127) local_name = hash_table_get (dl_url_file_map, u->url);
>
> The bug is located here: http://savannah.gnu.org/bugs/index.php?36570
The page contains http://[http://mlmlead.iphorum.com/]/, which is an invalid
URL. url_parse can return NULL when it fails to parse a URL. convert.c
wrongly assumes the call always succeeds, dereferences the NULL result in
u->url, and segfaults. See fix below.
From 9f3182017c16769b56a17bf70878fd566c1c6f79 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?=C3=81ngel=20Gonz=C3=A1lez?= <address@hidden>
Date: Thu, 31 May 2012 22:57:41 +0200
Subject: [PATCH] fix segfault on wrong urls (bug 36570)
---
ChangeLog | 4 ++++
src/convert.c | 5 +++++
2 files changed, 9 insertions(+)
diff --git a/ChangeLog b/ChangeLog
index aa249b0..2f0f965 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,7 @@
+2012-05-31 Ángel González <address@hidden>
+
+ * convert.c: fix segfault on wrong urls (bug 36570)
+
2012-05-13 Giuseppe Scrivano <address@hidden>
* bootstrap.conf (gnulib_modules): Add `git-version-gen'.
diff --git a/src/convert.c b/src/convert.c
index e1c58e9..3e10710 100644
--- a/src/convert.c
+++ b/src/convert.c
@@ -124,6 +124,11 @@ convert_links_in_hashtable (struct hash_table *downloaded_set,
set_uri_encoding (pi, opt.locale, true);
u = url_parse (cur_url->url->url, NULL, pi, true);
+ if (!u)
+ {
+ continue;
+ }
+
local_name = hash_table_get (dl_url_file_map, u->url);
/* Decide on the conversion type. */
--
1.7.10.2
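
For anyone who wants to see the failure mode outside of wget, here is a
minimal standalone sketch of the same "check the parse result, then skip"
pattern. It is only an illustration: struct toy_url, toy_url_parse,
toy_url_free and count_convertible are hypothetical names, not wget
functions, and the validity test (reject anything that does not start with
"http://" or that contains '[') is a crude stand-in for what url_parse
really checks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for wget's struct url; only the field used here. */
struct toy_url
{
  char *url;
};

/* Hypothetical stand-in for url_parse: returns NULL on a parse error.
   Assumption: a string is "invalid" if it lacks the http:// prefix or
   contains '['.  The real parser is far more thorough. */
struct toy_url *
toy_url_parse (const char *s)
{
  struct toy_url *u;
  size_t len;

  if (strncmp (s, "http://", 7) != 0 || strchr (s, '[') != NULL)
    return NULL;

  u = malloc (sizeof *u);
  if (!u)
    return NULL;

  len = strlen (s) + 1;
  u->url = malloc (len);
  if (!u->url)
    {
      free (u);
      return NULL;
    }
  memcpy (u->url, s, len);
  return u;
}

void
toy_url_free (struct toy_url *u)
{
  assert (u != NULL);
  free (u->url);
  free (u);
}

/* Walk a list of links, skipping any link that fails to parse instead of
   dereferencing a NULL pointer.  Returns how many links were processed. */
int
count_convertible (const char *const *links, size_t n)
{
  int converted = 0;
  size_t i;

  for (i = 0; i < n; i++)
    {
      struct toy_url *u = toy_url_parse (links[i]);
      if (!u)
        continue;               /* invalid URL: nothing to look up */

      /* Safe to read u->url from here on, e.g. as a hash table key. */
      if (u->url[0] != '\0')
        converted++;
      toy_url_free (u);
    }
  return converted;
}
```

Calling count_convertible on a list that mixes valid links with the
malformed one from the report should process only the valid entries;
without the NULL check, the u->url access is where the crash would occur.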