
Re: [Bug-wget] segfault encountered after HUGE recursive scrape


From: Tim Ruehsen
Subject: Re: [Bug-wget] segfault encountered after HUGE recursive scrape
Date: Mon, 09 Mar 2015 15:08:32 +0100
User-agent: KMail/4.14.2 (Linux/3.16.0-4-amd64; KDE/4.14.2; x86_64; ; )

Hi Gabriel,

> wget: convert.c:928: register_redirection: Assertion `file != ((void *)0)'

In the current sources that assertion sits at a different line number.
Yet convert.c hasn't been touched since
07a350d30c062a813a9ac2a6b3cd8b2ae07f0b26, which suggests the binary that
crashed was not built from that commit. Please check your wget version with
wget --version. (Did you invoke wget with the path to your self-compiled
executable?)
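
A minimal sketch of that check, comparing the wget found in $PATH with a
locally built one. The helper name report_wget and the path ./src/wget are
assumptions for illustration; adjust the path to wherever your build put the
executable.

```shell
#!/bin/sh
# Print which executable a name or path resolves to and the first line
# of its --version output, which carries the version string.
report_wget() {
    # $1: command name or path to a wget executable
    bin=$(command -v "$1" 2>/dev/null) || {
        echo "$1: not found"
        return 1
    }
    echo "binary:  $bin"
    "$bin" --version | head -n 1
}

# Whatever wget is first in $PATH (possibly the distribution package).
report_wget wget || true

# Hypothetical location of the self-compiled build; adjust as needed.
report_wget ./src/wget || true
```

If the two version lines differ, the run may have used the packaged wget
rather than the build from git.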

I remember we fixed a redirection assertion bug a while before 
07a350d30c062a813a9ac2a6b3cd8b2ae07f0b26.

Tim

On Monday 09 March 2015 09:08:29 Gabriel L. Somlo wrote:
> Hi,
> 
> I was trying to recursively pull down a list of about 160 web sites at
> recursion depth 2, for a web-in-a-box project in an isolated training
> environment.
> 
> The command line was:
> 
> wget -rpEHNk -e robots=off --random-wait -t 2 -U mozilla -l 2 <site-list>
> 
> I was using git commit 07a350d30c062a813a9ac2a6b3cd8b2ae07f0b26 (a few more
> commits were made since, but this thing ran for about three weeks before
> segfaulting with an assert).
> 
> The last few lines to stdout/stderr were:
> 
> ...
> --2015-03-05 19:51:42--  http://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot
> Connecting to www.mozilla.org|63.245.217.105|:80... connected.
> HTTP request sent, awaiting response... 301 Moved Permanently
> Location: https://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot [following]
> --2015-03-05 19:51:42--  https://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot
> Connecting to www.mozilla.org|63.245.217.105|:443... connected.
> HTTP request sent, awaiting response... 200 OK
> Length: 123774 (121K) [application/vnd.ms-fontobject]
> Server file no newer than local file ‘./var_www_topgen/www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot’ -- not retrieving.
> 
> wget: convert.c:928: register_redirection: Assertion `file != ((void *)0)' failed.
> Aborted (core dumped)
> 
> 
> The back trace looks like this:
> 
> (gdb) bt
> #0  0x00007fe7506cb8c7 in raise () from /lib64/libc.so.6
> #1  0x00007fe7506cd52a in abort () from /lib64/libc.so.6
> #2  0x00007fe7506c446d in __assert_fail_base () from /lib64/libc.so.6
> #3  0x00007fe7506c4522 in __assert_fail () from /lib64/libc.so.6
> #4  0x00000000004078f5 in register_redirection (
>     from=0xa0968ea80 "http://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot",
>     to=0xa0b00f6f0 "https://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot")
>     at convert.c:928
> #5  0x00000000004311ab in retrieve_url (orig_parsed=0x99bc1c8e0,
>     origurl=0xa0968ea80 "http://www.mozilla.org/media/fonts/OpenSans-ExtraBoldItalic-webfont.eot",
>     file=0x7fff5e0f89e8, newloc=0x7fff5e0f89d0,
>     refurl=0x9b14ed020 "http://www.mozilla.org/tabzilla/media/css/tabzilla.css",
>     dt=0x7fff5e0f89dc, recursive=false, iri=0x67e400 <dummy_iri>,
>     register_status=true) at retr.c:949
> #6  0x000000000042da3e in retrieve_tree (start_url_parsed=0x239ae30, pi=0x0)
>     at recur.c:301
> #7  0x0000000000429f71 in main (argc=182, argv=0x7fff5e0f9298) at main.c:1691
> 
> 
> Under normal circumstances, I'd be debugging and learning about the source
> code layout at the same time, and trying to figure out what the problem
> might be on my own.
> 
> However, given that it took over 3 weeks of run time before I hit the
> problem (meanwhile pulling down about 500 GB of material, and resulting
> in a 42 GB core file), I'd like to start by asking someone more familiar
> with the source tree for their best guess as to what this might be.
> 
> The machine I was using has 72 GB of RAM, runs Fedora 21, and this was
> the only job running. I'm wondering if low memory could have had
> something to do with it, although there's nothing in the logs to
> indicate that might have happened.
> 
> Thanks much for any suggestions,
> --Gabriel



