From: Tomas Hozza
Subject: [Bug-wget] Race condition on downloaded files among multiple wget instances
Date: Tue, 3 Sep 2013 03:17:56 -0400 (EDT)

Hello.

In Fedora I have a bug report [1] from a user who is using wget
to test web server network load. He runs multiple
instances of wget to download a site recursively,
something like this:

for i in `seq 20`; do
    wget -r http://www.makerwise.com/ &
done

Some of those wget instances are killed with SIGBUS.
The problem seems to be that after wget downloads a
page, it tries to parse it, but in the meantime the
file is being rewritten by another wget instance. When
parsing the web page, wget uses mmap() to map the file into
memory. Unfortunately, the result of accessing the mapping is
unspecified if the file size changes after the mmap() call,
and touching pages past the new end of a truncated file
raises SIGBUS. There is also a stat() call to get the file
size before calling mmap(), but the file may already have
changed (to a different size) by the time mmap() is called.
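
To make the window concrete, here is a minimal sketch of the
size-then-map pattern. It is not wget's actual code, and
map_whole_file() is a hypothetical helper. It takes the size from
fstat() on the already opened descriptor, so the size and the mapping
at least refer to the same open file, but it cannot protect against
another process truncating the file afterwards.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper (not from wget): map a whole file read-only.
   Returns NULL on error or for an empty file. */
void *map_whole_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    /* Ask the descriptor for the size instead of calling stat() on
       the path, so both calls see the same open file. */
    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size <= 0) {
        close(fd);
        return NULL;
    }

    void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd); /* the mapping stays valid after close() */
    if (p == MAP_FAILED)
        return NULL;

    /* Race remains: if another process truncates the file now,
       reading pages past the new end of file raises SIGBUS. */
    *len_out = (size_t)st.st_size;
    return p;
}
```

The caller releases the mapping with munmap(p, len). The remaining
race in the last comment is the one that the 20 parallel instances
appear to hit.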

I tried the fallback code (used when mmap() is not available), which
uses read() to load the file, and it worked without problems
even with multiple wget instances running. One concern
might be that read() is slower than mmap(), but in my
testing there was no noticeable difference.
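
For comparison, a read()-based loader could look roughly like the
sketch below. Again, this is an illustration and not the fallback code
in wget; read_whole_file() is a hypothetical name. It does not depend
on a previously obtained file size, so a concurrent rewrite may give
inconsistent content, but every access stays inside the process's own
buffer and cannot raise SIGBUS.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper (not from wget): read a whole file into a
   malloc()ed buffer. Returns NULL on error; caller frees the buffer. */
char *read_whole_file(const char *path, size_t *len_out)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        close(fd);
        return NULL;
    }

    for (;;) {
        if (len == cap) {
            /* Buffer full: double it before reading more. */
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL) {
                free(buf);
                close(fd);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        ssize_t n = read(fd, buf + len, cap - len);
        if (n < 0) {
            if (errno == EINTR)
                continue; /* interrupted by a signal, retry */
            free(buf);
            close(fd);
            return NULL;
        }
        if (n == 0)
            break; /* end of file as seen at this moment */
        len += (size_t)n;
    }

    close(fd);
    *len_out = len;
    return buf;
}
```

The loop stops at whatever end of file the kernel reports at that
moment, so a file that shrinks or grows during the read yields a
shorter or longer buffer rather than a fault.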

Even though I think the described use case is a misuse of
wget, I'm wondering whether wget should use read() rather than
mmap() to prevent crashes in such situations.

Thanks!

Regards,

Tomas Hozza


[1] https://bugzilla.redhat.com/show_bug.cgi?id=999647
