bug-wget
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Bug-wget] Detected bugs by bugprobe


From: Tim Ruehsen
Subject: Re: [Bug-wget] Detected bugs by bugprobe
Date: Mon, 23 Sep 2013 09:30:47 +0200
User-agent: KMail/4.10.5 (Linux/3.10-3-amd64; KDE/4.10.5; x86_64; ; )

On Friday 20 September 2013 21:32:23 Darshit Shah wrote:
> Hi Yan,
> 
> Thanks for the analysis.
> 
> The followings are some bugs found by our static analysis tool bugprobe in
> 
> > wget-1.14.
> > In file recur.c
> > At line 281, function url_parse may return NULL assigned to url_parsed.
> > Function retrieve_url use url_parsed without null check.
> > 
> > 
> > All program points that call function url_parse also exist this kind of
> > bugs.
> > 
> > I think this is already handled, since I tried making the code fail
> 
> through a few test cases, but couldn't. Even if url_parse returns a NULL
> value, wget exits gracefully with a meaningful error message.
> 
> I think I remember seeing some code that handled this issue, but I'm not
> sure exactly where. Probably, the issue isn't handled correctly, and there
> may be an edge case. Will have to look into the code thoroughly to check it
> out.

Even if Wget doesn't crash right now, there is a pitfall for contributors.
And in that, Yan Zhang's analyzer is right.

Just have a look at recur.c lines 278-281 (current git master):
          struct url *url_parsed = url_parse (url, &url_err, i, true);

          status = retrieve_url (url_parsed, url, &file, &redirected, referer,
                                 &dt, false, i, true);

url_parse() returns NULL on every problem it finds. This is by definition, not 
by accident.
BUT retrieve_url() does not check url_parsed for NULL and is likely to crash 
if that ever happens...

How should a static analyzer know, that Wget performs some checks somewhere 
else to make sure this situation does not happen ? How should a contributor 
oversee this situation ?

It is a matter of code quality to check the return value of url_parse() before 
it gets worked on / passed to a another function, and be it just to make the 
situation clearer. And btw, it helps avoiding analyzers to produce false 
positives.

Another point is that the above code obviously relies on url_parse() having 
been called earlier (uh, was that grammar correct ?)... so why not putting 
that result into a hashtable to avoid calling url_parse() two times ?

Tim




reply via email to

[Prev in Thread] Current Thread [Next in Thread]