[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Bug-wget] Big refactoring
Re: [Bug-wget] Big refactoring
Tue, 07 Aug 2012 11:08:40 -0700
Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120615 Thunderbird/13.0.1
(I've abridged and reordered the original post, for brevity's sake)
On 08/07/2012 08:12 AM, illusionoflife wrote:
> I really want to contribute to Wget, but I am afraid that such radical
> will not be accepted with
I think the maintainer is aware that Wget's code quality is poor, and
would welcome sweeping architectural changes; I know I would have, when
I was maintainer.
However, these sorts of changes probably have to be done with strong
oversight by the maintainer, to be certain that it's always in line with
what's wanted. It'd be very easy to go in a different direction than is
wanted. I'd say, check with Giuseppe as frequently as possible, and
submit patches on as granular a level as possible.
I'd also recommend starting from a point where as much functionality as
possible is covered by unit tests. This was a major reason for my
creating the ones we have, in the first place. That and the fact that no
one ever bothered to test wget until a release was made (despite
numerous test releases), so automated testing was vital. But I never got
around to taking it the next step, which was leveraging those tests to
restructure Wget's code. Actually, what I decided instead was to rewrite
it as a new beast entirely, which is part of why I'm no longer
maintaining Wget. :)
Then, at every stage, make sure all the tests are still working, so you
can be confident that by reorganizing it, you haven't broken anything
(extremely easy to do, given how tangled up much of the logic is).
> I fed wget source to GNU complexity and It suggested me to look at
> parse_netrc. So I came with some ideas -- they are not so fast to implement,
> so I want to discuss them, to not waste work.
To my mind, there are many more monstrous functions than parse_netrc.
And in parse_netrc's case, I'm guessing most of the code is fairly
single-minded. In cases such as gethttp or http_loop, those functions
"know" way too much about various subsystems, and take responsibility
for far too much, in addition to being simply behemoth in size.
> Also, I propose folloing idea -- move all utility functions to another
> library, internal one. Why? Because, for example
> while (*p && isspace(*p))
> is less readable, then `p = string_skip_while(p,isspace)`.
Not to my eye. I can instantly understand what the while-loop does,
whereas the function is a mystery to me until I actually inspect it. I
do not believe that one- or two-line functions add value (and they
usually detract), except in cases where the implementation might change,
and needs to be gathered to one spot. I do not think that scanning for a
category of character qualifies.
> Also, I have idea of using list.h and analog for single linked version. Why?
> Just again, because of code duplicate.
> Many algorithms, implemented in parse_netrc, should be moved to separated
> function, like reverse_list. But this function is way more generic, then
> (*)(acc_t *)`. Also, we can more actively use gl_list, but I find kernel
> solution way more elegant.
We're already inextricably tied to gnulib at this point, we might as
well take advantage of any facilities already offered to us there.
And I suspect gnulib will have a number of facilities that may help in
the refactoring effort anyway.
Re: [Bug-wget] Big refactoring, Giuseppe Scrivano, 2012/08/21