[Top][All Lists]
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: bsearch utility
From: |
Paul Eggert |
Subject: |
Re: bsearch utility |
Date: |
Fri, 02 Sep 2005 11:35:28 -0700 |
User-agent: |
Gnus/5.1007 (Gnus v5.10.7) Emacs/21.4 (gnu/linux) |
Sorav Bansal <address@hidden> writes:
> I have now ported look.c to the latest CVS repository too. In doing
> so, I noticed that the new keycompare() function no longer calls
> 'trim_trailing_blanks()'. Its not clear to me, why these calls are not
> necessary anymore. Is that because if the keys match till the last
> non-blank character, then you allow the number of trailing blanks to
> be considered as a tie-breaker?
Yes. POSIX requires this and other 'sort' implementations do it that
way, so we thought it better to be compatible. Please see the
2004-04-25 ChangeLog entry.
> These calls are needed for 'look' though. For now, I am adding the
> calls and commenting with XXX in compare.h.
I guess they'll need to be conditional on whether it's 'look' rather
than 'sort'.
>>I don't see the need for the -B option, or the -l option. Can't
>>we omit them?
>>
> I thought '-l' could be useful for looking into unsorted files. '-B'
> can be used for setting the buffer size which affects performance,
> especially when doing linear search.
But doesn't 'grep' do the same job that look -l would? Perhaps I'm
missing something here.
> However, I agree that they are not particularly necessary.
OK, then (unless I'm missing something) let's please omit them for
now. We can always put them in later if there's a need.
Speaking of which, the most important thing needed for this change is
documentation. We need a section in doc/coreutils.texi. I'd put it
just after the 'tsort invocation' section. Also, a couple of lines
in NEWS.
>>Isn't fseeko kind of a loser, performance-wise? Wouldn't it
>>be faster if you used lseek, and took block boundaries into
>>account? I suppose this is more of a tuning thing tho.
>>
> How do you propose using block boundaries into account? I am not sure
> what you mean, so I am leaving the fseeko() calls in place right now.
For POSIX-conformance reasons, fseeko is required to do an underlying
lseek to the exact same position; in effect, buffering is turned off.
Thus you'll often get better performance by seeking to buffer
boundaries, even if you use fseeko. Often it's easier just to use
lseek.
This is just a performance issue, not a correctness issue, so we
shouldn't let it slow us down too much. Better to get it right
(first) and then speed it up. (But I thought I'd mention it anyway.
:-)
> This time, I have tried my best to match the style in
> coreutils. Please point me to specific areas where I am not conforming
> to the coding style and I would be glad to correct them.
These are all fairly minor, but they'll have to be done anyway so
we appreciate your help.
Comments should look like this:
/* Use English sentences. Break long lines
like this. Put two spaces after each period. */
Prefer "const *" to "*".
Don't use "register".
Don't put a space before ";".
Don't parenthesize the expression X in "return X;".
Use x2realloc or x2nrealloc to double a buffer's size, instead of
checking for size overflow by hand.
Prefer 'bool' to 'int' where either will do.
Thanks again for helping with this!
[Prev in Thread] |
Current Thread |
[Next in Thread] |
- Re: bsearch utility,
Paul Eggert <=