bug-gnulib
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: gnulib's Knuth-Morris-Pratt implementation


From: Bruno Haible
Subject: Re: gnulib's Knuth-Morris-Pratt implementation
Date: Fri, 21 Dec 2007 14:59:10 +0100
User-agent: KMail/1.9.1

Eric Blake wrote:
> I wonder if Boyer-Moore would be any more effective than KMP in the
> average case

For long strings, I think so, yes, according to what I read in wikipedia.

The initialization of an array of 256 bytes can be relatively costly if
the strings are short. But since it's only triggered in the case that
the simple O(n^2) algorithm is running too long, this should not matter.

But this argument also makes it unimportant to use Boyer-Moore over KMP:
Once you've spent 5*n cycles on the simple algorithm, you don't care whether
the computation will be finished in additional 2*n cycles or 2*n/m cycles.

> The main problem
> though is that the B-M algorithm (I guess we'd likely select
> Turbo-Boyer-Moore to reduce the comparisons from 3N to 2N) needs to
> move backward through the haystack.  That doesn't work well on
> multibyte strings.

Yes, that's the reason why I chose KMP in the first place.

Bruno




reply via email to

[Prev in Thread] Current Thread [Next in Thread]