Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
From: Bruno Haible
Subject: Re: [PATCH] unistr/u8-strchr: speed up searching for ASCII characters
Date: Sun, 11 Jul 2010 15:38:05 +0200
User-agent: KMail/1.9.9

Hi Pádraig,
> +2010-07-07 Pádraig Brady <address@hidden>
> +
> + * lib/unistr/u8-strchr.c (u8_strchr): Use strchr() as it's faster
Thanks for the patch. I've applied it as below, with minor changes:
  - Keep around the unoptimized code, for clarity.
  - Add the rationale for the change to the comments, not to the ChangeLog
    entry or git commit message.
  - In the other comment, mention strstr, not memmem, since - as you noticed
    yourself - memmem does not have the appropriate asymptotic behaviour for
    long haystack strings.
> gl_memmem long 2 1 3.88
> pb_memmem long 2 1 4.67
> u8_strchr long 2 1 3.47
> gl_memmem 60 2 10000 1.97
> pb_memmem 60 2 10000 1.97
> u8_strchr 60 2 10000 1.96
>
> gl_memmem long 3 1 5.86
> pb_memmem long 3 1 4.02
> u8_strchr long 3 1 4.28
> gl_memmem 60 3 10000 1.97
> pb_memmem 60 3 10000 1.97
> u8_strchr 60 3 10000 1.98
I'm not surprised to see that a search loop that inlines and hardcodes
the iteration of the needle (of known length: 2, 3, or 4) is faster than
the more general strstr or memmem.
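
To make the comparison concrete, here is a minimal sketch of the two approaches for a needle of length 2. It is an illustration only, not the gnulib code: the names find_2bytes and find_2bytes_strstr are hypothetical, and both assume that neither needle byte is 0 (which holds for any 2-byte UTF-8 sequence).

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not taken from gnulib: return a pointer to the first
   occurrence of the two bytes (c0, c1) in the NUL-terminated string S, or
   NULL if there is none.  Assumes c0 != 0 and c1 != 0.  The needle length
   is hardcoded, so there is no inner loop over the needle.  */
static const uint8_t *
find_2bytes (const uint8_t *s, uint8_t c0, uint8_t c1)
{
  if (*s == 0)
    return NULL;
  for (;; s++)
    {
      /* s[0] is known to be nonzero here, so reading s[1] is safe.  */
      if (s[1] == 0)
        return NULL;
      if (s[0] == c0 && s[1] == c1)
        return s;
    }
}

/* The general alternative: build a needle string and let strstr search.
   strstr has to cope with needles of any length.  */
static const uint8_t *
find_2bytes_strstr (const uint8_t *s, uint8_t c0, uint8_t c1)
{
  char needle[3];

  needle[0] = (char) c0;
  needle[1] = (char) c1;
  needle[2] = '\0';
  return (const uint8_t *) strstr ((const char *) s, needle);
}
```

Both functions return the same position for the same input; whether the hardcoded loop actually wins depends on the libc and the data, as the timings above suggest.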

2010-07-11  Pádraig Brady  <address@hidden>
            Bruno Haible  <address@hidden>

        unistr/u8-strchr: Optimize ASCII argument case.
        * lib/unistr/u8-strchr.c (u8_strchr): For ASCII arguments, use strchr.

--- lib/unistr/u8-strchr.c.orig Sun Jul 11 15:25:30 2010
+++ lib/unistr/u8-strchr.c Sun Jul 11 15:21:21 2010
@@ -21,6 +21,8 @@
 /* Specification. */
 #include "unistr.h"
 
+#include <string.h>
+
 uint8_t *
 u8_strchr (const uint8_t *s, ucs4_t uc)
 {
@@ -30,18 +32,31 @@
     {
       uint8_t c0 = uc;
 
-      for (;; s++)
+      if (false)
+        {
+          /* Unoptimized code. */
+          for (;; s++)
+            {
+              if (*s == c0)
+                break;
+              if (*s == 0)
+                goto notfound;
+            }
+          return (uint8_t *) s;
+        }
+      else
         {
-          if (*s == c0)
-            break;
-          if (*s == 0)
-            goto notfound;
+          /* Optimized code.
+             strchr() is often so well optimized, that it's worth the
+             added function call. */
+          return (uint8_t *) strchr ((const char *) s, c0);
         }
-      return (uint8_t *) s;
     }
   else
     switch (u8_uctomb_aux (c, uc, 6))
       {
+        /* Loops equivalent to strstr, optimized for a specific length (2, 3, 4)
+           of the needle. */
       case 2:
         if (*s == 0)
           goto notfound;