[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: 70x speed-up for seq's common cases
From: |
Jim Meyering |
Subject: |
Re: 70x speed-up for seq's common cases |
Date: |
Fri, 14 Sep 2012 09:43:12 +0200 |
Jim Meyering wrote:
> Inspired by Pádraig's April-fools post,
> http://thread.gmane.org/gmane.comp.gnu.coreutils.general/972
> and spurred by the fact that Torbjorn and Niels had to roll
> their own seq-like program to test their new factoring code
> (http://gmplib.org:8000/factoring), I have made coreutils' seq
> usable for arbitrarily large, non-negative whole numbers.
> This has two advantages:
>
> - correctness with large numbers
> - improved performance in common cases
>
> For very large numbers, the new seq now appears to be correct:
>
> $ b=10000000000000000000000000000000000000000000000000000000000000
> $ src/seq ${b}09 ${b}11
> 1000000000000000000000000000000000000000000000000000000000000009
> 1000000000000000000000000000000000000000000000000000000000000010
> 1000000000000000000000000000000000000000000000000000000000000011
>
> while the old one would infloop, printing garbage:
>
> $ seq ${b}09 ${b}11 |head -2
> 999999999999999999965225407341472151966654053379893968647487488
> 999999999999999999965225407341472151966654053379893968647487488
>
> This also means that nt-factor's tests can now use coreutils' seq
> to generate their inputs.
>
> seq's new code is currently enabled only when an increment is not
> specified, or when it is specified as 1. As the FIXME comments in the
> code suggest, this may eventually be relaxed to allow for whole-number
> increments, but for that case, we may well want to use yet another code
> path, using bignums, and with less emphasis on performance.
>
> seq's new incr function is based on the one from cat.c, and the
> rest is based roughly on nt-factor's ourseq.c implementation.
> The initial version relied solely on "puts" for output and failed
> to beat my seq alias, which takes four processes and ~18s to print
> the first 10^9 numbers:
>
> seq()
> {
> # Up to ~25 times faster than seq-8.19 for large N.
> case $# in
> 1) case $1 in
> ''|*[^0-9]*) ;;
> *) yes | head -n"$1" | cat -n | tr -cd '0-9\n'; return;;
> esac ;;
> esac
> env seq "$@"
> }
>
> Switching to fwrite (+putchar for the newline) brought it to parity,
> and buffering enough to cut the number of fwrite calls by a factor
> of 30-40 gave it another ~2.8x, bringing the net speed-up (over
> coreutils-8.19's seq) to about 70x:
>
> $ env time --f=%e src/seq $((10**9)) > /dev/null
> 7.96
>
> Using the seq program from coreutils-8.19 took almost 10 minutes:
>
> $ env time --f=%e seq $((10**9)) > /dev/null
> 597.92
>
> Here's the patch:
>
>>From f7a09b702025b39f566c9e00bdf4226e37ee3912 Mon Sep 17 00:00:00 2001
> From: Jim Meyering <address@hidden>
> Date: Thu, 13 Sep 2012 18:09:49 +0200
> Subject: [PATCH] seq: 70x faster for non-negative whole numbers and incr==1
...
> + /* If the following hold:
> + - no format string, [FIXME: relax this, eventually]
> + - integer start (or no start)
> + - integer end
> + - increment == 1 or not specified [FIXME: relax this, eventually]
> + then use the much more efficient integer-only code. */
> + unsigned int n_args = argc - optind; /* FIXME: hoist this and subst above
> */
Nearly forgot to address one of those FIXME comments.
I've folded the following into that commit:
diff --git a/src/seq.c b/src/seq.c
index 028d718..edf00aa 100644
--- a/src/seq.c
+++ b/src/seq.c
@@ -505,13 +505,14 @@ main (int argc, char **argv)
}
}
- if (argc - optind < 1)
+ unsigned int n_args = argc - optind;
+ if (n_args < 1)
{
error (0, 0, _("missing operand"));
usage (EXIT_FAILURE);
}
- if (3 < argc - optind)
+ if (3 < n_args)
{
error (0, 0, _("extra operand %s"), quote (argv[optind + 3]));
usage (EXIT_FAILURE);
@@ -533,7 +534,6 @@ main (int argc, char **argv)
- integer end
- increment == 1 or not specified [FIXME: relax this, eventually]
then use the much more efficient integer-only code. */
- unsigned int n_args = argc - optind; /* FIXME: hoist this and subst above */
if (format_str == NULL
&& all_digits_p (argv[1])
&& (n_args == 1 || all_digits_p (argv[2]))