bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] sort: Add --threads option, which parallelizes internal sort.


From: Paul Eggert
Subject: [PATCH] sort: Add --threads option, which parallelizes internal sort.
Date: Wed, 25 Mar 2009 13:46:22 -0700

This patch is by Glen Lenker, Matt Pham, Benjamin Nuernberger, Sky
Lin, TaeSung Roh, and Paul Eggert.  It adds support for parallelism
within an internal sort.  On our simple tests on a 2-core desktop x86,
overall performance improved by roughly a factor of 1.6.
* THANKS: Add Benjamin Nuernberger, Sky Lin, TaeSung Roh.  Sort.
* TODO: Add note about how to make this algorithm better.
* bootstrap.conf: Add nproc, pthread.
* doc/coreutils.texi (sort invocation): Document new --threads option.
* src/Makefile.am (sort_LDADD): Add $(LIB_PTHREAD).
* src/sort.c: Include pthread.h, nproc.h.
(SUBTHREAD_LINES_HEURISTIC, THREADS_OPTION): New constants.
(sortlines_temp): Remove decl.
(usage, long_options, main): New option --threads.
(specify_nthreads): New function.
(mergelines): New signature, to emphasize the fact that the HI area must be
part of the destination.  All callers changed.
(sortlines): Rewrite to support and use parallelism, with a new signature.
All uses changed.  Merge in the functionality of sortlines_temp.
(struct thread_args): New struct.
(sortlines_thread): New function.
(sortlines_temp): Remove.
(sort): New arg NTHREADS.  All uses changed.
(main): Disable threading if we are sorting at random.
* tests/Makefile.am (TESTS): Add misc/sort-benchmark-random.
* tests/misc/sort-benchmark-random: New file.
---
 THANKS                           |   23 +++--
 TODO                             |    8 ++
 bootstrap.conf                   |    9 ++-
 doc/coreutils.texi               |    8 ++
 src/Makefile.am                  |    2 +-
 src/sort.c                       |  220 +++++++++++++++++++++++++++-----------
 tests/Makefile.am                |    1 +
 tests/misc/sort-benchmark-random |   67 ++++++++++++
 8 files changed, 263 insertions(+), 75 deletions(-)
 create mode 100644 tests/misc/sort-benchmark-random

diff --git a/THANKS b/THANKS
index 665a9ef..e37fdb3 100644
--- a/THANKS
+++ b/THANKS
@@ -24,12 +24,11 @@ aldomel                             address@hidden
 Alen Muzinic                        address@hidden
 Alexander Nguyen                    address@hidden
 Alexander V. Lukyanov               address@hidden
-Allen Hewes                         address@hidden
-Axel Dörfler                        address@hidden
 Alexandre Duret-Lutz                address@hidden
 Alexey Solovyov                     address@hidden
 Alexey Vyskubov                     address@hidden
 Alfred M. Szmidt                    address@hidden
+Allen Hewes                         address@hidden
 Andi Kleen                          address@hidden
 Andre Novaes Cunha                  address@hidden
 Andreas Frische                     address@hidden
@@ -61,13 +60,15 @@ Arvind Autar                        address@hidden
 Augey Mikus                         address@hidden
 Aurelien Jarno                      address@hidden
 Austin Donnelly                     address@hidden
+Axel Dörfler                        address@hidden
 Axel Kittenberger                   address@hidden
 Barry Kelly                         http://barrkel.blogspot.com/
 Bauke Jan Douma                     address@hidden
 Ben Elliston                        address@hidden
 Ben Harris                          address@hidden
-Benjamin Cutler                     address@hidden
 Bengt Martensson                    address@hidden
+Benjamin Cutler                     address@hidden
+Benjamin Nuernberger                address@hidden
 Bernard Giroud                      address@hidden
 Bernd Eckenfels                     address@hidden
 Bernd Leibing                       address@hidden
@@ -169,12 +170,12 @@ Elbert Pol                          address@hidden
 Eli Zaretskii                       address@hidden
 Elias Pipping                       address@hidden
 Emile LeBlanc                       address@hidden
-Erik Auerswald                      address@hidden
 Eric Backus                         address@hidden
 Eric Blake                          address@hidden
 Eric G. Miller                      address@hidden
 Eric Pemente                        address@hidden
 Eric S. Raymond                     address@hidden
+Erik Auerswald                      address@hidden
 Erik Bennett                        address@hidden
 Erik Corry                          address@hidden
 Evan Hunt                           address@hidden
@@ -207,7 +208,6 @@ Gerhard Poul                        address@hidden
 Germano Leichsenring                address@hidden
 Glen Lenker                         address@hidden
 Göran Uddeborg                      address@hidden
-Guochun Shi                         address@hidden
 GOTO Masanori                       address@hidden
 Greg Louis                          address@hidden
 Greg McGary                         address@hidden
@@ -218,6 +218,7 @@ Greg Wooledge                       address@hidden
 Gregory Leblanc                     address@hidden
 Guido Leenders                      address@hidden
 Guntram Blohm                       address@hidden
+Guochun Shi                         address@hidden
 H. J. Lu                            address@hidden
 Hans Ginzel                         address@hidden
 Hans Lermen                         address@hidden
@@ -232,8 +233,8 @@ Herbert Xu                          address@hidden
 Holger Berger                       address@hidden
 Hon-Yin Kok                         address@hidden
 Hugh Daniel                         address@hidden
-Ian Bruce                           address@hidden
 Iain Calder                         address@hidden
+Ian Bruce                           address@hidden
 Ian Jackson                         address@hidden
 Ian Lance Taylor                    address@hidden
 Ian Turner                          address@hidden
@@ -243,8 +244,8 @@ Ingo Saitz                          address@hidden
 Ivo Timmermans                      address@hidden
 James                               address@hidden
 James Antill                        address@hidden
-James Lemley                        address@hidden
 James Hunt                          address@hidden
+James Lemley                        address@hidden
 James Ralston                       address@hidden
 James Sneeringer                    address@hidden
 James Tanis                         address@hidden
@@ -318,8 +319,8 @@ Keith Thompson                      address@hidden
 Ken Pizzini                         address@hidden
 Kevin Mudrick                       address@hidden
 Kirk Kelsey                         address@hidden
-Kristin E Thomas                    address@hidden
 Kjetil Torgrim Homme                address@hidden
+Kristin E Thomas                    address@hidden
 Kristoffer Rose                     address@hidden
 Larry McVoy                         address@hidden
 Lars Hecking                        address@hidden
@@ -408,8 +409,8 @@ Michiel Bacchiani                   address@hidden
 Mikael Magnusson                    address@hidden
 Mike Castle                         address@hidden
 Mike Coleman                        address@hidden
-Mike Jetzer                         address@hidden
 Mike Frysinger                      address@hidden
+Mike Jetzer                         address@hidden
 Mikko Tuumanen                      address@hidden
 Mikulas Patocka                     address@hidden
 Miles Bader                         address@hidden
@@ -510,6 +511,7 @@ Savochkin Andrey Vladimirovich      address@hidden
 Scott Lurndal                       address@hidden
 Sébastien Maret                     address@hidden
 Shing-Shong Shei                    address@hidden
+Sky Lin                             address@hidden
 Soeren Sonnenburg                   address@hidden
 Solar Designer                      address@hidden
 Stanislav Ievlev                    address@hidden
@@ -530,9 +532,10 @@ Stuart Shelton                      address@hidden
 Sven Joachim                        address@hidden
 Szakacsits Szabolcs                 address@hidden
 Tadayoshi Funaba                    address@hidden
+TaeSung Roh                         address@hidden
 TAKAI Kousuke                       address@hidden
-Theodore Ts'o                       address@hidden
 The Wanderer                        address@hidden
+Theodore Ts'o                       address@hidden
 Theodoros V. Kalamatianos           address@hidden
 Thomas Bushnell                     address@hidden
 Thomas Goerlich                     address@hidden
diff --git a/TODO b/TODO
index 7288285..ee7e850 100644
--- a/TODO
+++ b/TODO
@@ -102,6 +102,14 @@ sort: Investigate better sorting algorithms; see Knuth 
vol. 3.
   5.3.1, who credits Lester Ford, Jr. and Selmer Johnson, American
   Mathematical Monthly 66 (1959), 387-389.
 
+  The current method of parallelization can be improved.  Parent
+  threads wait for children to finish, which can be a bottleneck.  It
+  might be better, at least when computation nears the root of the
+  merge tree, for parent threads to accept outputs from their children
+  one by one, and thus merge while their children are merging.  This
+  will require some inter-thread communication (e.g., via shared
+  objects that are accessed safely via mutual exclusion).
+
 shred: Update shred as described here to conform to DoD 5220 rules:
 http://lists.gnu.org/archive/html/bug-coreutils/2007-05/msg00075.html
 
diff --git a/bootstrap.conf b/bootstrap.conf
index 0747bb8..7358e10 100644
--- a/bootstrap.conf
+++ b/bootstrap.conf
@@ -74,12 +74,19 @@ gnulib_modules="
        memcasecmp memcmp2 mempcpy
        memrchr mgetgroups
        mkancesdirs mkdir mkdir-p mkstemp mktime modechange
-       mountlist mpsort obstack pathmax perl physmem
+       mountlist
+       mpsort
+       nproc
+       obstack
+       pathmax
+       perl
+       physmem
        posix-shell
        posixtm
        posixver
        progname
        propername
+       pthread
        putenv
        quote quotearg raise readlink areadlink-with-size
        randint
diff --git a/doc/coreutils.texi b/doc/coreutils.texi
index 70effa1..a9ad20e 100644
--- a/doc/coreutils.texi
+++ b/doc/coreutils.texi
@@ -4043,6 +4043,14 @@ have a large sort or merge that is I/O-bound, you can 
often improve
 performance by using this option to specify directories on different
 disks and controllers.
 
address@hidden address@hidden
+Use no more than @var{n} threads to improve parallelism and thus
+reduce the wall-clock time needed for the sort.  The default @var{n}
+is the number of processors currently online, rounded down to the
+nearest power of two.  This default may change in the future.  If
address@hidden is a power of two, increasing it to a value less than 
address@hidden
+does not typically help performance.
+
 @item -u
 @itemx --unique
 @opindex -u
diff --git a/src/Makefile.am b/src/Makefile.am
index 2313ed3..6113e5f 100644
--- a/src/Makefile.am
+++ b/src/Makefile.am
@@ -118,7 +118,7 @@ tac_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME)
 vdir_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME) $(LIB_SELINUX) $(LIB_CAP)
 
 ## If necessary, add -lm to resolve use of pow in lib/strtod.c.
-sort_LDADD = $(LDADD) $(POW_LIB) $(LIB_GETHRXTIME)
+sort_LDADD = $(LDADD) $(POW_LIB) $(LIB_GETHRXTIME) $(LIB_PTHREAD)
 
 # for get_date and gettime
 date_LDADD = $(LDADD) $(LIB_CLOCK_GETTIME)
diff --git a/src/sort.c b/src/sort.c
index ced0f2d..b4f0dfb 100644
--- a/src/sort.c
+++ b/src/sort.c
@@ -23,6 +23,7 @@
 #include <config.h>
 
 #include <getopt.h>
+#include <pthread.h>
 #include <sys/types.h>
 #include <sys/wait.h>
 #include <signal.h>
@@ -32,6 +33,7 @@
 #include "filevercmp.h"
 #include "hash.h"
 #include "md5.h"
+#include "nproc.h"
 #include "physmem.h"
 #include "posixver.h"
 #include "quote.h"
@@ -83,6 +85,13 @@ struct rlimit { size_t rlim_cur; };
 # define OPEN_MAX 20
 #endif
 
+/* Heuristic value for the number of lines for which it is worth
+   creating a subthread, during an internal merge sort, on a machine
+   that has processors galore.  Currently this number is just a guess.
+   This value must be at least 4.  We don't know of any machine where
+   this number has any practical effect.  */
+enum { SUBTHREAD_LINES_HEURISTIC = 4 };
+
 #define UCHAR_LIM (UCHAR_MAX + 1)
 
 #ifndef DEFAULT_TMPDIR
@@ -289,8 +298,6 @@ static char const *compress_program;
    number are present, temp files will be used. */
 static unsigned int nmerge = NMERGE_DEFAULT;
 
-static void sortlines_temp (struct line *, size_t, struct line *);
-
 /* Report MESSAGE for FILE, then clean up and exit.
    If FILE is null, it represents standard output.  */
 
@@ -380,6 +387,7 @@ Other options:\n\
   -t, --field-separator=SEP  use SEP instead of non-blank to blank 
transition\n\
   -T, --temporary-directory=DIR  use DIR for temporaries, not $TMPDIR or %s;\n\
                               multiple options specify multiple directories\n\
+      --threads=N           use no more than N threads to improve 
parallelism\n\
   -u, --unique              with -c, check for strict ordering;\n\
                               without -c, output only the first of an equal 
run\n\
 "), DEFAULT_TMPDIR);
@@ -423,7 +431,8 @@ enum
   FILES0_FROM_OPTION,
   NMERGE_OPTION,
   RANDOM_SOURCE_OPTION,
-  SORT_OPTION
+  SORT_OPTION,
+  THREADS_OPTION
 };
 
 static char const short_options[] = "-bcCdfgik:mMno:rRsS:t:T:uVy:z";
@@ -455,6 +464,7 @@ static struct option const long_options[] =
   {"temporary-directory", required_argument, NULL, 'T'},
   {"unique", no_argument, NULL, 'u'},
   {"zero-terminated", no_argument, NULL, 'z'},
+  {"threads", required_argument, NULL, THREADS_OPTION},
   {GETOPT_HELP_OPTION_DECL},
   {GETOPT_VERSION_OPTION_DECL},
   {NULL, 0, NULL, 0},
@@ -1253,6 +1263,21 @@ specify_sort_size (int oi, char c, char const *s)
   xstrtol_fatal (e, oi, c, long_options, s);
 }
 
+/* Specify the number of threads to spawn during internal sort.  */
+static unsigned long int
+specify_nthreads (int oi, char c, char const *s)
+{
+  unsigned long int nthreads;
+  enum strtol_error e = xstrtoul (s, NULL, 10, &nthreads, "");
+  if (e == LONGINT_OVERFLOW)
+    return ULONG_MAX;
+  if (e != LONGINT_OK)
+    xstrtol_fatal (e, oi, c, long_options, s);
+  if (nthreads == 0)
+    error (SORT_FAILURE, 0, _("number of threads must be nonzero"));
+  return nthreads;
+}
+
 /* Return the default sort size.  */
 static size_t
 default_sort_size (void)
@@ -2435,25 +2460,28 @@ mergefiles (struct sortfile *files, size_t ntemps, 
size_t nfiles,
   return nopened;
 }
 
-/* Merge into T the two sorted arrays of lines LO (with NLO members)
-   and HI (with NHI members).  T, LO, and HI point just past their
-   respective arrays, and the arrays are in reverse order.  NLO and
-   NHI must be positive, and HI - NHI must equal T - (NLO + NHI).  */
+/* Merge into T (of size NLINES) the two sorted arrays of lines
+   LO (with NLINES / 2 members), and
+   T - (NLINES / 2) (with NLINES - NLINES / 2 members).
+   T and LO point just past their respective arrays, and the arrays
+   are in reverse order.  NLINES must be at least 2.  */
 
 static inline void
-mergelines (struct line *t,
-           struct line const *lo, size_t nlo,
-           struct line const *hi, size_t nhi)
+mergelines (struct line *restrict t, size_t nlines,
+           struct line const *restrict lo)
 {
+  size_t nlo = nlines / 2;
+  size_t nhi = nlines - nlo;
+  struct line const *hi = t - nlo;
+
   for (;;)
     if (compare (lo - 1, hi - 1) <= 0)
       {
        *--t = *--lo;
        if (! --nlo)
          {
-           /* HI - NHI equalled T - (NLO + NHI) when this function
-              began.  Therefore HI must equal T now, and there is no
-              need to copy from HI to T.  */
+           /* HI must equal T now, and there is no need to copy from
+              HI to T.  */
            return;
          }
       }
@@ -2471,53 +2499,54 @@ mergelines (struct line *t,
       }
 }
 
-/* Sort the array LINES with NLINES members, using TEMP for temporary space.
-   NLINES must be at least 2.
-   The input and output arrays are in reverse order, and LINES and
-   TEMP point just past the end of their respective arrays.
 
-   Use a recursive divide-and-conquer algorithm, in the style
-   suggested by Knuth volume 3 (2nd edition), exercise 5.2.4-23.  Use
-   the optimization suggested by exercise 5.2.4-10; this requires room
-   for only 1.5*N lines, rather than the usual 2*N lines.  Knuth
-   writes that this memory optimization was originally published by
-   D. A. Bell, Comp J. 1 (1958), 75.  */
+static void sortlines (struct line *restrict, size_t, struct line *restrict,
+                      unsigned long int, bool);
 
-static void
-sortlines (struct line *lines, size_t nlines, struct line *temp)
+/* Thread arguments for sortlines_thread.  */
+struct thread_args
 {
-  if (nlines == 2)
-    {
-      if (0 < compare (&lines[-1], &lines[-2]))
-       {
-         struct line tmp = lines[-1];
-         lines[-1] = lines[-2];
-         lines[-2] = tmp;
-       }
-    }
-  else
-    {
-      size_t nlo = nlines / 2;
-      size_t nhi = nlines - nlo;
-      struct line *lo = lines;
-      struct line *hi = lines - nlo;
-      struct line *sorted_lo = temp;
-
-      sortlines (hi, nhi, temp);
-      if (1 < nlo)
-       sortlines_temp (lo, nlo, sorted_lo);
-      else
-       sorted_lo[-1] = lo[-1];
+  struct line *lines;
+  size_t nlines;
+  struct line *temp;
+  unsigned long int nthreads;
+  bool to_temp;
+};
 
-      mergelines (lines, sorted_lo, nlo, hi, nhi);
-    }
+/* Like sortlines, except with a signature acceptable to pthread_create.  */
+static void *
+sortlines_thread (void *data)
+{
+  struct thread_args const *args = data;
+  sortlines (args->lines, args->nlines, args->temp, args->nthreads,
+            args->to_temp);
+  return NULL;
 }
 
-/* Like sortlines (LINES, NLINES, TEMP), except output into TEMP
-   rather than sorting in place.  */
+/* Sort the array LINES with NLINES members, using TEMP for temporary space,
+   spawning NTHREADS threads.  NLINES must be at least 2.
+   The input and output arrays are in reverse order, and LINES and
+   TEMP point just past the end of their respective arrays.
+
+   If TO_TEMP, place the result in TEMP (trashing LINES in the
+   process); otherwise, place the result back into LINES so that it is
+   an in-place sort (trashing TEMP in the process).
+
+   Use a recursive divide-and-conquer algorithm, in the style
+   suggested by Knuth volume 3 (2nd edition), exercise 5.2.4-23.  If
+   multithreaded, this requires that TEMP contain NLINES entries; if
+   singlethreaded, use the optimization suggested by Knuth exercise
+   5.2.4-10, which requires room for only 1.5*N lines, rather than the
+   usual 2*N lines.  Knuth writes that this memory optimization was
+   originally published by D. A. Bell, Comp J. 1 (1958), 75.
+
+   This function is inline so that its tests for multthreadedness and
+   inplacedness can be optimized away in common cases.  */
 
 static void
-sortlines_temp (struct line *lines, size_t nlines, struct line *temp)
+sortlines (struct line *restrict lines, size_t nlines,
+          struct line *restrict temp,
+          unsigned long int nthreads, bool to_temp)
 {
   if (nlines == 2)
     {
@@ -2525,8 +2554,17 @@ sortlines_temp (struct line *lines, size_t nlines, 
struct line *temp)
         <http://lists.gnu.org/archive/html/bug-coreutils/2005-10/msg00086.html>
         in the IBM xlc 6.0.0.0 compiler in 64-bit mode.  */
       int swap = (0 < compare (&lines[-1], &lines[-2]));
-      temp[-1] = lines[-1 - swap];
-      temp[-2] = lines[-2 + swap];
+      if (to_temp)
+       {
+         temp[-1] = lines[-1 - swap];
+         temp[-2] = lines[-2 + swap];
+       }
+      else if (swap)
+       {
+         temp[-1] = lines[-1];
+         lines[-1] = lines[-2];
+         lines[-2] = temp[-1];
+       }
     }
   else
     {
@@ -2534,13 +2572,43 @@ sortlines_temp (struct line *lines, size_t nlines, 
struct line *temp)
       size_t nhi = nlines - nlo;
       struct line *lo = lines;
       struct line *hi = lines - nlo;
-      struct line *sorted_hi = temp - nlo;
 
-      sortlines_temp (hi, nhi, sorted_hi);
-      if (1 < nlo)
-       sortlines (lo, nlo, temp);
+      unsigned long int child_subthreads = nthreads / 2;
+      unsigned long int my_subthreads = nthreads - child_subthreads;
+      pthread_t thread;
+      struct thread_args args = {hi, nhi, temp - nlo, child_subthreads, 
to_temp};
+
+      if (child_subthreads != 0 && SUBTHREAD_LINES_HEURISTIC <= nlines
+         && pthread_create (&thread, NULL, sortlines_thread, &args) == 0)
+       {
+         /* Guarantee that nlo and nhi are each at least 2.  */
+         verify (4 <= SUBTHREAD_LINES_HEURISTIC);
+
+         sortlines (lo, nlo, temp, my_subthreads, !to_temp);
+         pthread_join (thread, NULL);
+       }
+      else
+       {
+         sortlines (hi, nhi, temp - (to_temp ? nlo : 0), 1, to_temp);
+         if (1 < nlo)
+           sortlines (lo, nlo, temp, 1, !to_temp);
+         else if (!to_temp)
+           temp[-1] = lo[-1];
+       }
 
-      mergelines (temp, lo, nlo, sorted_hi, nhi);
+      struct line *dest;
+      struct line const *sorted_lo;
+      if (to_temp)
+       {
+         dest = temp;
+         sorted_lo = lines;
+       }
+      else
+       {
+         dest = lines;
+         sorted_lo = temp;
+       }
+      mergelines (dest, nlines, sorted_lo);
     }
 }
 
@@ -2744,7 +2812,8 @@ merge (struct sortfile *files, size_t ntemps, size_t 
nfiles,
 /* Sort NFILES FILES onto OUTPUT_FILE. */
 
 static void
-sort (char * const *files, size_t nfiles, char const *output_file)
+sort (char * const *files, size_t nfiles, char const *output_file,
+      unsigned long int nthreads)
 {
   struct buffer buf;
   size_t ntemps = 0;
@@ -2758,8 +2827,11 @@ sort (char * const *files, size_t nfiles, char const 
*output_file)
       char const *file = *files;
       FILE *fp = xfopen (file, "r");
       FILE *tfp;
+
+      /* If singlethreaded, the merge uses the memory optimization
+        suggested in Knuth exercise 5.2.4-10; see sortlines.  */
       size_t bytes_per_line = (2 * sizeof (struct line)
-                              - sizeof (struct line) / 2);
+                              - (1 < nthreads ? 0 : sizeof (struct line) / 2));
 
       if (! buf.alloc)
        initbuf (&buf, bytes_per_line,
@@ -2787,7 +2859,7 @@ sort (char * const *files, size_t nfiles, char const 
*output_file)
          line = buffer_linelim (&buf);
          linebase = line - buf.nlines;
          if (1 < buf.nlines)
-           sortlines (line, buf.nlines, linebase);
+           sortlines (line, buf.nlines, linebase, nthreads, false);
          if (buf.eof && !nfiles && !ntemps && !buf.left)
            {
              xfclose (fp, file);
@@ -3039,6 +3111,7 @@ main (int argc, char **argv)
   bool mergeonly = false;
   char *random_source = NULL;
   bool need_random = false;
+  unsigned long int nthreads = 0;
   size_t nfiles = 0;
   bool posixly_correct = (getenv ("POSIXLY_CORRECT") != NULL);
   bool obsolete_usage = (posix2_version () < 200112);
@@ -3361,6 +3434,10 @@ main (int argc, char **argv)
          add_temp_dir (optarg);
          break;
 
+       case THREADS_OPTION:
+         nthreads = specify_nthreads (oi, c, optarg);
+         break;
+
        case 'u':
          unique = true;
          break;
@@ -3506,6 +3583,9 @@ main (int argc, char **argv)
 
   if (need_random)
     {
+      /* Threading does not lock the randread_source structure, so
+        downgrade to one thread to avoid race conditions. */
+      nthreads = 1;
       randread_source = randread_new (random_source, MD5_DIGEST_SIZE);
       if (! randread_source)
        die (_("open failed"), random_source);
@@ -3560,7 +3640,21 @@ main (int argc, char **argv)
       IF_LINT (free (sortfiles));
     }
   else
-    sort (files, nfiles, outfile);
+    {
+      if (!nthreads)
+       {
+         /* The default NTHREADS is 2 ** floor (log2 (number of processors)).
+            If comparisons do not vary in cost and threads are
+            scheduled evenly, with the current algorithm there is no
+            performance advantage to using a number of threads that
+            is not a power of 2.  */
+         unsigned long int np2 = num_processors () / 2;
+         for (nthreads = 1; nthreads <= np2; nthreads *= 2)
+           continue;
+       }
+
+      sort (files, nfiles, outfile, nthreads);
+    }
 
   if (have_read_stdin && fclose (stdin) == EOF)
     die (_("close failed"), "-");
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 5f150ad..2e2bfec 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -201,6 +201,7 @@ TESTS =                                             \
   misc/shred-remove                            \
   misc/shuf                                    \
   misc/sort                                    \
+  misc/sort-benchmark-random                   \
   misc/sort-compress                           \
   misc/sort-continue                           \
   misc/sort-files0-from                                \
diff --git a/tests/misc/sort-benchmark-random b/tests/misc/sort-benchmark-random
new file mode 100644
index 0000000..2bace4f
--- /dev/null
+++ b/tests/misc/sort-benchmark-random
@@ -0,0 +1,67 @@
+#!/bin/sh
+# Benchmark sort on randomly generated data.
+
+# Copyright (C) 2009 Free Software Foundation, Inc.
+
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+# You should have received a copy of the GNU General Public License
+# along with this program.  If not, see <http://www.gnu.org/licenses/>.
+
+# Written by Glen Lenker.
+
+if test "$VERBOSE" = yes; then
+  set -x
+  sort --version
+fi
+
+. $srcdir/test-lib.sh
+
+very_expensive_
+
+# Use the 'C' locale to avoid problems in case Perl's sort isn't stable.
+LC_ALL=C
+export LC_ALL
+
+fail=0
+
+perl -e '
+my $num_lines = 500000;
+my $length = 100;
+
+for (my $i=0; $i < $num_lines; $i++)
+{
+    for (my $j=0; $j < $length; $j++)
+    {
+       printf "%c", 32 + rand(94);
+    }
+    print "\n";
+}' > in || framework_failure
+
+# We need to generate a lot of data for sort to show a noticeable
+# improvement in performance. Sorting it in PERL may take awhile.
+
+perl -e '
+open (FILE, "<in");
+my @list = <FILE>;
+print sort(@list);
+close (FILE);
+' > exp || framework_failure
+
+#start=$(date +%s)
+time sort in > out || fail=1
+#duration=$(expr $(date +%s) - $start)
+
+#echo sorting the random data took $duration seconds
+
+compare out exp || fail=1
+
+Exit $fail
-- 
1.6.2.1





reply via email to

[Prev in Thread] Current Thread [Next in Thread]