bug-coreutils

Re: Degraded performance in cat + patch


From: Jim Meyering
Subject: Re: Degraded performance in cat + patch
Date: Fri, 06 Mar 2009 14:45:32 +0100

Pádraig Brady wrote:
...
> Actually reading your performance results more closely showed
> the throughput actually doubled? That surprises me.

My 50% was *speed-up*, i.e., the reduction in elapsed time.
Of course, throughput is the inverse: 2x.
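
The relation between the two figures is simple arithmetic. Here is a minimal sketch in C, assuming "speed-up" means the fractional reduction in elapsed time for the same amount of data. The function name is made up for illustration and does not come from the patch.

```c
#include <assert.h>

/* Hypothetical helper: given the fraction by which elapsed time
   shrank (0.5 for "50%"), return the factor by which throughput
   grew for the same amount of data.  */
double
throughput_factor (double time_reduction)
{
  /* A reduction of 100% or more would mean zero or negative time.  */
  assert (0.0 <= time_reduction && time_reduction < 1.0);
  return 1.0 / (1.0 - time_reduction);
}
```

Calling throughput_factor (0.5) gives 2.0, which is the "2x" above.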

> Why such a huge syscall overhead? Testing with dd on a
> 1.7GHz pentium-m with 2.6.24.5-85.fc8 shows much less:

Probably due to my faster memory, caches, etc.
The systems I tested on have 800MHz and 667MHz DDRII.
I suspect that the relative syscall overhead is more
apparent with the faster memory.

I have added my numbers from the AMD system running rawhide:

> $ truncate -s2G test.cat
>
> $ dd bs=4x1024 if=test.cat of=/dev/null
> 2147483648 bytes (2.1 GB) copied, 6.57765 s, 326 MB/s

  2147483648 bytes (2.1 GB) copied, 2.0886 s, 1.0 GB/s

> $ dd bs=32x1024 if=test.cat of=/dev/null
> 2147483648 bytes (2.1 GB) copied, 5.74548 s, 374 MB/s

  2147483648 bytes (2.1 GB) copied, 1.06558 s, 2.0 GB/s
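
For scale, here is a rough sketch in C of how many read calls each block size implies for this 2 GiB file. It assumes every read returns a full block and ignores the final read that detects end of file. The helper name is illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative helper: number of read calls needed to consume
   FILE_SIZE bytes when each call returns BLOCK_SIZE bytes,
   rounding up for a final partial block.  */
uint64_t
reads_needed (uint64_t file_size, uint64_t block_size)
{
  assert (block_size > 0);
  return (file_size + block_size - 1) / block_size;
}
```

Under those assumptions, reads_needed (2147483648, 4 * 1024) is 524288 and reads_needed (2147483648, 32 * 1024) is 65536, i.e., one eighth as many calls.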

> So trying with cat...
>
> $ /usr/bin/time ./cat test.cat >/dev/null
> 0.29user 5.28system 0:06.55elapsed 85%CPU (0avgtext+0avgdata 0maxresident)k
> 472inputs+0outputs (2major+149minor)pagefaults 0swaps

With my 32-KiB buffer:
  0.01user 1.00system 0:01.03elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+172minor)pagefaults 0swaps
  0.00user 1.01system 0:01.08elapsed 94%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+172minor)pagefaults 0swaps
  0.00user 1.01system 0:01.02elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+172minor)pagefaults 0swaps
  0.00user 1.01system 0:01.04elapsed 97%CPU (0avgtext+0avgdata 0maxresident)k
  0inputs+0outputs (0major+171minor)pagefaults 0swaps

BTW, thanks for the log message clarification.
Here's what I'll probably use:

    cat: use larger buffer sizes to reduce read/write-syscall overhead

    * src/cat.c (max): Remove definition.  Use MAX from system.h instead.
    (compute_buffer_size): New function to compute the input and output
    buffer sizes, which are now set at 8 times st_blksize with a maximum
    of 32KiB. Previously the typical block sizes used were 1KiB for pipes
    and 4KiB for files, and now will be 8KiB and 32KiB respectively.
    (main): Use it.
    This change can double throughput on modern systems.  For timings,
    see http://article.gmane.org/gmane.comp.gnu.core-utils.bugs/16040
    Suggestion from Tzvi Rotshtein.
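
The sizing rule in that log entry can be sketched roughly as follows. This is not the code from the patch: the function name, the constant names and the way the cap is applied are assumptions for illustration. Only the rule itself ("8 times st_blksize with a maximum of 32KiB") comes from the log message.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed names; the real patch may spell these differently.  */
enum { BUF_MULTIPLIER = 8, BUF_MAX = 32 * 1024 };

/* Sketch: scale the st_blksize value reported by fstat by 8,
   capping the result at 32 KiB.  Comparing before multiplying
   avoids overflow for very large block sizes.  */
size_t
sketch_buffer_size (size_t st_blksize)
{
  if (st_blksize >= (size_t) (BUF_MAX / BUF_MULTIPLIER))
    return BUF_MAX;
  return st_blksize * BUF_MULTIPLIER;
}
```

With the typical block sizes mentioned in the log, sketch_buffer_size (1024) gives 8192 for pipes and sketch_buffer_size (4096) gives 32768 for files.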



