bug-coreutils
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Degraded performance in cat + patch


From: Jim Meyering
Subject: Re: Degraded performance in cat + patch
Date: Wed, 11 Mar 2009 13:07:16 +0100

Pádraig Brady wrote:
> Jim Meyering wrote:
>> Pádraig Brady wrote:
>>> I was thinking a bit more about the patch just pushed.
>>> It sets the buffer size to 8*st_blksize which seems a
>>> little arbitrary, and also max buffer size is set to
>>> 32KiB even if the file system has a larger st_blksize.
>>> I'm not sure this is desired?
>>>
>>> How about making 32KiB the minimum as in the attached?
>>> The patch also changes `split` and `copy` to use the
>>> same IO size as `cat`.
>> ...
>>> +#define IO_BLKSIZE(statbuf) (MAX(32*1024, ST_BLKSIZE(statbuf)))
>>
>> That looks better.  Thanks.
>>
>> But please make it a static inline function, not a macro.
>> That will be more debugger-friendly.
>
> Updated one attached. Would you like this in 7.2?

Yes, please.

> Note I did a quick audit of the IO in coreutils filters
> to see if this would be appropriate for other utils:
>
> ----------------------------------------------------------
> filter          read            write       stdio_r stdio_w
> ----------------------------------------------------------
...
Nice.  it will take me a while to digest all that.

> Notes...
>
> tee turns off buffering, so I can't really see why it uses stdio at all.
> I think I'll update it to use read/write directly and also change
> the buffer size to 32KiB to match cat et. al.

Maybe not worth it.
I have a preference for using the stream functions (in spite of their
warts), unless there's a strong argument for using lower-level ones.

> There are multiple methods for reading lines that could be consolidated.
>   getline: date, md5sum, dircolors
>   getndelim2: cut (uses getc underneath)
>   linebuffer: nl, join, comm, uniq
>   internal: fold, head, tail, ...

Interesting.

>>From 9c8de5e7240e96dffbd9701a41d9e2e96a7e69c9 Mon Sep 17 00:00:00 2001
> From: =?utf-8?q?P=C3=A1draig=20Brady?= <address@hidden>
> Date: Fri, 6 Mar 2009 22:30:55 +0000
> Subject: [PATCH] cat,cp,mv,install,split: Set the minimum IO block size used 
> to 32KiB
>
> This is following on from 02c3dc9de8d3cf26b319b7a7b144878a2afb91bb

Please make references to earlier SHA1 values shorter,
but include date and one-liner to disambiguate, if ever needed.

> which increased the IO block size used by cat by 8 times,
> but also capped it at 32KiB. Note the above mentioned utils all
> copy data in blocks and do not use stdio streams.

Please remove the "Note ..." sentence, since it doesn't apply
to the change in question.

> * NEWS: Mention the change in behavior.
> * src/system.h: Add a new io_blksize() function that
> returns the max of ST_BLKSIZE or 32KiB, as this value

s/ value//

> was seen as a good value for a minimum block size to use
> to get good performance while minimizing system call overhead.
> * src/cat.c: Use it.
> * src/copy.c: ditto
> * src/split.c: ditto
> ---
>  NEWS         |    5 +++++
>  src/cat.c    |   10 ++--------
>  src/copy.c   |   13 ++-----------
>  src/split.c  |    2 +-
>  src/system.h |   50 ++++++++++++++++++++++++++++++++++++++++++++++++--
>  5 files changed, 58 insertions(+), 22 deletions(-)
>
> diff --git a/NEWS b/NEWS
> index fd101a4..2500280 100644
> --- a/NEWS
> +++ b/NEWS
> @@ -24,6 +24,11 @@ GNU coreutils NEWS                                    -*- 
> outline -*-
>    Previously -k1,1b would have caused leading space from field 2 to be
>    included in the sort while -k2,3.0 would have not included field 3.
>
> +** Changes in behavior
> +
> +  cp,mv,install,cat,split: now read and write a minimum 32KiB of data
> +  at a time. This was seen to increase throughput. Up to 2 times

s/\./. /g
Please put two spaces after each period.

I'm glad you're also replacing uses of ST_BLKSIZE.

...
> diff --git a/src/system.h b/src/system.h
> index ba74da4..9dbf2ba 100644
> --- a/src/system.h
> +++ b/src/system.h
> @@ -147,6 +147,9 @@ enum
>  # define D_INO(dp) NOT_AN_INODE_NUMBER
>  #endif
>
> +/* include here for SIZE_MAX.  */
> +#include <inttypes.h>
> +
>  /* Get or fake the disk device blocksize.
>     Usually defined by sys/param.h (if at all).  */
>  #if !defined DEV_BSIZE && defined BSIZE
> @@ -218,6 +221,51 @@ enum
>  # endif
>  #endif
>
> +/* As of Mar 2009, 32KiB is determined to be the minimium
> + * blksize to best minimize system call overhead.
> + * This can be tested with this script with the results
> + * shown for a 1.7GHz pentium-m with 2GB of 400MHz DDR2 RAM:

Please don't use " * " at the beginning of a comment line.
for consistency.

> +
> +  for i in $(seq 0 10); do
> +    size=$((8*1024**3)) #ensure this is big enough
> +    bs=$((1024*2**$i))
> +    printf "%7s=" $bs
> +    dd bs=$bs if=/dev/zero of=/dev/null count=$(($size/$bs)) 2>&1 |
> +    sed -n 's/.* \([0-9.]* [GM]B\/s\)/\1/p'
> +  done
> +
> +     1024=734 MB/s
> +     2048=1.3 GB/s
> +     4096=2.4 GB/s
> +     8192=3.5 GB/s
> +    16384=3.9 GB/s
> +    32768=5.2 GB/s
> +    65536=5.3 GB/s
> +   131072=5.5 GB/s
> +   262144=5.7 GB/s
> +   524288=5.7 GB/s
> +  1048576=5.8 GB/s
> +
> + * Note that this is to minimize system call overhead.
> + * Other values may be appropriate to minimize file system
> + * or disk overhead. For example on my current linux system

Use two spaces after each period.

> + * the readahead setting is 128KiB which was read using:
> +
> +  file="."
> +  device=$(df -P --local "$file" | tail -n1 | cut -d' ' -f1)
> +  echo $(( $(blockdev --getra $device) * 512 ))
> +
> + * However there isn't a portable way to get the above.
> + * In the future we could use the above method if available
> + * and default to io_blksize() if not.
> + */
> +#define IO_BUFSIZE (32*1024)

Please avoid cpp #defined constants.
Use an anonymous enum instead:

  enum { IO_BUFSIZE = 32*1024; };

> +static inline size_t
> +io_blksize (struct stat sb)
> +{
> +    return MAX (IO_BUFSIZE, ST_BLKSIZE(sb));

Formatting nits:

  Put only two spaces before "return", and one before the opening paren
  after ST_BLKSIZE:

  return MAX (IO_BUFSIZE, ST_BLKSIZE (sb));

> +}
> +
...

Thanks for enduring the nit-picking.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]