bug-coreutils

Re: better buffer size for copy


From: Phillip Susi
Subject: Re: better buffer size for copy
Date: Tue, 22 Nov 2005 14:18:37 -0500
User-agent: Thunderbird 1.5 (Windows/20051025)

Robert Latham wrote:
I mean no offense cutting out most of your points.  You describe great
ways to achieve high I/O rates for anyone writing a custom file mover.
I shouldn't have mentioned network file systems.  It's a distraction
from the real point of my patch: cp(1) should consider both the source
and the destination st_blksize.


No problem... I kind of went off on a tangent there.

All I expect from st_blksize is what the stat(2)
manpage suggests:

       The value st_blocks gives the size of  the  file  in  512-byte
       blocks.  (This  may  be  smaller than st_size/512 e.g. when the
       file has holes.) The value st_blksize gives the "preferred"
       blocksize for efficient file system  I/O.  (Writing to a file
       in smaller chunks may cause an inefficient
       read-modify-rewrite.)

All I really want is for cp(1) to do the right thing no matter what
the source or destination st_blksize value might be.

OK, I see what you are talking about now. Using a copy block size smaller than the filesystem block size can result in a lot of extra I/O, which reduces throughput. In a typical use case this doesn't really apply, because the kernel caches the writes and combines them when it flushes them to disk. Still, yes, it is a good idea to use an I/O block size that is at least as large as the larger of the source and destination filesystem block sizes.
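To make the "at least as large as the larger of the two" rule concrete, here is a minimal sketch in Python rather than in cp(1)'s own C. The names pick_buffer_size and buffer_size_for are made up for illustration, the 4096-byte fallback is an assumption for platforms where st_blksize is missing or zero, and none of this is taken from the patch under discussion.

```python
import os

FALLBACK_BLKSIZE = 4096  # assumption: used only when st_blksize is missing or zero


def pick_buffer_size(src_blksize: int, dst_blksize: int) -> int:
    """Return a copy buffer size at least as large as both preferred block sizes."""
    return max(src_blksize, dst_blksize)


def buffer_size_for(src_path: str, dst_path: str) -> int:
    """Look up st_blksize for two existing paths and pick the larger one."""
    # st_blksize is reported on Unix-like systems; getattr guards other platforms.
    src = getattr(os.stat(src_path), "st_blksize", 0) or FALLBACK_BLKSIZE
    dst = getattr(os.stat(dst_path), "st_blksize", 0) or FALLBACK_BLKSIZE
    return pick_buffer_size(src, dst)
```

With this rule, a 4k source and a 64k destination give the same buffer size as a 64k source and a 4k destination, so the direction of the copy no longer matters.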

In copying from a 4k blocksize filesystem to a 64k blocksize
filesystem, cp(1) will perform well, as it is using a 64k buffer.
In copying *from* that 64k blocksize filesystem *to* a 4k blocksize
filesystem, cp(1) will not perform as well: it's using a 4k buffer and
so reading from the source filesystem in less-than-ideal chunks.


Again, this probably won't happen in practice due to the influence of the filesystem cache, but I do see your point. That said, I don't know of any filesystem with a 64k block size. By default ext2/3 use 1k, and reiserfs uses 4k. These are going to be typical values for st_blksize, yet if you use a copy block size of, say, 64k, I think you will find the performance to be significantly better than with either 1k or 4k. A good case in point is copying to or from a typical ext2/3 filesystem that uses a 1k block size: using a buffer smaller than a single 4k page is going to significantly degrade performance. You certainly do not want to go smaller than the block size, but really, you should be going larger.
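One way to read "do not go smaller than the block size, but really go larger" is to treat st_blksize as a lower bound and apply a fixed floor on top of it. The sketch below assumes a 64k floor, taken from the figure mentioned above and not from any measurement. The name copy_with_floor and the constant MIN_COPY_BUFFER are hypothetical, and this is not how cp(1) itself is implemented.

```python
import os

MIN_COPY_BUFFER = 64 * 1024  # assumption: 64k floor, as suggested in the text


def copy_with_floor(src_path: str, dst_path: str, floor: int = MIN_COPY_BUFFER) -> int:
    """Copy src_path to dst_path and return the buffer size that was used."""
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        # Treat a missing or zero st_blksize as 0 so the floor takes over.
        src_blk = getattr(os.fstat(src.fileno()), "st_blksize", 0) or 0
        dst_blk = getattr(os.fstat(dst.fileno()), "st_blksize", 0) or 0
        bufsize = max(floor, src_blk, dst_blk)
        while True:
            chunk = src.read(bufsize)
            if not chunk:  # empty bytes object means end of file
                break
            dst.write(chunk)
    return bufsize
```

On a filesystem reporting a 1k or 4k st_blksize this copies in 64k chunks, while a filesystem reporting something larger than the floor still gets its preferred size.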

Thanks again for taking the time to respond.  I hope I have made the
intent of my patch more clear.
==rob


You did... and I thank you as well and hope that I have made myself more clear.




