bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster
From: Paul Eggert
Subject: bug#6906: [PATCH] cp: copy entirely-sparse files oodles faster
Date: Tue, 24 Aug 2010 22:37:02 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.11) Gecko/20100713 Thunderbird/3.0.6
(By "oodles faster" I mean "as much faster as you like".
The benchmark below shows a 2800x speedup.)
In response to an idea by Kit Westneat for GNU tar reported in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>,
Eric Blake wrote:
> Meanwhile, if you are indeed correct that there are easy ways to detect
> completely sparse files, even when the ioctl or SEEK_HOLE directives are
> not present, then the coreutils cp(1) hole iteration routine should
> probably be taught that corner case to recognize an entirely sparse file
> as a single hole.
Here's a patch to coreutils to implement this idea. It's based on a patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html> that
I just now installed into GNU tar. I think of it as a quick first cut
at a full fiemap / SEEK_HOLE implementation, but unlike the full
implementation this optimization does not depend on any special ioctls
or lseek extensions, so it should work on any POSIX or POSIX-like host.
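As a rough standalone illustration of the check (a sketch, not the
patch itself): a regular file with nonzero length but zero allocated
blocks can be treated as one big hole, so the destination only needs
its length set. The helper names entirely_sparse and
copy_if_entirely_sparse are hypothetical, and the sketch reads
st_blocks directly where coreutils uses gnulib's ST_NBLOCKS macro. It
also assumes the file system reports st_blocks accurately.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: true if the file described by SB has a
   nonzero length but no allocated blocks.  */
static bool
entirely_sparse (struct stat const *sb)
{
  return sb->st_size != 0 && sb->st_blocks == 0;
}

/* Hypothetical helper: if SRC_FD is an entirely sparse regular file,
   give DST_FD the same length without reading any data.
   Return 1 if the shortcut was taken, 0 if the caller must copy
   the usual way, and -1 (with errno set) on failure.  */
static int
copy_if_entirely_sparse (int src_fd, int dst_fd)
{
  struct stat sb;

  if (fstat (src_fd, &sb) != 0)
    return -1;
  if (! S_ISREG (sb.st_mode) || ! entirely_sparse (&sb))
    return 0;

  /* Extending a file with ftruncate leaves the new range unwritten;
     it reads back as zeros.  */
  return ftruncate (dst_fd, sb.st_size) == 0 ? 1 : -1;
}
```

A caller would try copy_if_entirely_sparse first and fall back to the
ordinary read/write loop whenever it returns 0.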
On a simple benchmark this sped up GNU cp by a factor of 2800
(measuring by real-time seconds) on my host:
$ truncate -s 10GB bigfile
$ time old/cp bigfile bigfile-slow
real 2m3.231s
user 0m1.497s
sys 0m5.738s
$ time new/cp bigfile bigfile-fast
real 0m0.044s
user 0m0.000s
sys 0m0.002s
$ ls -ls bigfile*
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:11 bigfile
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-fast
0 -rw-r--r-- 1 eggert csfac 10000000000 Aug 24 22:14 bigfile-slow
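For readers who want to reproduce the setup from C, here is a minimal
sketch of what the "truncate -s" step and the "ls -ls" check amount
to: set the file's length without writing data, then compare the
allocated block count with the length. The helper names
make_sparse_file and print_usage are hypothetical. Note that st_blocks
is usually counted in 512-byte units, which need not match the units
ls prints, and whether it comes out as 0 depends on the file system.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: create PATH with length SIZE without writing
   any data, and store the resulting metadata in *SB.
   Return 0 on success, -1 on failure.  */
static int
make_sparse_file (char const *path, off_t size, struct stat *sb)
{
  int fd = open (path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
  if (fd < 0)
    return -1;

  int ok = ftruncate (fd, size) == 0 && fstat (fd, sb) == 0;
  if (close (fd) != 0)
    ok = 0;
  return ok ? 0 : -1;
}

/* Print the allocated block count and the length in bytes, roughly
   the two numbers to compare in "ls -ls" output.  */
static void
print_usage (char const *path, struct stat const *sb)
{
  printf ("%lld blocks, %lld bytes: %s\n",
          (long long) sb->st_blocks, (long long) sb->st_size, path);
}
```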
From 2e535b590d675e6d96f954c1f840d678fb133f6a Mon Sep 17 00:00:00 2001
From: Paul Eggert <address@hidden>
Date: Tue, 24 Aug 2010 22:20:55 -0700
Subject: [PATCH] cp: copy entirely-sparse files oodles faster
* src/copy.c (copy_reg): Bypass reads if the file is entirely
sparse. Idea suggested by Kit Westneat via Bernd Shubert in
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00038.html>
for the Lustre file system. Implementation stolen from my patch
<http://lists.gnu.org/archive/html/bug-tar/2010-08/msg00043.html>
to GNU tar. On my machine this sped up a cp benchmark, which
copied a 10 GB entirely-sparse file on an NFS file system, by a
factor of 2800 in real seconds.
---
src/copy.c | 18 +++++++++++++++---
1 files changed, 15 insertions(+), 3 deletions(-)
diff --git a/src/copy.c b/src/copy.c
index 6d11ed8..1e79523 100644
--- a/src/copy.c
+++ b/src/copy.c
@@ -669,10 +669,21 @@ copy_reg (char const *src_name, char const *dst_name,
#endif
}
- /* If not making a sparse file, try to use a more-efficient
- buffer size. */
- if (! make_holes)
+ if (make_holes)
{
+ /* For speed, bypass reads if the file is entirely sparse. */
+
+ if (src_open_sb.st_size != 0 && ST_NBLOCKS (src_open_sb) == 0)
+ {
+ n_read_total = src_open_sb.st_size;
+ goto set_dest_size;
+ }
+ }
+ else
+ {
+ /* Not making a sparse file, so try to use a more-efficient
+ buffer size. */
+
/* Compute the least common multiple of the input and output
buffer sizes, adjusting for outlandish values. */
size_t blcm_max = MIN (SIZE_MAX, SSIZE_MAX) - buf_alignment_slop;
@@ -788,6 +799,7 @@ copy_reg (char const *src_name, char const *dst_name,
if (last_write_made_hole)
{
+ set_dest_size:
if (ftruncate (dest_desc, n_read_total) < 0)
{
error (0, errno, _("truncating %s"), quote (dst_name));
--
1.7.2