bug-coreutils

bug#6131: [PATCH]: fiemap support for efficient sparse file copy


From: Sunil Mushran
Subject: bug#6131: [PATCH]: fiemap support for efficient sparse file copy
Date: Thu, 27 May 2010 10:28:10 -0700
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.1.9) Gecko/20100423 Thunderbird/3.0.4

Jim Meyering wrote:
Hi Jeff,

I've included below the state of my local changes.
Unfortunately, with that 5-patch series, there is always a test failure
on F13/ext4.  Maybe someone who knows more about extents can provide an
explanation?

Here's a small example to demonstrate:

Create a file with many extents:

     perl -e 'BEGIN { $n = 19 * 1024; *F = *STDOUT }' \
       -e 'for (1..100) { sysseek (*F, $n, 1)' \
       -e '&& syswrite (*F, "."x$n) or die "$!"}' > j1

Using the patched "cp", repeat the following 10 or 20 times:

     ./cp --sparse=always j1 j2; sync
     filefrag -v j1 | awk '/^ / {print $1,$2}' > ff1 || fail=1
     filefrag -v j2 | awk '/^ / {print $1,$2}' > ff2 || fail=1
     diff -u ff1 ff2 || fail=1

Usually there is no diff output, but occasionally it'll print this:
[hmm... today it consistently prints these differences every other time.]


It does not work because the sparse file created by cp may not be
sparse (or not sparse enough). That is because cp reads in chunks of
st_blksize and skips the write only if the entire chunk is zero. The
perl script creates the file in 19K chunks (alternating holes and
writes).

So on a 4K fs, each hole in the file created by the script will span
about 4 blocks (avg). But when cp copies it, it could fill in those holes
because the read granularity could be coarser. For example, ocfs2 reports
the fs cluster size (allocation size) in st_blksize, which could be larger
than the block size.
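
To illustrate the granularity effect, here is a rough Python sketch. It is a
simplified model, not cp's actual code: it assumes the copy skips a write only
when an entire aligned chunk reads back as zero, as described above. The
function name and parameters are made up for this example.

```python
def bytes_left_as_holes(chunk_size, n=19 * 1024, count=100):
    """Model a chunked sparse copy of the test file.

    The source is `count` repetitions of (n-byte hole, n bytes of "."),
    matching the perl script. Returns how many bytes the copy would skip
    (leave as holes) when it skips only chunks that are entirely zero.
    """
    data = (bytes(n) + b"." * n) * count
    skipped = 0
    for off in range(0, len(data), chunk_size):
        chunk = data[off:off + chunk_size]
        # Skip the write only if every byte in the chunk is zero.
        if chunk.count(0) == len(chunk):
            skipped += len(chunk)
    return skipped


if __name__ == "__main__":
    for size in (4096, 16384, 65536):
        print(size, bytes_left_as_holes(size))
    # With 64K chunks nothing is skipped: no 64K chunk fits inside a 19K hole.
```

With 4K chunks each 19K hole keeps 3 or 4 whole blocks; with chunks larger
than 19K the copy ends up with no holes at all, so the extent lists of source
and destination cannot be expected to match.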

My suggestion is to not use filefrag but to use md5sum to compare the two
files. In the end, the aim of this feature is to reduce the number of reads.
Furthermore, the number of extents created is a function not only of cp but
also of the file system. (A fs does not guarantee the number of extents a
file will have. It only guarantees that the data out is the same as the
data in.)
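
A content comparison along those lines could look roughly like this in Python.
This is only a sketch of the idea, not the proposed test itself; the helper
names (file_md5, make_test_file, same_contents) are hypothetical.

```python
import hashlib
import os
import tempfile


def file_md5(path, bufsize=1 << 16):
    """Return the hex MD5 digest of a file's contents (holes read as zeros)."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(bufsize), b""):
            digest.update(block)
    return digest.hexdigest()


def make_test_file(path, n=19 * 1024, count=100, sparse=True):
    """Write `count` repetitions of (n zero bytes, n bytes of ".").

    If `sparse` is true the zeros are produced by seeking, as in the perl
    script; otherwise they are written out explicitly.
    """
    with open(path, "wb") as f:
        for _ in range(count):
            if sparse:
                f.seek(n, os.SEEK_CUR)
            else:
                f.write(b"\0" * n)
            f.write(b"." * n)


def same_contents(a, b):
    """True if both files have the same data, regardless of extent layout."""
    return file_md5(a) == file_md5(b)


if __name__ == "__main__":
    d = tempfile.mkdtemp()
    j1, j2 = os.path.join(d, "j1"), os.path.join(d, "j2")
    make_test_file(j1, sparse=True)
    make_test_file(j2, sparse=False)
    print(same_contents(j1, j2))
```

The check passes whether or not the second file has holes, which is the
point: it tests the data, not how the file system laid it out.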




