bug-coreutils

bug#18681: cp Specific fail example


From: Leslie S Satenstein
Subject: bug#18681: cp Specific fail example
Date: Sun, 12 Oct 2014 19:11:45 -0700

Further to Bob's explanation:
Suppose you were to copy a 10 GB file across the network. cp would work just
fine, though it could take several hours. But if there were an error in the
transmission (a bad block), or you had to stop and restart, you would need to
rerun cp and copy the file from the beginning. rsync would instead take
checksums of the parts of the file already on the remote side, compare them
with the local copy, and resume at the first file offset that does not match.
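
The resume idea can be sketched locally with coreutils alone. The helper
below, resume_copy, is a hypothetical name and not part of cp or rsync. It
assumes head and tail accept -c (as the GNU versions do), and it only checks
that the partial destination is an intact prefix of the source, which is much
cruder than rsync's block-by-block checksums. Note also that rsync keeps an
interrupted partial file only when it is run with --partial.

```shell
#!/bin/sh
# resume_copy SRC DST
# Hypothetical helper (an illustration, not how cp or rsync is implemented).
# If DST is an intact prefix of SRC, append only the missing tail;
# otherwise fall back to a full copy.
resume_copy() {
    src=$1
    dst=$2

    # No partial copy yet: do a plain full copy.
    if [ ! -f "$dst" ]; then
        cp -- "$src" "$dst"
        return
    fi

    # Sizes in bytes; the arithmetic expansion strips any padding from wc.
    src_size=$(( $(wc -c < "$src") ))
    dst_size=$(( $(wc -c < "$dst") ))

    # Compare a checksum of the first dst_size bytes of SRC with DST.
    if [ "$dst_size" -le "$src_size" ] &&
       [ "$(head -c "$dst_size" "$src" | cksum)" = "$(cksum < "$dst")" ]; then
        # Prefix is intact: copy only the bytes after offset dst_size.
        tail -c +"$((dst_size + 1))" "$src" >> "$dst"
    else
        # Prefix is damaged (or DST is longer than SRC): start over.
        cp -- "$src" "$dst"
    fi
}
```

For example, running "resume_copy big.img /mnt/backup/big.img" (illustrative
paths) after an interrupted copy would append only the missing tail. Because
the sketch rereads the existing prefix on both sides to checksum it, it saves
writes rather than reads. Over a network, rsync computes checksums at each
end so that only checksums and the missing data have to cross the wire.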

 


Regards 

 Leslie

Mr. Leslie Satenstein
Montreal, Quebec, Canada




>________________________________
> From: Bob Proulx <address@hidden>
>To: Linda Walsh <address@hidden> 
>Cc: address@hidden; "Polehn, Mike A" <address@hidden> 
>Sent: Sunday, October 12, 2014 8:54 PM
>Subject: bug#18681: cp Specific fail example
> 
>
>Linda Walsh wrote:
>> Bob Proulx wrote:
>> > Meanwhile...  I would be one of those suggesting that perhaps you
>> > should try using rsync instead of cp.  The cp command is lean and
>> > mean by comparison to rsync (and should stay that way).  But rsync
>> > has many attractive features for doing large copies.
>>
>> ---- fwiw...---
>> Like large execution times... from the latest snapshot on my system --
>> I use rsync to only move differences between  yesterday and "today[whenever
>> new snap is taken]"... it was a larger than normal snap -- most only
>> take 75-90 minutes...but rsync (these are the script messages) with some
>> debugging output still turned on... even an rm over the resulting diff
>> took 101 seconds... then cp comes along.. even w/a sync it would
>> still be under a minute.
>
>Wow.  Just to be clear an rsync copy took 75 to 90 minutes but a cp
>copy took less than 1 minute?  I find that very suspicious.  I never
>see that much difference between them.  Are you sure the difference
>wasn't that the data was cached into ram by the rsync and therefore
>the second run with cp just ran with the warmed up cache?  With a
>large data set and a large ram that is plausible.
>
>> I.e. rsync copied just the diffs to "/home.diff", then
>> find with "-empty -delete" is used to get rid of empty dirs (rsync
>> creates many of these).  then a static partition is created to hold
>> the "diff" output -- and cp walked and copied the tree in 12s.
>> (output wasn't flushed, but it's not that long.. <a minute...).
>
>It appears that you are using features from rsync that do not exist in
>cp.  Therefore the work being done in the task isn't equivalent work.
>In that case it is probably quite reasonable for rsync to be slower
>than cp.
>
>Also consider that if cp were to acquire all of the enhancements that
>have been requested for cp as time has gone by then cp would be just
>as featureful (bloated!) as rsync and likely just as slow as rsync
>too.  This is something to consider every time someone asks for a
>creeping feature to cp.  Especially if they say they want the feature
>in cp because it is faster than rsync.  The natural progression is
>that cp would become rsync.
>
>> If rsync wasn't so slow at local I/O...*sigh*....
>
>The advantage of rsync is that it can be interrupted and restarted and
>the restarted process will efficiently avoid doing work that is
>already done.  An interrupted and restarted cp will perform the same
>work again from start to finish.
>
>If I am doing a simple copy from A to B then I use 'cp -av A B'.  If I
>am doing it the second time then I will use rsync to avoid repeating
>previously done work 'rsync -av A B'.
>
>If I want progress indication...  If I want placement of backup files
>in a particular directory...  If I want other fancy features that are
>provided by rsync then it is worth it to use rsync.
>
>  $ du -s coreutils
>  238920  coreutils
>  $ find coreutils -type f | wc -l
>  15013
>
>  $ rm -rf junk/coreutils
>  # echo 3 > /proc/sys/vm/drop_caches
>  $ time cp -a coreutils junk/
>  real    1m2.137s
>  user    0m0.140s
>  sys     0m1.724s
>
>  $ rm -rf junk/coreutils
>  $ time cp -a coreutils junk/
>  real    0m2.492s
>  user    0m0.060s
>  sys     0m1.064s
>
>  $ rm -rf junk/coreutils
>  # echo 3 > /proc/sys/vm/drop_caches
>  $ time rsync -a coreutils junk/
>  real    1m5.473s
>  user    0m1.280s
>  sys     0m2.112s
>
>  $ rm -rf junk/coreutils
>  $ time rsync -a coreutils junk/
>  real    0m3.215s
>  user    0m1.184s
>  sys     0m1.536s
>
>For normal use cp is a little faster than rsync.  Or rather rsync is a
>little slower than cp.  But not enough to make a difference for
>typical operations.  Having the file system cache warmed up makes a
>*HUGE* difference.  Much larger than any other difference.  For copies
>that take hours to run I am probably going to value the restart
>ability more than raw speed.  YMMV.
>
>
>
>
>
>Bob
>
>
>
>
>
>

