
Re: [rdiff-backup-users] Q. on max-file-size behavior


From: Maarten Bezemer
Subject: Re: [rdiff-backup-users] Q. on max-file-size behavior
Date: Sun, 14 Mar 2010 15:31:13 +0100 (CET)


On Sat, 13 Mar 2010, Whit Blauvelt wrote:

> On Sat, Mar 13, 2010 at 11:58:42PM +0100, Jernej Simončič wrote:

>> I'd say this is expected behaviour - the destination saw the file on
>> previous run, but didn't see it on current run (because the source
>> likely doesn't inform it about files it skips), so it treats the file
>> as deleted on source.

> Probably so. A corner case then. Even though it would be easy for the source
> to inform it about files skipped and avoid this, it's probably not worth the
> coding effort.

I don't think this is even a corner case. If you want to exclude large files, then a file that is larger than the limit you specify (something you explicitly and deliberately do!) should not be in the backup. Also, it should not _remain_ in the 'current' backup tree, because it would no longer match the original in the source tree. Since rdiff-backup keeps a history of the backups, there is no other way than to treat it as 'deleted from the source'. That's the only way to keep the history intact AND have a proper 'current' backup tree.
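To illustrate the point, here is a rough sketch in Python of why a skipped file ends up looking like a deleted one. This is not rdiff-backup's actual code, and the size limit is made up; it only mimics the selection logic as I understand it:

import os

MAX_FILE_SIZE = 100 * 1024 * 1024  # hypothetical limit, think --max-file-size

def select_files(source_dir, max_size):
    # Walk the source and keep only files at or below the limit,
    # just like the source side only offers the files it selects.
    selected = set()
    for root, _dirs, files in os.walk(source_dir):
        for name in files:
            path = os.path.join(root, name)
            if os.path.getsize(path) <= max_size:
                selected.add(os.path.relpath(path, source_dir))
    return selected

def files_that_look_deleted(previous_mirror, current_selection):
    # The destination never hears about skipped files, so anything in the
    # old mirror that is missing from the current selection is
    # indistinguishable from a file that was really deleted.
    return previous_mirror - current_selection

Whether the file grew past the limit or was really removed, the destination sees exactly the same thing.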


> Another question comes up though. If gzip'ing a huge file can cause a
> reasonably fast machine to tie up considerable resources for more than 30
> minutes because its logic tells it it's time to gzip a 16GB file, it would
> be good if there's a way to ask it not to do that.

Why would it?
If you want to remove a file from the backup (including the history), feel free to add wishlist items for patches or external tools to accomplish that. Aside from that, you could also run rdiff-backup with nice and/or ionice so it wouldn't "tie up" resources. (BTW, I don't think spending 30 minutes on a 16GB file is all that strange; even md5sum-ing a 4.7GB ISO image can take a few minutes on a busy system with lots of disk I/O.)
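Something along these lines would keep the backup at the lowest CPU and I/O priority (the paths are made up, adjust to your setup; the same command works just as well typed straight into a shell):

import subprocess

# Run rdiff-backup at the lowest CPU priority and in the idle I/O class,
# so a long compression run doesn't get in the way of everything else.
cmd = [
    "nice", "-n", "19",
    "ionice", "-c", "3",
    "rdiff-backup", "/home", "/backup/home",   # made-up paths
]
subprocess.check_call(cmd)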

> I see that compression can be
> turned off for all files, but not how to turn compression off just for the
> largest files. Is there some trick that would accomplish that? Basically,
> compression on smaller files is always good; compression on the very largest
> files is almost always bad; and somewhere in between - depending on system
> resources - it gets iffy. It would be useful to have a flag to set a
> file-size threshold, so that only files below it get compressed.

These are quite strong claims without any proof or supporting theory.
Compressing a 7KB file might indeed make it considerably smaller; suppose it comes out at 4.1KB when gzipped. But on file systems with 4KB blocks, that would not even save one block. And filesystems supporting multiple 16GB files tend to have larger block sizes anyway... Larger files, on the other hand, can often be compressed with much larger space savings. As always, it all depends on the type of data in the files, so YMMV.
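A quick back-of-the-envelope check of that small-file example, assuming a 4KB block size:

import math

def blocks_used(size_bytes, block_size=4096):
    # A file occupies whole filesystem blocks, rounded up.
    return math.ceil(size_bytes / block_size)

print(blocks_used(7 * 1024))          # 7KB original   -> 2 blocks
print(blocks_used(int(4.1 * 1024)))   # ~4.1KB gzipped -> still 2 blocks

So the compressed copy occupies exactly as many blocks on disk as the original.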

Contrary to what you suggest, I can think of two wishlist items that would make more sense. I'm not even posting them as wishlist items, though, as I don't think they would be worth implementing; a rough sketch of both follows the list.
1) limit the (cpu) time spent on compressing a file, and leave the file
   uncompressed when it takes too long. Heck, maybe even make it a
   user-configurable duration.
2) if compressing is taking longer than X seconds/minutes, check if
   compression is doing any good (check compression ratio for the part of
   the file that has already been processed) and leave the file
   uncompressed when the ratio suggests it wouldn't be worth continuing
   the compression process.
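A very rough sketch of what I mean. The thresholds and chunk size are invented, and it leans on idea 2; idea 1 would simply give up as soon as the time budget is exceeded:

import time
import zlib

MAX_SECONDS = 60            # idea 1: (user-configurable) time budget
MIN_SAVINGS_RATIO = 0.9     # idea 2: want output/input below this so far
CHUNK = 1024 * 1024

def try_compress(path):
    # Returns the compressed bytes, or None meaning "store it uncompressed".
    comp = zlib.compressobj()
    pieces = []
    bytes_in = bytes_out = 0
    start = time.monotonic()
    with open(path, "rb") as f:
        while True:
            chunk = f.read(CHUNK)
            if not chunk:
                break
            bytes_in += len(chunk)
            piece = comp.compress(chunk)
            bytes_out += len(piece)
            pieces.append(piece)
            if time.monotonic() - start > MAX_SECONDS:
                # Time budget exceeded: look at the ratio so far.
                # (compressobj buffers data internally, so this is only a
                # rough estimate of the real ratio.)
                if bytes_out / bytes_in > MIN_SAVINGS_RATIO:
                    return None
    pieces.append(comp.flush())
    return b"".join(pieces)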

Neither of these would help me with the disk image files I have here, though; those tend to have their big space savings at the end of the file. But then again, I wouldn't use rdiff-backup on them anyway.


Just my 2 cents.

Maarten
