Re: [Duplicity-talk] Sliding window backup strategy?


From: Lars Marowsky-Bree
Subject: Re: [Duplicity-talk] Sliding window backup strategy?
Date: Sun, 5 Oct 2008 20:37:59 +0200
User-agent: Mutt/1.5.17 (2007-11-01)

On 2008-10-05T12:37:39, Serge Wroclawski <address@hidden> wrote:

> Honestly, I'm not sure I entirely see the point when doing disk to
> disk backups.
> 
> What does a full backup buy you other than needing less files for a
> restore, and why is that important?

The point is that, with full+incremental, the chain of incrementals
grows over time, while more and more of the data in the last full
backup becomes outdated. (Stating the obvious.)

So to restore any given file, or the full tree, I need more and more
data, and transferring more data is time consuming.

It also means I become more and more prone to data corruption - if one
of the incrementals is lost, tough luck. A full backup basically also
serves as a checkpoint. (Compare how video compression inserts a full
keyframe once in a while.)
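
To put a toy number on this (a back-of-the-envelope model of mine, not
anything duplicity computes): if each volume independently survives
with probability p, a restore through the full plus k incrementals
succeeds with probability p^(k+1), so reliability decays with chain
length:

    # Toy model, not duplicity code: a restore needs the full backup
    # plus every incremental in the chain, so failure risk compounds.
    def restore_success(p_volume_ok: float, incrementals: int) -> float:
        """Probability that the full plus all incrementals are still
        readable, assuming independent per-volume survival."""
        return p_volume_ok ** (1 + incrementals)

    for k in (0, 10, 100, 365):
        print(f"{k:3d} incrementals: {restore_success(0.999, k):.3f}")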

The remote store also becomes less space efficient.

So, even with incremental as the base strategy, full backups make sense
from time to time.

However, I rarely want to pay the full cost at once; that's fine when
I'm at home and doing a backup from my laptop to another disk over
night; but on the road, that doesn't work for me.

> When you remove archives older than a certain date, it reconciles the
> data so you have a full copy of the backup, and full backups /are/
> expensive, so why perform them if you don't have to?

remove-older-than & remove-all-but-n-full are great and go in the
direction I wish to explore; they actually show that quite a bit of the
support is already there. This discussion helps me learn, thanks!
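
As a sketch of how that pruning could be scripted (remove-older-than
and remove-all-but-n-full are duplicity's actual actions; the target
URL and retention values below are placeholders):

    # Hypothetical pruning wrapper around duplicity's existing actions.
    import subprocess

    TARGET = "scp://user@backuphost/laptop"   # placeholder backend URL

    def prune_older_than(age: str) -> None:
        # e.g. age = "6M"; without --force, duplicity only lists what
        # it would delete
        subprocess.run(["duplicity", "remove-older-than", age,
                        "--force", TARGET], check=True)

    def keep_n_full(n: int) -> None:
        # keep the newest n full chains (each full + its incrementals)
        subprocess.run(["duplicity", "remove-all-but-n-full", str(n),
                        "--force", TARGET], check=True)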

But they don't imply that the length of the current backup chain has an
upper bound, which is what I'd like to achieve.

True, re-storing files which haven't changed is pointless. So "sliding
full backup" is not necessarily the best term. But possibly, when the
backup chain for a file gets longer than N hops (or the summed size of
the incrementals approaches N% of the original file size, whichever
hits first), instead of storing yet another incremental, it might make
sense to store the file in full?

Yes, the last paragraph by itself implies that the storage becomes less
space efficient on disk, but more efficient to transfer for a restore,
and more resilient.

And, possibly, consolidate "old volumes" if their efficiency - i.e.,
the share of their data that has not been supplanted since - drops
below M%.
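
A rough sketch of the per-file decision I have in mind (all names here
are hypothetical; nothing like this exists in duplicity today):

    # Hypothetical "sliding full" policy: promote a file's next backup
    # to a full copy when its chain gets too long or too big.
    from dataclasses import dataclass

    @dataclass
    class FileChain:
        hops: int           # incrementals since the last full copy
        delta_bytes: int    # total size of those incrementals
        full_bytes: int     # file size as of the last full copy

    def should_store_full(chain: FileChain,
                          max_hops: int = 10,       # the "N hops" knob
                          max_delta_pct: int = 50   # the "N%" knob
                          ) -> bool:
        # whichever threshold hits first wins
        if chain.hops >= max_hops:
            return True
        if chain.full_bytes > 0 and \
           100 * chain.delta_bytes >= max_delta_pct * chain.full_bytes:
            return True
        return False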

How about that? Does that make more sense?

> With a D2D backup you're offloading the failure issue onto a lower
> level and the difference between accessing one file or 10 files is
> pretty small - after all, you have the index to tell you which one(s)
> will be used to construct it.

This is perfectly true when storage is cheap in both dimensions,
volume and bandwidth/latency, which doesn't always hold true.

> I guess my question to you is, what is the problem you hope this will
> solve?

I hope this mail helps explain my direction better. And yes, I realize
my direction has slightly changed, as I've begun to better understand
the problem ;-)

(What I think we are seeing here is a problem analogous to the "log
sweeper" in log-structured filesystems.)


Regards,
    Lars

-- 
Teamlead Kernel, SuSE Labs, Research and Development
SUSE LINUX Products GmbH, GF: Markus Rex, HRB 16746 (AG Nürnberg)
"Experience is the name everyone gives to their mistakes." -- Oscar Wilde




