Re: [Duplicity-talk] Multilevel backup


From: zga9uhnq4g
Subject: Re: [Duplicity-talk] Multilevel backup
Date: Fri, 25 Feb 2022 17:00:42 -0800
User-agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:91.0) Gecko/20100101 Thunderbird/91.6.1

I'm not an expert on rdiff-backup either, but let me try to describe my understanding of what it does.  While rdiff-backup can operate in a bandwidth-efficient manner over a pipe (e.g. an ssh tunnel) to back up to or from a remote location, let me use the simplest case, described here, of backing up one local directory "foo" to another local directory "bar".

After running "rdiff-backup foo bar" at time T0, bar will end up as a copy of the contents of foo at time T0, but will also contain the extra directory "rdiff-backup-data" for storing metadata.  Next, after making some changes in the directory foo and running the same command at time T1, bar will end up as a copy of the contents of foo at time T1, and the rdiff-backup-data directory will contain "reverse diffs" to recreate the contents of foo at time T0 from the contents of foo at time T1.  If we continue making changes in the directory foo, and run the same rdiff-backup command at times T2 and T3, we end up with bar being a copy of the contents of foo at time T3 and the rdiff-backup-data directory containing 3 sets of reverse diffs: one to recreate the contents of foo at time T2 from the contents of foo at time T3, one to recreate the contents of foo at time T1 from the contents of foo at time T2, and one to recreate the contents of foo at time T0 from the contents of foo at time T1.

As you can see, the bar directory is always a mirror of the foo directory at the time of the latest backup.  That means that for this local backup we can use the "cp" command to restore from the latest snapshot, as described here.  Of course we can also use the rdiff-backup command for restoring too (and we have to do that to restore to or from a remote location, or to restore older snapshots), and it can be as efficient as cp when restoring content from the latest snapshot.

When, sometime after T3, we ask rdiff-backup to restore some content as it was at time T0 by specifying "--restore-as-of T0", rdiff-backup will start with the latest (T3) snapshot, convert it to the T2 snapshot using the reverse diffs from T3 to T2, then convert that to the T1 snapshot using the reverse diffs from T2 to T1, then convert that to the T0 snapshot using the reverse diffs from T1 to T0.
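
The restore walk described above can be sketched with a toy model. This is only an illustration of the reverse-diff idea, assuming a snapshot is a plain dict of file name to content; the names ReverseDiffRepo, backup and restore are made up for the example, and the real rdiff-backup stores its increments on disk in its own format rather than like this.

```python
from typing import Dict, List, Optional

Snapshot = Dict[str, str]               # file name -> file content
ReverseDiff = Dict[str, Optional[str]]  # file name -> older content (None: file did not exist)


def make_reverse_diff(newer: Snapshot, older: Snapshot) -> ReverseDiff:
    """Record what must change in `newer` to get back to `older`."""
    diff: ReverseDiff = {}
    for name in set(newer) | set(older):
        if newer.get(name) != older.get(name):
            diff[name] = older.get(name)
    return diff


def apply_reverse_diff(newer: Snapshot, diff: ReverseDiff) -> Snapshot:
    """Turn a newer snapshot into the older one the diff was made for."""
    older = dict(newer)
    for name, content in diff.items():
        if content is None:
            older.pop(name, None)   # file did not exist in the older snapshot
        else:
            older[name] = content
    return older


class ReverseDiffRepo:
    """Hypothetical stand-in for the 'bar' directory: a mirror plus reverse diffs."""

    def __init__(self) -> None:
        self.mirror: Optional[Snapshot] = None
        # reverse_diffs[i] converts snapshot i+1 back into snapshot i
        self.reverse_diffs: List[ReverseDiff] = []

    def backup(self, source: Snapshot) -> None:
        if self.mirror is not None:
            self.reverse_diffs.append(make_reverse_diff(source, self.mirror))
        self.mirror = dict(source)  # the mirror is always the latest snapshot

    def restore(self, index: int) -> Snapshot:
        latest = len(self.reverse_diffs)
        if self.mirror is None or not 0 <= index <= latest:
            raise IndexError("no such snapshot")
        state = dict(self.mirror)
        # Walk backwards from the latest snapshot: T3 -> T2 -> T1 -> T0
        for i in range(latest - 1, index - 1, -1):
            state = apply_reverse_diff(state, self.reverse_diffs[i])
        return state
```

In this model, restoring the latest snapshot applies no diffs at all (it is just a copy of the mirror), while restoring T0 after four backups applies three.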

The benefits I see to the rdiff-backup approach are:
  1. It combines the best features of a mirror and an incremental backup, and it supports unlimited incremental backups without the need for space-consuming regular full backups.
  2. Manipulating the latest snapshot is the most efficient, while manipulating older snapshots gets more costly the farther back in time the snapshot is.
  3. (most interesting for this thread) manipulating the contents of a snapshot at time T requires access to the snapshots newer than time T, but never requires access to snapshots older than time T.  This means that I can always delete the oldest snapshots without affecting newer snapshots because old snapshots depend on newer snapshots, but new snapshots never depend on older snapshots (which seems to be the opposite of duplicity).
  4. It is bandwidth efficient because, like rsync, only differences are transmitted.
The only downside of rdiff-backup (for me), which is also the big benefit of duplicity, is that it doesn't support encryption, which (for me) is a requirement for off-site backups to cloud storage.
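
Point 3 above can be made concrete with a small sketch. It assumes snapshots are numbered 0 (oldest) to latest, that a reverse-diff restore of snapshot k needs snapshots k through latest, and that a single forward chain (one full backup followed by incrementals, as in the duplicity chains this thread is about) needs snapshots 0 through k. The function names are hypothetical.

```python
from typing import List, Set


def needed_reverse(k: int, latest: int) -> Set[int]:
    """Reverse diffs: restoring snapshot k needs k and everything newer."""
    return set(range(k, latest + 1))


def needed_forward(k: int) -> Set[int]:
    """Forward chain: restoring snapshot k needs the full (0) and everything up to k."""
    return set(range(0, k + 1))


def restorable_after_pruning(scheme: str, latest: int, pruned: int) -> List[int]:
    """Which remaining snapshots can still be restored after deleting the `pruned` oldest ones?"""
    remaining = set(range(pruned, latest + 1))
    result = []
    for k in sorted(remaining):
        needed = needed_reverse(k, latest) if scheme == "reverse" else needed_forward(k)
        if needed <= remaining:   # every dependency is still present
            result.append(k)
    return result
```

With four snapshots (latest = 3) and the two oldest deleted, the reverse scheme still restores snapshots 2 and 3, while the forward chain restores nothing because the full backup is gone.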

I make no guarantees that my description of what rdiff-backup does is accurate (I haven't read the code), but hopefully this helps.

    - Peter

On 2022-02-23 03:32, edgar.soldin--- via Duplicity-talk duplicity-talk-at-nongnu.org |duplicity-talk| wrote:
hey Peter,

i tried to understand what rdiff-backup does by reading, but do not seem to grasp it. being short on time to try it out, could you maybe explain in more detail?

thanks ..ede/duply.net

On 23.02.2022 06:36, zga9uhnq4g--- via Duplicity-talk wrote:
Another way to avoid long backup dependency chains might be to do what rdiff-backup <https://rdiff-backup.net/> does.  It uses reverse diffs so the current backup is just a copy of the source directory, but older increments are "reverse diffs". This makes it always possible to delete the oldest N backups because nothing depends on them.

I've always wondered why duplicity doesn't work that way (encrypted of course), but I assume there is a good reason (which I would love to hear).  I wonder if it could be an option.

     Peter

On 2022-02-22 03:32, edgar.soldin--- via Duplicity-talk duplicity-talk-at-nongnu.org |duplicity-talk| wrote:
hi Hakan,

comments inline

On 21.02.2022 20:39, Håkan T Johansson via Duplicity-talk wrote:
Hello,

I am wondering if it is possible to do multilevel backup with duplicity?
currently no.

(In order to avoid the long backup chains, without having to make frequent new full backups.)

If not, I suppose what would be needed would be an option to base some incremental backups on not the latest incremental but an earlier one (or the full backup).
the easiest implementation would probably be a way to base a new chain on a previous full. maybe via dummy full volume/metadata file, that just points at the old one.

What I'm contemplating is to e.g. let the first backup every month be incremental based on the previous month's first backup.  And the same for weeks and days, with the first within each category either based on the previous such, or the latest of any category higher, if that is later.

Together with e.g. remove-older-than 14D, this could give a chain-tree like this (each backup based on the previous line, unless there is an arrow):

full
monthly
monthly
monthly
monthly
^ weekly
| weekly
| weekly
| weekly
| ^ daily
| | daily
| | daily
| | daily
| | daily
| | daily
| weekly
|   daily
|   daily
monthly
   ^ daily
   | daily
   | daily
   weekly
     daily
     daily
     daily
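
The base-selection rule behind this chain-tree can be sketched as follows. This is only an illustration of the proposal, not an existing duplicity option; the names RANK, choose_bases and chain_length are hypothetical, and the rule is read as "base each backup on the most recent earlier backup of the same or a higher category".

```python
from typing import List, Optional

# Lower rank = higher category. Assumed ordering: full > monthly > weekly > daily.
RANK = {"full": 0, "monthly": 1, "weekly": 2, "daily": 3}


def choose_bases(kinds: List[str]) -> List[Optional[int]]:
    """For each backup (oldest first), return the index of the backup it is based on."""
    bases: List[Optional[int]] = []
    for i, kind in enumerate(kinds):
        base = None
        if kind != "full":                      # a full backup has no base
            for j in range(i - 1, -1, -1):      # search backwards in time
                if RANK[kinds[j]] <= RANK[kind]:
                    base = j                    # latest backup of same or higher category
                    break
        bases.append(base)
    return bases


def chain_length(bases: List[Optional[int]], i: int) -> int:
    """Number of hops from backup i back to its full backup."""
    hops = 0
    current = i
    while bases[current] is not None:
        current = bases[current]
        hops += 1
    return hops


kinds = ["full", "monthly", "monthly", "weekly", "weekly", "daily", "daily",
         "weekly", "daily", "monthly", "daily", "weekly", "daily"]
bases = choose_bases(kinds)
```

For this sequence the last daily is 5 hops away from the full backup, where a plain linear chain of the same 13 backups would need 12.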

much too complex for my taste. apart from that, good luck finding someone to implement it. feel free to offer to contribute it though :) ..regards ede/duply.net

_______________________________________________
Duplicity-talk mailing list
Duplicity-talk@nongnu.org
https://lists.nongnu.org/mailman/listinfo/duplicity-talk

