
Re: [rdiff-backup-users] Activity


From: address@hidden
Subject: Re: [rdiff-backup-users] Activity
Date: Mon, 01 Aug 2011 14:02:10 +0100
User-agent: Mozilla/5.0 (Windows NT 5.1; rv:5.0) Gecko/20110624 Thunderbird/5.0

David,

On 01/08/2011 10:14, address@hidden wrote:
> Felix,
>
>> I'm wondering if there is anyone developing on rdiff-backup at the moment.
>
> As for me, you are asking the crucial question concerning rdiff-backup.

> There has not been a lot of development activity on rdiff-backup recently,
> and in addition there are some fatal bugs in rdiff-backup, causing corrupted
> repositories, especially when backing up to Windows-hosted targets.
>
> Some time ago I mailed the current maintainer -- I guess it was Andrew
> Ferguson -- about the maintenance state of rdiff-backup, but I got no
> answer.

Yes, I think rdiff-backup is currently unmaintained. Anyone who wants to take it forward (and has the skills to do so, which unfortunately I do not) might need to make a fork (which in due course could become rdiff-backup2?).

>> I'm asking because I'm thinking of using it on a larger scale and want to
>> know if there is some activity going on.
>
> I would also like to use it on a larger scale, as it is - to the best of my
> knowledge - the only free and flexible 4D backup solution. However, I have
> found that there are currently some caveats that I want to state here
> (and propose some thoughts I had on what would make rdiff-backup more
> useful):
>  * The repository format. When recovering older files, rdiff-backup
>    needs every single reverse delta along the way, which is not only slow
>    but also extremely fragile (if just one of those files is corrupted,
>    recovery will fail). A solution might be some additional,
>    larger-granularity reverse deltas that help speed up recovery as well
>    as preserve the integrity of "most of the timeline" even if some
>    deltas are corrupted.

Although using multiple delta files is slow if you are regressing back through many previous backup runs (which is rare in practice, though of course very valuable when you need it), I don't see how creating larger-granularity reverse deltas would really make the archive more robust; it would just make it bigger. Under normal circumstances I would expect a reverse-delta file covering 10 backups to be not much less than 10x the size of each separate reverse-delta file. (It is different if files have been backed up accidentally once and then removed from the archive; accidents like this can certainly bloat an rdiff-backup repository.) And although a single damaged reverse-delta file will 'break' recovery, that only applies to backups earlier than the date of the damaged file.
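
To make the slowness/fragility point concrete, here is a minimal Python sketch of the chain-walking that any reverse-delta recovery has to do (the delta representation and names are invented for illustration; this is not rdiff-backup's actual code):

    # Toy model: a repository is the current mirror plus a list of reverse
    # deltas, newest first.  Each delta turns version N into version N-1.
    # rdiff-backup's real increments are rdiff/librsync binary deltas; this
    # only shows the recovery chain and its failure mode.

    def restore(mirror, reverse_deltas, steps_back):
        """Recover the contents as they were `steps_back` backups ago."""
        data = mirror
        for i, delta in enumerate(reverse_deltas[:steps_back]):
            if delta is None:   # stand-in for a corrupted/unreadable increment
                raise IOError("delta %d is unreadable; this version and all "
                              "older ones cannot be recovered" % i)
            data = delta(data)  # apply one reverse delta: version N -> N-1
        return data

    # Example: three backups of a file; the second-newest delta is damaged.
    mirror = "v3"
    deltas = [lambda d: "v2", None, lambda d: "v0"]  # newest first
    print(restore(mirror, deltas, 1))  # "v2" - recent history is still fine
    # restore(mirror, deltas, 2)       # raises: v1 and everything older is lost

Recovering a version k backups back means applying k deltas in order, so restore time grows with history depth, and one bad increment cuts off that date and everything before it, while later dates stay recoverable - which matches what I said above.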

>  * Missing operators on an existing repository. For use of rdiff-backup
>    on a larger scale it should be possible to, e.g.,
>      - merge time steps
>      - delete time steps and correct deltas accordingly
>      - remove subtrees (sometimes one backs up large data sets by accident)
>      - and more.

Yes, these would be helpful, especially to correct backup mistakes, which can permanently bloat a repository (the only related operation available today is sketched below).
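
For reference, the only pruning operator rdiff-backup offers today is --remove-older-than, which can only drop the oldest increments (it cannot remove an arbitrary time step or a subtree). A rough sketch of wrapping it from Python, with a made-up repository path:

    import subprocess

    # Example path only.  --force is required when more than one increment
    # would be removed in a single run.
    REPO = "/srv/backups/myhost"

    subprocess.check_call([
        "rdiff-backup",
        "--remove-older-than", "1Y",  # time specs like 30D or 10B (sessions) also work
        "--force",
        REPO,
    ])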

>  * Some bugs, especially around operating system independence. For
>    example, even though the issue was investigated at some point, it is
>    still difficult to use Windows machines as a target due to the
>    "write only attribute on folders" problem. Multiple users report
>    mismatching hashes, and so on.

Yes, the best advice regarding a Windows target seems to be: don't. I think you can reliably use rdiff-backup.exe to back up Windows data to a Linux target, though.

>  * Maybe a dedicated network protocol would be nice (inspired by rsync),
>    but I think this is less important.

and I would add:

 * The ability to run a thorough verification of an rdiff-backup archive.
   The current verification process is flawed, as has been discussed in
   earlier threads here. The best strategy at the moment is to run a
   verification for a date at or earlier than the earliest backup run
   date, and then to run one or two more verifications for dates between
   the earliest date and the current date (see the sketch after this
   list). Although this provides 'high confidence' in the integrity of
   the overall archive, it does not, at least from a theoretical point
   of view, guarantee that the full history of all files, whether
   currently present or deleted, can be recovered. The only way to get
   that at present is to run a separate verification for every previous
   backup run, which is not realistic for a long-standing repository.
 * A switch to enable 'forced' regression of an archive. At present
   rdiff-backup will only regress an archive that it considers to be
   broken. (However, you can work around this limitation.)
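
As a concrete illustration of the 'high confidence' strategy above, one could script something like this (the repository path and dates are made-up examples; --verify-at-time checks the stored hashes for the files as they existed at the given time):

    import subprocess

    # Example path and dates only.
    REPO = "/srv/backups/myhost"

    # One date at or before the earliest backup, a couple of intermediate
    # dates, and "now".  Good - but not complete - coverage, as noted above.
    for when in ["2009-01-01", "2010-06-01", "2011-03-01", "now"]:
        subprocess.check_call(["rdiff-backup", "--verify-at-time", when, REPO])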


> Overall, I am unsure whether it is more appropriate
>    - to learn from the experience the great rdiff-backup project gave us
>      and use its base operators to rewrite a whole new thing with the
>      above issues fixed (especially with a less fragile repository), or
>    - to continue fixing bugs in small steps in a project that seems
>      unmaintained (unfortunately, I lack the professional-grade Python
>      skills to do it right).

> I hope to start a discussion here on these thoughts -- please contribute :)

There was a discussion here a while ago, and there was a strong view that the existing project should be fixed rather than a new one started, I suppose because rdiff-backup as it stands is 99.5% perfect and any new project, even if it fixed the 0.5%, would be likely to introduce new bugs and failings. But in either case it needs someone to take on the responsibility and the workload. I think Daniel Miller began some work on a replacement for rdiff-backup, but I don't know where his project stands.

AFAIK the only other open source project like rdiff-backup is duplicity. It has slightly different objectives, uses forward deltas, and has different maintainers; maybe it is more actively maintained? But I value the reverse-diff approach of rdiff-backup because it means the most recent data is the most reliable and fastest to retrieve, and you can continue to build up data history (for years even) without having to start over at regular intervals. I would feel nervous if I had a 3-year backup history but needed to use an original dataset and then 1000 daily forward-diff files in order to get the latest backup of a file (which is usually what you need). With rdiff-backup, if you do start to run out of space, you can easily delete the older data without endangering more recent backups.

Two other possibilities (neither of which I have tried) are:

 * use rsync (or scripts based on it such as rsnapshot) but store the
   backup datasets on a deduplication file system such as lessfs.
 * put the filesystem on top of LVM and just take and keep regular LVM
   snapshots; these can then be the backups (see the sketch below).
   Recent Linux kernels allow you to revert a volume to an earlier
   snapshot if required. I don't think this was an intended use of LVM
   snapshots, but it should work and be quick'n'easy too, though I don't
   think it could or should be used over a prolonged period because of
   space issues (and perhaps speed). Of course the backups remain in the
   same volume as the original data; they can be copied to another
   location, but then they will each take up the full space of the data.
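
For the second idea, a rough sketch of the LVM commands involved, driven from Python (the volume group, volume name and snapshot size are made-up examples; this is plain LVM usage, nothing rdiff-backup-specific):

    import subprocess
    from datetime import date

    # Example names only: volume group "vg0", logical volume "data".
    ORIGIN = "/dev/vg0/data"
    SNAP = "backup-" + date.today().isoformat()

    # Take a snapshot; --size is how much change it can absorb before filling.
    subprocess.check_call(["lvcreate", "--snapshot", "--size", "5G",
                           "--name", SNAP, ORIGIN])

    # Reverting to a snapshot later is done by merging it back into the
    # origin (recent kernel/LVM needed; the merge takes effect when the
    # origin is not in use, or on next activation):
    # subprocess.check_call(["lvconvert", "--merge", "/dev/vg0/" + SNAP])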


Dominic
http://www.timedicer.co.uk




