[rdiff-backup-users] How much metadata to store


From: Ben Escoto
Subject: [rdiff-backup-users] How much metadata to store
Date: Mon, 02 Dec 2002 13:47:45 -0800

Originally I had envisaged only saving the metadata which could not be
stored on the mirror system.  But Bud and Dave have noted some
advantages to storing metadata for all mirror files:

DS> ... eliminates stat/open/close of a billion files

BB> A metadata approach seems to open the way for compressed
BB> mirrors... 

So I wonder how desirable these features would be to people, or if
anyone can think of other things to do with more metadata.

    About the stat/open/close billion files issue, note that
rdiff-backup doesn't actually open files on the mirror side if their
mtime, size, etc. all match.  But saving metadata could save
many stat operations.  However, I wonder how much of a difference this
would make in practice.  For me rdiff-backup seems mainly CPU bound.
Is anyone suffering because stat() operations on the mirror side are
taking too long?
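
    To make the trade-off concrete, here is a rough sketch of the two
ways of deciding a file is unchanged.  It is not rdiff-backup's actual
code: FileMeta, quick_signature, unchanged_by_stat and
unchanged_by_metadata are hypothetical names, and the stored metadata
is assumed to be a plain dict keyed by relative path.

```python
import os
from typing import NamedTuple


class FileMeta(NamedTuple):
    """The cheap attributes compared before any file is opened (assumed set)."""
    mtime: int
    size: int


def quick_signature(path):
    """One stat() call; the file itself is never opened."""
    st = os.stat(path)
    return FileMeta(int(st.st_mtime), st.st_size)


def unchanged_by_stat(source_path, mirror_path):
    """Current style: stat the source and stat the mirror copy."""
    return quick_signature(source_path) == quick_signature(mirror_path)


def unchanged_by_metadata(source_path, rel_path, stored):
    """Metadata style: stat only the source, look the mirror side up in
    a previously saved record instead of touching the mirror file."""
    recorded = stored.get(rel_path)
    return recorded is not None and quick_signature(source_path) == recorded
```

The second function saves exactly one stat() per file on the mirror
side, which is the saving in question; neither version opens a file
when the attributes agree.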

    About the compressed mirror idea, this doesn't necessarily seem to
need extra metadata.  If we assumed a file whose mtime was the same
hasn't changed, we could do this without extra metadata, just by
setting the mtime of the compressed mirror copy to the original file's
mtime.  But saving other information does let us compare file size.
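
    A minimal sketch of that mtime-only scheme, assuming gzip as the
compressor (the choice of gzip and the names write_compressed_mirror
and looks_unchanged are illustrative, not anything rdiff-backup does
today):

```python
import gzip
import os
import shutil


def write_compressed_mirror(source_path, mirror_gz_path):
    """Compress the source into the mirror, then copy the source's mtime
    onto the compressed copy so it can be compared later."""
    with open(source_path, "rb") as src, gzip.open(mirror_gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    st = os.stat(source_path)
    os.utime(mirror_gz_path, ns=(st.st_atime_ns, st.st_mtime_ns))


def looks_unchanged(source_path, mirror_gz_path):
    """Compare mtimes at one-second resolution.  Size cannot be used here:
    the mirror's size is the compressed size, not the original's."""
    src_sec = os.stat(source_path).st_mtime_ns // 10**9
    mir_sec = os.stat(mirror_gz_path).st_mtime_ns // 10**9
    return src_sec == mir_sec
```

Note that the size check is lost in this version, since the size on
the mirror is the compressed size.  Getting it back is what would
require recording the original size somewhere.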

    Also when we start compressing the mirror files, it seems to take
us away from the whole mirror concept.  Why not then just use
something like duplicity, which already compresses everything?
Because it writes 5MB volumes instead of storing each file
separately, it doesn't use a minimum of 16k or whatever per file,
even if the file is, say, 3 bytes.  Also a compressed file system is
another way to get a compressed mirror.  At least one rdiff-backup
user is already doing things this way.
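
    The per-file minimum can be put in numbers with a small
calculation.  This is only back-of-the-envelope arithmetic: the 16 KiB
allocation unit is the "16k or whatever" figure above, not a measured
value, the model assumes every file occupies at least one unit, and
allocated_size and mirror_overhead are made-up helper names.

```python
def allocated_size(file_size, block_size):
    """Bytes a file occupies if space is handed out in whole blocks,
    assuming every file takes at least one block."""
    blocks = -(-file_size // block_size)  # ceiling division
    return max(blocks, 1) * block_size


def mirror_overhead(file_sizes, block_size):
    """Return (bytes of real data, bytes allocated as separate files)."""
    data = sum(file_sizes)
    allocated = sum(allocated_size(size, block_size) for size in file_sizes)
    return data, allocated


BLOCK = 16 * 1024  # assumed allocation unit, taken from the text
data, allocated = mirror_overhead([3] * 1_000_000, BLOCK)
```

Under these assumptions a million 3-byte files hold about 3 MB of data
but occupy about 16 GB as separate files, whereas packing them into
volumes costs roughly the data size plus whatever the archive format
adds.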

    So I'd like to get some idea how desirable these features are
before adding extra complexity and corruption risk.


-- 
Ben Escoto
