rdiff-backup-users

Re: [rdiff-backup-users] Finding/designing a tar replacement


From: Randall Nortman
Subject: Re: [rdiff-backup-users] Finding/designing a tar replacement
Date: Thu, 25 Sep 2003 14:58:26 -0500
User-agent: Mutt/1.3.28i

On Thu, Sep 25, 2003 at 11:24:20AM -0700, Ben Escoto wrote:
> On Thu, 25 Sep 2003 11:39:00 -0500
> Randall Nortman <address@hidden> wrote:
[...]
> I tried looking for some information on loopback filesystems on the
> web, but nothing was at the right level of abstraction.  Do you know
> if this is what is basically happening:  when a certain block is
> requested, losetup retrieves that block from the file, decrypts it,
> and then hands it over.
> 
> There is no trouble finding the block because encryption algorithms
> don't do compression themselves, and the block lengths match up 1-1.
> So the 5th encrypted block of 32kB is also the 5th plaintext block of
> 32kB.
> 
> However, once you add compression into the mix, the blocks no longer
> match up 1-1, and you can no longer use a simple mapping like losetup.
> Of course, you could write a compressed filesystem into the loopback
> file, but it seems to me that encryption and compression should happen
> at the same level (at the block level?).
> 
> Please correct any of the above, it is mostly speculation.
[...]

You are correct on all points.  No compression is done by the loopback
driver, only encryption, so block sizes do match one-to-one.  You
could, however, compress each file (using bzip2 or gzip) as you write
it; that would compress the file data, but not the filesystem
metadata.
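
To make the block-mapping point concrete, here is a minimal sketch in
Python.  It is only an illustration, not how losetup is implemented:
the function names are made up, zlib stands in for whatever compressor
you would actually pick, and encryption is left out because it does not
change block lengths.  With fixed-size blocks the offset is plain
arithmetic; once each block is compressed separately, you need an
explicit offset table to find block n.

```python
import zlib

BLOCK_SIZE = 32 * 1024  # 32 kB blocks, as in the quoted example


def fixed_block_offset(n, block_size=BLOCK_SIZE):
    """Offset of block n when stored blocks match plaintext blocks 1-1."""
    return n * block_size


def compress_blocks(data, block_size=BLOCK_SIZE):
    """Compress each block independently and return (blob, offsets).

    offsets[i] is where compressed block i starts in blob, and
    offsets[-1] is the total length, so block i occupies
    blob[offsets[i]:offsets[i + 1]].
    """
    out = bytearray()
    offsets = [0]
    for start in range(0, len(data), block_size):
        out += zlib.compress(data[start:start + block_size])
        offsets.append(len(out))
    return bytes(out), offsets


def read_block(blob, offsets, n):
    """Fetch and decompress block n by looking it up in the offset table."""
    return zlib.decompress(blob[offsets[n]:offsets[n + 1]])
```

Counting from zero, the "5th block" is n = 4: fixed_block_offset(4)
gives its position in the uncompressed case, while in the compressed
case only the offsets table can tell you where it starts.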


> > Instead of actually using this method, you might consider simply
> > being inspired by it, and make a filesystem-like archive format
> > which is written directly to a file (without need of the loopback
> > driver) with block-level encryption and automatic sizing of the
> > file.  (Sounds like a dissertation topic to me... Hmmm...)
> 
> Hopefully it won't be that complicated.  The format of tar and related
> archives seems to be very simple, so coming up with a better format
> shouldn't be that difficult. (?)  As for the filesystem part, if I
> don't end up using an existing file system, I think it will be enough
> that the format could in principle be used as a file system (someone
> else can write the kernel driver).

I think you're on the right track there.  Filesystems, as others have
pointed out, are generally designed for fast, read/write, random
access to a fixed-size block device.  If I understand your needs, you
need fast sequential write, potentially slow random write/update, and
fast random read to a variable-size byte stream, with metadata on each
object and compression/encryption of the whole thing.  This changes
the design parameters quite a bit, so traditional filesystem
structures will never fit the needs perfectly.  Better to go with just
an enhanced tar-like format, along the lines of your existing
description (compressed/encrypted objects strung together, index at
the end, metadata either in the index or inline with the objects).
Keep it simple but flexible.
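
A rough sketch of that layout, in Python, might look like the
following.  Everything here is an assumption made for illustration:
the function names, the JSON index, the 8-byte trailer that records
where the index starts, and the use of zlib.  Encryption is omitted,
and per-object metadata would simply be extra fields in each index
entry.

```python
import io
import json
import struct
import zlib

# Hypothetical trailer: 8-byte little-endian offset of the index.
TRAILER = struct.Struct("<Q")


def write_archive(out, objects):
    """Write each (name, data) pair compressed, then the index, then a trailer."""
    index = {}
    for name, data in objects:
        blob = zlib.compress(data)
        index[name] = {"offset": out.tell(), "length": len(blob)}
        out.write(blob)  # sequential write: objects are just strung together
    index_offset = out.tell()
    out.write(zlib.compress(json.dumps(index).encode("utf-8")))
    out.write(TRAILER.pack(index_offset))


def read_index(f):
    """Locate the index through the trailer at the end of the file."""
    f.seek(-TRAILER.size, io.SEEK_END)
    trailer_pos = f.tell()
    (index_offset,) = TRAILER.unpack(f.read(TRAILER.size))
    f.seek(index_offset)
    raw = f.read(trailer_pos - index_offset)
    return json.loads(zlib.decompress(raw).decode("utf-8"))


def read_object(f, index, name):
    """Random read: seek straight to one object and decompress only that."""
    entry = index[name]
    f.seek(entry["offset"])
    return zlib.decompress(f.read(entry["length"]))
```

Writing is a single sequential pass, and reading any one object costs
two seeks (trailer and index, then the object itself).  Updating an
object in place is the slow case, since the index has to be rewritten.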

By providing a limited filesystem-like API to it (or a
full-fledged kernel filesystem driver), you don't gain anything in
terms of efficiency, but you gain a lot in terms of usability.  I
imagine that a directly mountable encrypted/compressed archive format
would stand a good chance of gradually replacing tarballs, at least
within the open source community.  (Make the encryption optional I'd
say, so that it can be used for public software distribution.)  I can
see many uses for it which go way beyond tarballs -- for example, it
would serve many purposes of loopback devices with the advantage of
compression and variable/dynamic sizing.
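
As a small illustration of what a "limited filesystem-like API" could
mean, here is a Python sketch of a directory listing computed from
nothing more than the flat list of object names in an archive index.
The function name and the convention of slash-separated paths are
assumptions for the example, not part of any format proposed so far.

```python
def listdir(names, directory=""):
    """Return the immediate children of `directory` among flat archive names.

    `names` is any iterable of slash-separated paths such as
    "src/lib/util.c"; directories are implied by the paths and do not
    need entries of their own.
    """
    prefix = directory.strip("/")
    prefix = prefix + "/" if prefix else ""
    children = set()
    for name in names:
        if name.startswith(prefix):
            rest = name[len(prefix):]
            if rest:
                # Keep only the first path component below `directory`.
                children.add(rest.split("/", 1)[0])
    return sorted(children)
```

A userspace tool could offer calls like this without any kernel
support; a real driver would need much more (permissions, links,
timestamps), but the archive format itself would not have to change.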

I do not believe you're going to find an existing format which will
fit your needs as well as something you design for the task, and this
seems a simple enough task (sans kernel filesystem driver, which would
be mildly annoying) that it's worth doing it right.  You might even
get me and some other folks to help out.  This is beginning to sound
like an interesting project. :)

Randall Nortman
