duplicity-talk

Re: [Duplicity-talk] Questions regarding implementation


From: Cláudio Gil
Subject: Re: [Duplicity-talk] Questions regarding implementation
Date: Tue, 26 Jul 2016 23:19:40 +0100

Hi,

My comments inline.

On Tuesday, 26 July 2016, Robert Hickman via Duplicity-talk <address@hidden> wrote:
Hi, I'm considering solutions for backing up my web server to S3. This
tool looks promising but I have some questions.

According to the website duplicity only backs up files which have
changed after it's first run. How does the program detect changes and
what local storage overhead is associated? It will be running on a VPS
with very little free storage.

I cannot describe the exact algorithm used, but there is a parameter that controls the size of file chunks, so I guess duplicity can check each chunk for modification.
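As a rough illustration of the idea (this is a simplified sketch, not duplicity's actual algorithm, which works with signature files rather than plain per-chunk hashes), change detection per chunk could look like this:

```python
import hashlib

def chunk_digests(data, chunk_size):
    """One digest per fixed-size chunk of the data.

    In a real backup tool the digests of the previous run would be
    stored in a signature file; here we just keep them in a list.
    """
    return [hashlib.sha256(data[i:i + chunk_size]).hexdigest()
            for i in range(0, len(data), chunk_size)]

def changed_chunks(old_digests, new_digests):
    """Indices of chunks that are new or differ from the stored digests."""
    changed = []
    for i, digest in enumerate(new_digests):
        if i >= len(old_digests) or old_digests[i] != digest:
            changed.append(i)
    return changed
```

Only the chunks reported as changed would need to go into the next incremental backup; the local overhead is the stored digests, which are much smaller than the data itself.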
 

Secondly how is data transfer on the S3 backed implemented?  If I am
backing up a sizable volume of data and the connection is lost will it
resume or re-upload everything? In the first case can a partial upload
still be used to recover the data that was uploaded?

Volumes are created locally and uploaded to S3. Only when the backup completes are the manifest and signature files uploaded. If a backup is interrupted, duplicity can resume from the last volume. Until a backup finishes, that is, until the manifest and signatures are uploaded, the already uploaded volumes are useless and cannot be used to recover data. There is probably a way of doing that, but by default duplicity will not even list an incomplete backup.


How, if at all, is the rsync algorithm used with S3? As far as I know
the service does not allow partial updates of objects.

There is no rsync to S3. Everything is done locally, based on the chain of signature and manifest files. Any backend is dumb in the sense that it only needs to support LIST of paths, and GET and PUT of whole files.
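The "dumb backend" contract can be sketched like this (a hypothetical illustration, not duplicity's actual backend API; an in-memory dict stands in for the remote bucket):

```python
class DumbBackend:
    """Minimal storage interface: list names, get and put whole files.

    No partial reads or writes are ever required, which is why an
    object store like S3 works even though it cannot update objects
    in place.
    """

    def __init__(self):
        self._store = {}  # filename -> bytes; stands in for a remote bucket

    def put(self, name, data):
        """Upload a whole file."""
        self._store[name] = bytes(data)

    def get(self, name):
        """Download a whole file."""
        return self._store[name]

    def list(self):
        """List the names of stored files."""
        return sorted(self._store)
```

Because the rsync-style delta computation happens locally against the downloaded signature chain, the backend never needs anything beyond these three operations.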
 

When downloading, is it required to download everything to recover a
single file, or can I pull only what I need? Partial downloads are
essential for my use case.

You can control the volume size; that is the trade-off you have to make. The default of 25MB means that, to recover a single file, at least one manifest, one signature file, and one 25MB volume need to be downloaded. That is the best case; in other cases you may need to download several signature files and volumes.

Note that signature files can be on the order of 1% (a rough guess) of the size of the backed-up data. When backing up many gigabytes, the signature file for the full backup can itself reach gigabytes.
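A back-of-envelope estimate of the best-case download for restoring one small file, under the assumptions above (the ~1% signature fraction is the rough guess from this thread, not a measured figure, and the manifest is treated as negligible):

```python
def min_restore_download_mb(backup_gb, volume_mb=25, sig_fraction=0.01):
    """Best case: one signature file plus one volume.

    sig_fraction is a rough ~1% guess at signature size relative to the
    backed-up data; real figures depend on the data and chunking.
    """
    signature_mb = backup_gb * 1024 * sig_fraction
    return signature_mb + volume_mb
```

For a 10GB full backup this already comes to roughly 127MB before the restored file's own volume data is counted more than once, which shows why the signature overhead matters for partial restores.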

Regards,
Cláudio

