[Duplicity-talk] Bug report: Asynchronous upload not working properly


From: Robin Munn
Subject: [Duplicity-talk] Bug report: Asynchronous upload not working properly
Date: Thu, 4 Jun 2009 15:22:59 -0500

Hi all,

I tried to submit this as a bug report in Savannah, but the Savannah
tracker requires that I join the duplicity project to submit a bug
report. That's not automatic, so I joined the mailing list instead to
post this bug report. (With my AD/HD, if I had to submit a request for
joining then wait for it to be processed, I'd forget about the bug
report by the time the request went through; this way you at least get
the report, and someone can add it to the tracker if it's considered
important).

The bug I observed is that the --asynchronous-upload option isn't
working the way I think it should on my system, where the limiting
factor is bandwidth. (I.e., it takes a lot longer to upload a 50
megabyte file through my ADSL connection than it does to prepare the
next file to be uploaded). I believe the intended behavior is:

* Volume 1 is prepared.
* Volume 1 finishes preparing, starts uploading.
* While volume 1 is uploaded, volume 2 is prepared.
* Volume 2 finishes preparing; volume 1 is still uploading, so
Duplicity waits for the upload to complete.
* Volume 1 finishes uploading; volume 2 immediately starts uploading.
* While volume 2 is uploaded, volume 3 is prepared.

Etc., etc., repeating the last three steps until all volumes are
complete. This means that the upload bandwidth is being used almost
constantly; by the time one volume has finished uploading, another
volume is ready and waiting in /tmp.
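
To make the expected overlap concrete, here is a minimal timeline sketch of the steps above. It is only a model of the behavior described in this report, not duplicity's actual code; the function names and the example timings (1 time unit to prepare a volume, 5 to upload it) are assumptions chosen for illustration.

```python
def intended_schedule(n_volumes, prep_time, upload_time):
    """Return one (prep_start, prep_end, up_start, up_end) tuple per volume.

    Assumed model: the next volume starts preparing as soon as the
    previous one starts uploading, and at most one upload runs at a time.
    """
    schedule = []
    prev_up_start = 0.0  # volume 1 may start preparing at t=0
    prev_up_end = 0.0
    for _ in range(n_volumes):
        prep_start = prev_up_start
        prep_end = prep_start + prep_time
        # Upload begins once the volume is ready AND the uplink is free.
        up_start = max(prep_end, prev_up_end)
        up_end = up_start + upload_time
        schedule.append((prep_start, prep_end, up_start, up_end))
        prev_up_start, prev_up_end = up_start, up_end
    return schedule


def upload_idle(schedule):
    """Total time the uplink sits idle between consecutive uploads."""
    return sum(nxt[2] - cur[3] for cur, nxt in zip(schedule, schedule[1:]))


if __name__ == "__main__":
    plan = intended_schedule(4, prep_time=1.0, upload_time=5.0)
    for i, (ps, pe, us, ue) in enumerate(plan, 1):
        print(f"volume {i}: prepare {ps:g}-{pe:g}, upload {us:g}-{ue:g}")
    print("uplink idle between uploads:", upload_idle(plan))
```

Under these assumed timings, every upload after the first starts the moment the previous one ends, so the uplink is never idle between volumes.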

However, that's not what's actually happening. I just launched a
duplicity backup using the following command (with a real username and
server name, of course):

    duplicity /home/username/projects \
        scp://address@hidden/backups/projects/ \
        --asynchronous-upload --verbosity 4 --volsize 50

I then opened a WinSCP view to watch the backup upload (since I don't
know which verbosity level would give me an "uploaded ### out of ###
bytes" display), while doing an "ls -l" of
/tmp/duplicity-xyzzyx-tempdir/ so I could watch the files being
created. And I noticed the pattern was:

* Volume 1 is prepared.
* Volume 1 finishes preparing, starts uploading.
* While volume 1 is uploaded, volume 2 is prepared.
* Volume 2 finishes preparing; volume 1 is still uploading, so
Duplicity waits for the upload to complete.
* Volume 1 finishes uploading; volume 2 immediately starts uploading.
* While volume 2 is uploaded, NOTHING is prepared.
* Volume 2 finishes uploading; now volume 3 is prepared. (Now NOTHING
is being uploaded while volume 3 is prepared).
* Volume 3 finishes preparing, starts uploading.
* While volume 3 is uploaded, volume 4 is prepared.
* Volume 4 finishes preparing; volume 3 is still uploading, so
Duplicity waits for the upload to complete.
* Volume 3 finishes uploading; volume 4 immediately starts uploading.
* While volume 4 is uploaded, NOTHING is prepared.
* Volume 4 finishes uploading; now volume 5 is prepared. (Now NOTHING
is being uploaded while volume 5 is prepared).

Etc., etc., etc., until all volumes are complete.
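
The observed pattern can be modeled the same way. This is a rough sketch of what I saw from the outside, not a reading of duplicity's source; the function name and the timings (1 time unit to prepare, 5 to upload) are made up for illustration. The only rule it encodes is that every odd-numbered volume does not start preparing until the previous upload has completely finished.

```python
def observed_schedule(n_volumes, prep_time, upload_time):
    """Return one (prep_start, prep_end, up_start, up_end) tuple per volume.

    Assumed model of the observed behavior: even-numbered volumes are
    prepared while the previous volume uploads, but odd-numbered volumes
    (3, 5, ...) only start preparing after the previous upload finishes.
    """
    schedule = []
    prev_up_start = 0.0
    prev_up_end = 0.0
    for index in range(n_volumes):
        if index % 2 == 0:
            # Volumes 1, 3, 5, ...: wait for the previous upload to end.
            prep_start = prev_up_end
        else:
            # Volumes 2, 4, 6, ...: overlap with the previous upload.
            prep_start = prev_up_start
        prep_end = prep_start + prep_time
        up_start = max(prep_end, prev_up_end)
        up_end = up_start + upload_time
        schedule.append((prep_start, prep_end, up_start, up_end))
        prev_up_start, prev_up_end = up_start, up_end
    return schedule


def idle_gaps(schedule):
    """Idle uplink time between each pair of consecutive uploads."""
    return [nxt[2] - cur[3] for cur, nxt in zip(schedule, schedule[1:])]


if __name__ == "__main__":
    plan = observed_schedule(6, prep_time=1.0, upload_time=5.0)
    for i, (ps, pe, us, ue) in enumerate(plan, 1):
        print(f"volume {i}: prepare {ps:g}-{pe:g}, upload {us:g}-{ue:g}")
    print("idle gaps between uploads:", idle_gaps(plan))
```

In this model the uplink goes idle for one full preparation time after every second volume, which matches the gaps I saw between "batches".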

This is not a major bug, but it does mean that the upload bandwidth
isn't being used as efficiently as it should be, which is what
--asynchronous-upload was meant for. As it currently stands, the only
effect of --asynchronous-upload is that uploads happen in "batches" of 2
volumes rather than "batches" of 1 volume; uploads still have to
wait while the next "batch" is prepared. (Or at least, until the first
volume of the next "batch" is prepared.)

I'm sure this is repeatable, though the problem won't show up if your
upload bandwidth is faster than the preparation of the next volume. If
you run this test on an internal gigabit network, where preparing the
next volume is the bottleneck rather than bandwidth being the
bottleneck, you probably won't notice any difference. But run a test
using an external server, so that your upload bandwidth is the
bottleneck, and you should observe the same pattern I did: volumes
being uploaded in "batches" of 2, with a gap in between for the next
batch to be prepared.

-- 
Robin Munn
address@hidden
GPG key 0x4543D577



