duplicity-talk



From: Kenneth Loafman
Subject: Re: [Duplicity-talk] Incremental backup of files with changed data but unchanged timestamp
Date: Mon, 4 Aug 2014 09:15:51 -0500

I'm going to punt on this for a bit.  Horribly busy today.  The place I chose was the simplest.  The outermost may involve setting flags to accomplish much the same thing.  I'm not sure at this point.  This thing is a pile of nested iterators.  At one point in debugging I was an even dozen deep into iteration and needed a scorecard to keep track of things.  Going to need a chunk of time to work through it again.

...Ken



On Sun, Aug 3, 2014 at 1:05 PM, <address@hidden> wrote:
On 03.08.2014 17:57, Kenneth Loafman wrote:
> ROPath.__eq__ is used any time two ROPath objects are compared for
> equality.  Any change there will mean that the files will compare unequal.
> The places I'd put code to enforce data comparison are around lines 323-326
> where it checks perms and times.  Even then, I'd qualify that with a check
> for self.isreg() so we limit the option to regular files only.
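
A minimal sketch of the idea described above: check permissions and mtime first, and only when a data-comparison option is on, compare contents, limited to regular files. This is not duplicity's ROPath code. The names paths_equal, same_bytes and compare_data are hypothetical, and the sketch compares two files on disk, whereas a real backup would compare against whatever was recorded at the previous run.

```python
import os
import stat


def same_bytes(path_a, path_b, chunk_size=65536):
    """Compare the contents of two files chunk by chunk."""
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            chunk_a = fa.read(chunk_size)
            chunk_b = fb.read(chunk_size)
            if chunk_a != chunk_b:
                return False
            if not chunk_a:  # both files ended at the same point
                return True


def paths_equal(path_a, path_b, compare_data=False):
    """Hypothetical equality check: file type, perms, mtime, optionally data."""
    st_a, st_b = os.lstat(path_a), os.lstat(path_b)
    if stat.S_IFMT(st_a.st_mode) != stat.S_IFMT(st_b.st_mode):
        return False
    if stat.S_IMODE(st_a.st_mode) != stat.S_IMODE(st_b.st_mode):
        return False
    if int(st_a.st_mtime) != int(st_b.st_mtime):
        return False
    # Assumed option: only regular files pay the cost of reading their data.
    if compare_data and stat.S_ISREG(st_a.st_mode):
        return same_bytes(path_a, path_b)
    return True
```

With compare_data left off, two different files with identical permissions and mtime compare equal, which is the case reported at the start of this thread.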


do you think it is wise to place the switch there? who knows how often file objects are compared. essentially we want _only_ the decision whether to back up a file to be modified.

>
> Yes, if the delta is created, the file has changed.

wouldn't it be the other way around with this mod? because we don't check attributes we would always create a delta, which in turn will create a backup that records every file as changed, with zero-sized deltas for all but the files that were really modified (those get proper deltas)?

..ede
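
One way to avoid recording every file as changed, sketched under loud assumptions: decide per file from a whole-file digest recorded at the previous backup, and create a delta only when the digest differs. Nothing in this thread confirms that duplicity stores such a digest. file_digest, select_changed and stored_digests are hypothetical names for illustration.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fp:
        for chunk in iter(lambda: fp.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def select_changed(paths, stored_digests):
    """Return the paths whose data differs from the recorded digest.

    stored_digests is an assumed mapping of path -> digest from the
    previous backup; a path missing from it counts as changed (new file).
    """
    changed = []
    for path in paths:
        if stored_digests.get(path) != file_digest(path):
            changed.append(path)
    return changed
```

Only the paths returned by select_changed would go on to the delta step, so unchanged files never produce zero-sized deltas. The price is the one already mentioned in the thread: every file has to be read in full on every run.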

>
> ...Ken
>
>
>
> On Sun, Aug 3, 2014 at 8:05 AM, <address@hidden> wrote:
>
>> Ken,
>>
>> can you as well point me to where the path.py __eq__() method (L325) is
>> called during an incremental backup? that would be the place, where i'd put
>> a --compare-data switch to enforce data comparison.
>>
>> but, even if we ignore mtime and _always_ create deltas, wouldn't the
>> current design (afaics) assume the file has changed, create a new signature
>> and save this to tar - for every file - always?
>> i didn't see a check for when delta is zero. instead duplicity seems to
>> assume that when a delta is created the file has to have changed.
>>
>> ..ede
>>
>> On 03.08.2014 14:31, Kenneth Loafman wrote:
>>> We don't need to do the librsync create delta, we just need to ignore the
>>> timestamp and duplicity will do that for us.
>>>
>>> duplicity.py, dup_time.py, path.py, and tarfile.py are the ones that
>>> actually reference mtime.  path.py would be the place to look for
>>> comparison, well before we call librsync.  Take a look in path.py, line 325
>>> for backup and line 372 for verify.  Replace with 'return 0' and you will
>>> always go through the rdiff process.  Very expensive.
>>>
>>> ...Ken
>>>
>>>
>>> On Sun, Aug 3, 2014 at 6:39 AM, <address@hidden> wrote:
>>>
>>>> hmm, some more searching didn't reveal any options for the librsync create
>>>> delta call. it simply seems to create signatures for the whole file only.
>>>>
>>>> http://librsync.sourcefrog.net/doc/librsync.html#processing-whole-files
>>>>
>>>> that suggests that the mtime is compared somewhere else, probably in
>>>>
>>>> http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/path.py
>>>> although i am absolutely clueless as to where in the code path this is
>>>> supposed to happen.
>>>>
>>>> @Ken, Mike: any (more) input?
>>>>
>>>> after all this (identical mtime) comes up from time to time on the list
>>>> e.g.
>>>>
>>>> https://lists.nongnu.org/archive/html/duplicity-talk/2013-07/msg00015.html
>>>> rsync allows enforcing checksum checking with '-c' as well, so people probably
>>>> will expect this from duplicity.
>>>>
>>>> ..ede
>>>>
>>>>
>>>> On 03.08.2014 13:05, Kenneth Loafman wrote:
>>>>> I've seen packages that have the timestamp reflect the version number, so
>>>>> he's probably right, it would be the packager doing the dirty trick.
>>>>>
>>>>> I'm fairly sure you are right that DeltaFile is the first place.  I could
>>>>> not find anything else.  Mod that and he should be good to go.  It will be
>>>>> a lot slower, so save the original for the next backup.
>>>>>
>>>>> ...Ken
>>>>>
>>>>>
>>>>>
>>>>> On Sun, Aug 3, 2014 at 5:11 AM, <address@hidden> wrote:
>>>>>
>>>>>> On 03.08.2014 02:03, Nate Eldredge wrote:
>>>>>>> I am using duplicity to make incremental backups of my system.  I have
>>>>>>> some files whose data has changed since the last backup, but whose mtime
>>>>>>> stayed the same.  It looks like `duplicity incremental' ignores files whose
>>>>>>> timestamp has not changed, so it doesn't back up the new data.  Is there a
>>>>>>> way to force duplicity to compare the file with a stored checksum, or even
>>>>>>> to use rdiff unconditionally?  I'd prefer not to have to do a new full
>>>>>>> backup.
>>>>>>>
>>>>>>> I'd consider hacking duplicity myself but it would be helpful to know
>>>>>>> where in the code I should look.
>>>>>>>
>>>>>>> (Before you accuse me of abusing timestamps: it isn't my fault!  I
>>>>>>> crossgraded this Ubuntu system from 32-bit to 64-bit.  It appears that some
>>>>>>> Ubuntu packages have the same timestamps on corresponding files in the
>>>>>>> 32-bit and 64-bit versions.  Presumably the packages were generated at the
>>>>>>> same time, and coincidentally those files were compiled during the same
>>>>>>> second.  So when I replaced the 32-bit package with the 64-bit package, I
>>>>>>> get a different file with the same timestamp.)
>>>>>>>
>>>>>>> I'm using duplicity 0.6.23 (latest from the PPA) on Ubuntu 14.04.
>>>>>>>
>>>>>>
>>>>>> i like the "(Before you accuse me of abusing timestamps: it isn't my fault!"
>>>>>> bit .. hehe as long as the time stamps were old enough you will get off
>>>>>> scot-free i guess..
>>>>>>
>>>>>> but seriously - this was obviously not on the horizon when duplicity
>>>>>> was developed. i searched a bit but couldn't find anything apart from the
>>>>>> librsync call 'librsync.DeltaFile(old_sigfp, newfp)' in
>>>>>>
>>>>>> http://bazaar.launchpad.net/~duplicity-team/duplicity/0.7-series/view/head:/duplicity/diffdir.py#L136
>>>>>>
>>>>>> i cannot seem to find a routine that checks time stamps before that.
>>>>>>
>>>>>> @Ken, Mike: can you hint where this magic happens?
>>>>>>
>>>>>> ..ede
>>>>>>
>>>>>
>>>>
>>>
>>
>

