qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC 0/4] mirror: implement incremental and bitmap modes


From: Fiona Ebner
Subject: Re: [RFC 0/4] mirror: implement incremental and bitmap modes
Date: Wed, 6 Mar 2024 14:44:37 +0100
User-agent: Mozilla Thunderbird

Am 29.02.24 um 13:47 schrieb Fiona Ebner:
> Am 29.02.24 um 12:48 schrieb Vladimir Sementsov-Ogievskiy:
>> On 29.02.24 13:11, Fiona Ebner wrote:
>>>
>>> The iotest creates a new target image for each incremental sync which
>>> only records the diff relative to the previous mirror and those diff
>>> images are later rebased onto each other to get the full picture.
>>>
>>> Thus, it can be that a previous mirror job (not just background process
>>> or previous write) already copied a cluster, and in particular, copied
>>> it to a different target!
>>
>> Aha understand.
>>
>> For simplicity, let's consider case, when source "cluster size" = "job
>> cluster size" = "bitmap granularity" = "target cluster size".
>>
>> Which types of clusters we should consider, when we want to handle guest
>> write?
>>
>> 1. Clusters, that should be copied by background process
>>
>> These are dirty clusters from user-given bitmap, or if we do a full-disk
>> mirror, all clusters, not yet copied by background process.
>>
>> For such clusters we simply ignore the unaligned write. We can even
>> ignore the aligned write too: less disturbing the guest by delays.
>>
> 
> Since do_sync_target_write() currently doesn't ignore aligned writes, I
> wouldn't change it. Of course they can count towards the "done_bitmap"
> you propose below.
> 
>> 2. Clusters, already copied by background process during this mirror job
>> and not dirtied by guest since this time.
>>
>> For such clusters we are safe to do unaligned write, as target cluster
>> must be allocated.
>>
> 
> Right.
> 
>> 3. Clusters, not marked initially by dirty bitmap.
>>
>> What to do with them? We can't do unaligned write. I see two variants:
>>
>> - do additional read from source, to fill the whole cluster, which seems
>> a bit too heavy
>>
> 
> Yes, I'd rather only do that as a last resort.
> 
>> - just mark the cluster as dirty for background job. So we behave like
>> in "background" mode. But why not? The maximum count of such "hacks" is
>> limited to number of "clear" clusters at start of mirror job, which
>> means that we don't seriously affect the convergence. Mirror is
>> guaranteed to converge anyway. And the whole sense of "write-blocking"
>> mode is to have a guaranteed convergence. What do you think?
>>
> 
> It could lead to a lot of flips between job->actively_synced == true and
> == false. AFAIU, currently, we only switch back from true to false when
> an error happens. While I don't see a concrete issue with it, at least
> it might be unexpected to users, so it better be documented.
> 
> I'll try going with this approach, thanks!
> 

These flips are actually a problem. When using live-migration with disk
mirroring, it's good that an actively synced image stays actively
synced. Otherwise, migration could finish at an inconvenient time and
try to inactivate the block device while mirror still got something to
do which would lead to an assertion failure [0].

The IO test added by this series is what uses the possibility to sync to
"diff images" which contain only the delta. In production, we are only
syncing to a previously mirrored target image. Non-aligned writes are
not an issue later like with a diff image. (Even if the initial
mirroring happened via ZFS replication outside of QEMU).

So copy-mode=write-blocking would work fine for our use case, but if I
go with the "mark clusters for unaligned writes dirty"-approach, it
would not work fine anymore.

Should I rather just document the limitation for the combination "target
is a diff image" and copy-mode=write-blocking?

I'd still add the check for the granularity and target cluster size.
While also only needed for diff images, it would allow using background
mode safely for those.

Best Regards,
Fiona

[0]:
https://lore.kernel.org/qemu-devel/1db7f571-cb7f-c293-04cc-cd856e060c3f@proxmox.com/




reply via email to

[Prev in Thread] Current Thread [Next in Thread]