|
From: | Vladimir Sementsov-Ogievskiy |
Subject: | Re: [Qemu-block] [PATCH 2/2] block/backup: fix fleecing scheme: use serialized writes |
Date: | Wed, 4 Jul 2018 15:31:09 +0300 |
User-agent: | Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 |
04.07.2018 13:39, Kevin Wolf wrote:
Am 03.07.2018 um 20:07 hat Vladimir Sementsov-Ogievskiy geschrieben:Fleecing scheme works as follows: we want a kind of temporary snapshot of active drive A. We create temporary image B, with B->backing = A. Then we start backup(sync=none) from A to B. From this point, B reads as point-in-time snapshot of A (A continues to be active drive, accepting guest IO). This scheme needs some additional synchronization between reads from B and backup COW operations, otherwise, the following situation is theoretically possible: (assume B is qcow2, client is NBD client, reading from B) 1. client starts reading and take qcow2 mutex in qcow2_co_preadv, and goes up to l2 table loading (assume cache miss) 2) guest write => backup COW => qcow2 write => try to take qcow2 mutex => waiting 3. l2 table loaded, we see that cluster is UNALLOCATED, go to "case QCOW2_CLUSTER_UNALLOCATED" and unlock mutex before bdrv_co_preadv(bs->backing, ...) 4) aha, mutex unlocked, backup COW continues, and we finally finish guest write and change cluster in our active disk A 5. actually, do bdrv_co_preadv(bs->backing, ...) and read _new updated_ data. To avoid this, let's make all COW writes serializing, to not intersect with reads from B. Signed-off-by: Vladimir Sementsov-Ogievskiy <address@hidden>I think this should work, even though we can still improve it a bit. First, I don't think we need to disable copy offloading as long as we add BDRV_REQ_SERIALISING support to bdrv_co_copy_range_internal(). The thing that's a bit strange there is that there is only a single flags parameter for both source and target. We may need to split that so that we can pass BDRV_REQ_NO_SERIALISING for the source and BDRV_REQ_SERIALISING for the target.
Ok
+ /* Detect image fleecing */ + fleecing = sync_mode == MIRROR_SYNC_MODE_NONE && target->backing->bs == bs;The other thing is that I'm not convinced of the condition here. This is the narrowest thinkable condition to recognise the exact fleecing setup we have in mind right now. However, it is not the condition to flag all setups that are affected by the same problem. I don't think sync_mode actually makes a meaningful difference here. If someone wants to create a full backup and already access it while the job is still in progress, they will still want it to be consistent. It also doesn't make a difference whether the fleecing overlay has the source directly attached as a backing node or whether there is a filter node between them. So while I'm not sure whether it really covers all interesting cases, this condition would already be a lot better: fleecing = bdrv_chain_contains(target, bs); Maybe rename the variable to serialise_target_writes because it's no longer restricted to the exact fleecing case.
Ok
+ if (fleecing) { + if (compress) { + error_setg(errp, "Image fleecing doesn't support compressed mode."); + return NULL; + } + }Why not fleecing && compress instead of a nested if?
hmm, really strange
And why is fleecing even at odds with compression?
I thought that compressed writes goes some other way, now I see: they don't. Will drop the restriction
Kevin
Thank you for the review, I'll resend soon. -- Best regards, Vladimir
[Prev in Thread] | Current Thread | [Next in Thread] |