[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already
From: |
Stefan Hajnoczi |
Subject: |
Re: [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero |
Date: |
Mon, 12 May 2025 14:43:56 -0400 |
On Fri, May 09, 2025 at 03:40:28PM -0500, Eric Blake wrote:
> When mirroring, the goal is to ensure that the destination reads the
> same as the source; this goal is met whether the destination is sparse
> or fully-allocated (except when explicitly punching holes, then merely
> reading zero is not enough to know if it is sparse, so we still want
> to punch the hole). Avoiding a redundant write to zero (whether in
> the background because the zero cluster was marked in the dirty
> bitmap, or in the foreground because the guest is writing zeroes) when
> the destination already reads as zero makes mirroring faster, and
> avoids allocating the destination merely because the source reports as
> allocated.
>
> The effect is especially pronounced when the source is a raw file.
> That's because when the source is a qcow2 file, the dirty bitmap only
> visits the portions of the source that are allocated, which tend to be
> non-zero. But when the source is a raw file,
> bdrv_co_is_allocated_above() reports the entire file as allocated so
> mirror_dirty_init sets the entire dirty bitmap, and it is only later
> during mirror_iteration that we change to consulting the more precise
> bdrv_co_block_status_above() to learn where the source reads as zero.
>
> Remember that since a mirror operation can write a cluster more than
> once (every time the guest changes the source, the destination is also
> changed to keep up), and the guest can change whether a given cluster
> reads as zero, is discarded, or has non-zero data over the course of
> the mirror operation, we can't take the shortcut of relying on
> s->target_is_zero (which is static for the life of the job) in
> mirror_co_zero() to see if the destination is already zero, because
> that information may be stale. Any solution we use must be dynamic in
> the face of the guest writing or discarding a cluster while the mirror
> has been ongoing.
>
> We could just teach mirror_co_zero() to do a block_status() probe of
> the destination, and skip the zeroes if the destination already reads
> as zero, but we know from past experience that extra block_status()
> calls are not always cheap (tmpfs, anyone?), especially when they are
> random access rather than linear. Use of block_status() of the source
> by the background task in a linear fashion is not our bottleneck (it's
> a background task, after all); but since mirroring can be done while
> the source is actively being changed, we don't want a slow
> block_status() of the destination to occur on the hot path of the
> guest trying to do random-access writes to the source.
>
> So this patch takes a slightly different approach: any time we have to
> track dirty clusters, we can also track which clusters are known to
> read as zero. For sync=TOP or when we are punching holes from
> "detect-zeroes":"unmap", the zero bitmap starts out empty, but
> prevents a second write zero to a cluster that was already zero by an
> earlier pass; for sync=FULL when we are not punching holes, the zero
> bitmap starts out full if the destination reads as zero during
> initialization. Either way, I/O to the destination can now avoid
> redundant write zero to a cluster that already reads as zero, all
> without having to do a block_status() per write on the destination.
>
> With this patch, if I create a raw sparse destination file, connect it
> with QMP 'blockdev-add' while leaving it at the default "discard":
> "ignore", then run QMP 'blockdev-mirror' with "sync": "full", the
> destination remains sparse rather than fully allocated. Meanwhile, a
> destination image that is already fully allocated remains so unless it
> was opened with "detect-zeroes": "unmap". And any time writing zeroes
> is skipped, the job counters are not incremented.
>
> Signed-off-by: Eric Blake <eblake@redhat.com>
>
> ---
>
> v3: also skip counters when skipping I/O [Sunny]
> v4: rebase to earlier changes
> ---
> block/mirror.c | 107 ++++++++++++++++++++++++++++++++++++++++++-------
> 1 file changed, 93 insertions(+), 14 deletions(-)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
signature.asc
Description: PGP signature
- Re: [PATCH v4 09/13] mirror: Drop redundant zero_target parameter, (continued)
- [PATCH v4 10/13] mirror: Skip pre-zeroing destination if it is already zero, Eric Blake, 2025/05/09
- [PATCH v4 08/13] mirror: Allow QMP override to declare target already zero, Eric Blake, 2025/05/09
- [PATCH v4 11/13] mirror: Skip writing zeroes when target is already zero, Eric Blake, 2025/05/09
- [PATCH v4 13/13] tests: Add iotest mirror-sparse for recent patches, Eric Blake, 2025/05/09
- [PATCH v4 12/13] iotests/common.rc: add disk_usage function, Eric Blake, 2025/05/09
- Re: [PATCH v4 00/13] Make blockdev-mirror dest sparse in more cases, Stefan Hajnoczi, 2025/05/12
- [PATCH v4 14/13] mirror: Reduce I/O when destination is detect-zeroes:unmap, Eric Blake, 2025/05/13