Re: [RFC PATCH 00/23] Add subcluster allocation to qcow2
From: Eric Blake
Date: Tue, 15 Oct 2019 11:05:23 -0500
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:68.0) Gecko/20100101 Thunderbird/68.1.0
On 10/15/19 10:23 AM, Alberto Garcia wrote:
Hi,
this series adds a new feature to the qcow2 on-disk format called
"Extended L2 Entries", which allows us to do subcluster allocation.
This cover letter explains the reasons behind this proposal, the
changes to the on-disk format, test results and pending work. If you
are curious you can also have a look at previous discussions about
this feature:
=== Changes to the on-disk format ===
An L2 entry is 64 bits wide, with this format (for uncompressed
clusters):
63    56 55    48 47    40 39    32 31    24 23    16 15     8 7      0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
**<----> <--------------------------------------------------><------->*
  Rsrved             host cluster offset of data             Reserved
 (6 bits)                     (47 bits)                      (8 bits)
bit 63: refcount == 1 (QCOW_OFLAG_COPIED)
bit 62: compressed = 1 (QCOW_OFLAG_COMPRESSED)
bit 0: all zeros (QCOW_OFLAG_ZERO)
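As a reading aid, the layout above can be sketched in C as follows. The three flag names are the ones listed above; the offset mask is derived from the diagram (bits 9 to 55), and the `Sketch`/`sketch_` names are made up for this illustration and are not taken from the patches. It covers only the uncompressed-cluster layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as listed in the cover letter. */
#define QCOW_OFLAG_COPIED     (1ULL << 63)
#define QCOW_OFLAG_COMPRESSED (1ULL << 62)
#define QCOW_OFLAG_ZERO       (1ULL << 0)

/* Assumption from the diagram: the 47-bit host offset is bits 9..55. */
#define SKETCH_L2E_OFFSET_MASK 0x00fffffffffffe00ULL

/* Hypothetical container for the decoded fields. */
typedef struct {
    bool copied;          /* bit 63: refcount == 1 */
    bool compressed;      /* bit 62: cluster is compressed */
    bool zero;            /* bit 0: cluster reads as all zeros */
    uint64_t host_offset; /* host cluster offset of the data */
} SketchL2Entry;

/* Split a standard (non-extended) L2 entry into its fields. */
static SketchL2Entry sketch_decode_l2_entry(uint64_t entry)
{
    SketchL2Entry e;
    e.copied = (entry & QCOW_OFLAG_COPIED) != 0;
    e.compressed = (entry & QCOW_OFLAG_COMPRESSED) != 0;
    e.zero = (entry & QCOW_OFLAG_ZERO) != 0;
    e.host_offset = entry & SKETCH_L2E_OFFSET_MASK;
    return e;
}
```

For example, an entry of `QCOW_OFLAG_COPIED | 0x50000` decodes to a copied, uncompressed cluster at host offset 0x50000.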
If Extended L2 Entries are enabled, bit 0 becomes reserved and must be
unset, and this 64-bit bitmap follows the entry:
63    56 55    48 47    40 39    32 31    24 23    16 15     8 7      0
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
<---------------------------------> <--------------------------------->
     subcluster reads as zeros            subcluster is allocated
             (32 bits)                           (32 bits)
I like the grouping - a single 4-byte read and comparison then tells you
whether every subcluster reads as zeroes (upper half all ones) or whether
nothing in the cluster is allocated (lower half equal to 0).
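To make the whole-cluster checks concrete, here is a minimal C sketch. It assumes the 64-bit bitmap has already been converted to host byte order, and all `sketch_` names are hypothetical helpers rather than functions from the series.

```c
#include <stdbool.h>
#include <stdint.h>

/* Upper 32 bits of the bitmap: "subcluster reads as zeros". */
static uint32_t sketch_zero_bits(uint64_t bitmap)
{
    return (uint32_t)(bitmap >> 32);
}

/* Lower 32 bits of the bitmap: "subcluster is allocated". */
static uint32_t sketch_alloc_bits(uint64_t bitmap)
{
    return (uint32_t)(bitmap & 0xffffffffULL);
}

/* True if no subcluster in the cluster is allocated. */
static bool sketch_no_subcluster_allocated(uint64_t bitmap)
{
    return sketch_alloc_bits(bitmap) == 0;
}

/* True if every subcluster in the cluster reads as zeros. */
static bool sketch_all_subclusters_read_zero(uint64_t bitmap)
{
    return sketch_zero_bits(bitmap) == UINT32_MAX;
}
```

Each check is a single 32-bit comparison, which is the benefit of keeping the two kinds of bits in separate halves rather than interleaving them.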
With 32k clusters, this results in 1k subclusters. In cluster 1 (offset
32k), which bits map where? (The obvious choices are that sub-cluster
32k maps to bit 0, 33k maps to bit 1, ...; or that sub-cluster 32k maps
to bit 31, 33k maps to bit 30, ...)
/me reads ahead
okay, in patch 5, you said you map the most significant bit to the first
subcluster. That feels backwards to me; I wonder if the math is any
easier if you map subclusters starting from the least-significant bit,
because then you get:
bit = (address / subcluster_size) % 32
rather than
bit = 31 - ((address / subcluster_size) % 32)
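The two candidate mappings can be compared side by side in a short C sketch. It assumes 32 subclusters per cluster and a power-of-two cluster size of at least 32 bytes; the function names are invented for illustration.

```c
#include <stdint.h>

#define SKETCH_SUBCLUSTERS_PER_CLUSTER 32

/* Index (0..31) of the subcluster that contains the given offset. */
static unsigned sketch_subcluster_index(uint64_t offset, uint64_t cluster_size)
{
    uint64_t subcluster_size = cluster_size / SKETCH_SUBCLUSTERS_PER_CLUSTER;
    return (unsigned)((offset % cluster_size) / subcluster_size);
}

/* Mapping that starts at the least significant bit. */
static unsigned sketch_bit_lsb_first(uint64_t offset, uint64_t cluster_size)
{
    return sketch_subcluster_index(offset, cluster_size);
}

/* Mapping that starts at the most significant bit. */
static unsigned sketch_bit_msb_first(uint64_t offset, uint64_t cluster_size)
{
    return 31 - sketch_subcluster_index(offset, cluster_size);
}
```

With 32k clusters, offset 32k lands on bit 0 in the first mapping and bit 31 in the second, and offset 33k lands on bit 1 and bit 30 respectively. The difference is a single subtraction.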
Some comments about the results:
- The smallest allowed cluster size for an image with subclusters is
16 KB (in this case the subcluster size is 512 bytes), hence the
missing values in the 4 KB and 8 KB rows.
Again reading ahead, I see that patch 5 requires a 16k minimum cluster
for using extended L2. Could we still permit clusters smaller than
that, but merely document that subclusters are always a minimum of 512
bytes and therefore for an 8k cluster we only use 16 bits (leaving the
other 16 bits zero)? But I'm also fine with the simplicity of just
stating that subclusters require at least 16k clusters.
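For illustration only, the alternative floated above (not what patch 5 does) could look like the following C sketch. The 512-byte minimum comes from the text; which bits of each 32-bit half would stay in use is not specified, so the mask below assumes the low-order bits. All names are hypothetical.

```c
#include <stdint.h>

#define SKETCH_MIN_SUBCLUSTER_SIZE 512u
#define SKETCH_MAX_SUBCLUSTERS 32u

/* Number of subclusters used if subclusters may not go below 512 bytes. */
static unsigned sketch_subclusters_in_use(uint32_t cluster_size)
{
    unsigned n = cluster_size / SKETCH_MIN_SUBCLUSTER_SIZE;
    return n < SKETCH_MAX_SUBCLUSTERS ? n : SKETCH_MAX_SUBCLUSTERS;
}

/*
 * Mask of meaningful bits in each 32-bit half of the bitmap.
 * Assumption: the low-order bits are used and the rest must be zero.
 */
static uint32_t sketch_valid_bits_mask(uint32_t cluster_size)
{
    unsigned n = sketch_subclusters_in_use(cluster_size);
    return n == SKETCH_MAX_SUBCLUSTERS ? UINT32_MAX : ((1u << n) - 1);
}
```

An 8k cluster would then use 16 bits per half (mask 0xffff), and anything from 16k upward would use all 32.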
=== To do ===
A couple of things are missing from this series:
- The ability to efficiently zero individual subclusters using
qcow2_co_pwrite_zeroes(). At the moment only full clusters can be
zeroed with this method.
- Alternatively we could get rid of the individual "all zeroes" bits
altogether and have 64 subclusters per cluster. We would still have
the QCOW_OFLAG_ZERO bit in the standard cluster descriptor.
I think you've got more flexibility with the two bits per sub-cluster
than you would with just 1 bit and 64 subclusters, so I don't think this
direction is going to get us far.
- The number of subclusters per cluster is always 32. It would be
trivial to allow configuring this, but I don't see any use case.
Agreed.
- Tests: I have a few written that I'll add in future revisions of
this series.
- handle_alloc_space() works at the subcluster level. That is, if you
have an unallocated 2MB cluster with 64KB subclusters, no backing
image and you write 4KB of data, QEMU won't write zeroes to the
affected subcluster(s) and will use handle_alloc_space() instead.
The other subclusters won't be touched and will remain unallocated.
This behavior is consistent with how subclusters work and saves disk
space, but offers slightly lower performance (see test results
above). Theoretically we could offer a setting to configure this,
but I'm not convinced that this is very useful.
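To put numbers on the 2MB/64KB example, the following C sketch computes which subclusters a write touches. It does not reproduce handle_alloc_space(); the struct and function names are invented, and it assumes a non-empty write that stays within one cluster.

```c
#include <stdint.h>

/* Hypothetical pair of inclusive subcluster indexes. */
typedef struct {
    unsigned first;
    unsigned last;
} SketchSubclusterRange;

/*
 * Subclusters covered by a write of `length` bytes starting at
 * `offset_in_cluster`. Assumes length > 0 and that the write does
 * not cross the end of the cluster.
 */
static SketchSubclusterRange
sketch_touched_subclusters(uint64_t offset_in_cluster, uint64_t length,
                           uint64_t subcluster_size)
{
    SketchSubclusterRange r;
    r.first = (unsigned)(offset_in_cluster / subcluster_size);
    r.last = (unsigned)((offset_in_cluster + length - 1) / subcluster_size);
    return r;
}
```

A 4KB write at offset 0 touches only subcluster 0, so the other 31 subclusters stay unallocated. An 8KB write at offset 60KB straddles a boundary and touches subclusters 0 and 1.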
===========================
As usual, feedback is welcome,
Looks promising!
How do subclusters interact with external data files?
--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3226
Virtualization: qemu.org | libvirt.org