From: Laszlo Ersek
Subject: Re: [Libguestfs] [libnbd PATCH v3 03/22] protocol: Add definitions for extended headers
Date: Thu, 1 Jun 2023 05:33:20 +0200

On 5/31/23 18:04, Eric Blake wrote:
> On Wed, May 31, 2023 at 01:29:30PM +0200, Laszlo Ersek wrote:
>>>> Even putting alignment aside, I don't understand why reducing
>>>> "count" to uint16_t would be reasonable. With the current
>>>> 32-bit-only block descriptor, we already need to write loops in
>>>> libnbd clients, because we can't cover the entire remote image in
>>>> one API call [*]. If I understood Eric right earlier, the 64-bit
>>>> extensions were supposed to remedy that -- but as it stands,
>>>> clients will still need loops ("chunking") around block status
>>>> fetching; is that right?
>>>
>>> While the larger extents reduce the need for looping, they do not
>>> entirely eliminate it.  For example, just because the server can now
>>> tell you that an image is entirely data in just one reply does not
>>> mean that it will actually do so - qemu in particular limits block
>>> status of a qcow2 file to reporting just one map page at a time for
>>> consistency reasons, so even if you use the maximum cluster size of
>>> 2M, you can never get more than (2M/16)*2M = 256G of status
>>> reported in a single request.
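
For concreteness, here is the kind of chunking loop I mean, written
against the classic 32-bit nbd_block_status() API. This is only a
minimal sketch: the "nbd://localhost/" URI, the UINT32_MAX clamp, and
the bare-bones error handling are illustrative assumptions, not
anything the protocol or libnbd mandates.

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <stdint.h>
  #include <inttypes.h>

  #include <libnbd.h>

  /* Advance the caller's cursor past every extent the server described.
   * Entries arrive as (length, flags) pairs, so nr_entries is even.
   */
  static int
  extent_cb (void *user_data, const char *metacontext,
             uint64_t offset, uint32_t *entries, size_t nr_entries,
             int *error)
  {
    uint64_t *next = user_data;
    size_t i;

    if (strcmp (metacontext, LIBNBD_CONTEXT_BASE_ALLOCATION) == 0) {
      *next = offset;
      for (i = 0; i < nr_entries; i += 2)
        *next += entries[i];
    }
    return 0;
  }

  int
  main (void)
  {
    struct nbd_handle *nbd = nbd_create ();
    int64_t size;
    uint64_t offset = 0;

    if (nbd == NULL ||
        nbd_add_meta_context (nbd, LIBNBD_CONTEXT_BASE_ALLOCATION) == -1 ||
        nbd_connect_uri (nbd, "nbd://localhost/") == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }

    size = nbd_get_size (nbd);
    if (size == -1) {
      fprintf (stderr, "%s\n", nbd_get_error ());
      exit (EXIT_FAILURE);
    }

    while (offset < (uint64_t) size) {
      uint64_t next = offset;
      /* The classic protocol carries a 32-bit request length, and the
       * server may describe far less than we ask for anyway.
       */
      uint64_t count = (uint64_t) size - offset;
      if (count > UINT32_MAX)
        count = UINT32_MAX;

      if (nbd_block_status (nbd, count, offset,
                            (nbd_extent_callback) { .callback = extent_cb,
                                                    .user_data = &next },
                            0) == -1) {
        fprintf (stderr, "%s\n", nbd_get_error ());
        exit (EXIT_FAILURE);
      }
      if (next == offset)       /* no progress; avoid spinning forever */
        break;
      printf ("status known up to offset %" PRIu64 "\n", next);
      offset = next;
    }

    nbd_shutdown (nbd, 0);
    nbd_close (nbd);
    return 0;
  }
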
>>
>> I don't understand the calculation. I can imagine the following
>> interpretation:
>>
>> - QEMU never sends more than 128K block descriptors, and each
>> descriptor covers one 2MB-sized cluster --> 256 GB of the disk
>> covered in one go.
>>
>> But I don't understand where the (2M/16) division comes from, even
>> though the quotient is 128K.
> 
> Ah, I need to provide more backstory on the qcow2 format.  A qcow2
> image has a fixed cluster size, chosen between 512 and 2M bytes.  A
> smaller cluster size wastes less space for small images, but incurs
> more mapping overhead.  Each cluster's mapping has to be stored in an
> L1 map, where pages of the map are also a cluster in length, with 16
> bytes per map entry.  So if you pick a cluster size of 512, you get
> 512/16 or 32 entries per L1 page; if you pick a cluster size of 2M,
> you get 2M/16 or 128k entries per L1 page.  When reporting block
> status, qemu reads at most one L1 page and then reports how each
> cluster referenced from that page is mapped.
> 
> https://gitlab.com/qemu-project/qemu/-/blob/master/docs/interop/qcow2.txt#L491
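
So, spelling out the per-page coverage for the two extremes of
cluster size (just restating the arithmetic above):

  512B clusters:   512 / 16 =    32 entries/page;    32 *  512 =  16K per page
  2M  clusters:     2M / 16 =  128K entries/page;  128K *   2M = 256G per page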
> 
>>
>> I can connect the constant "128K", and
>> <https://github.com/NetworkBlockDevice/nbd/commit/926a51df>, to your
>> paragraph [*] above, but not the division.
> 
> In this case, the qemu limit of reporting block status of at most one
> L1 map page at a time happens to have no relationship to the NBD
> constant limiting block status reports to no more than 1M extents
> (8M bytes) in a single reply, nor to the fact that qemu picked a cap
> of 1M bytes (128k extents) on its NBD reply regardless of whether the
> underlying image is qcow2 or some other format.
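
(And to connect those two numbers: a 32-bit extent descriptor is 8
bytes on the wire, 4 bytes of length plus 4 bytes of flags, so:

  NBD spec cap:  1M extents * 8 bytes/extent = 8M bytes max per reply
  qemu's cap:    1M bytes / 8 bytes/extent   = 128K extents per reply)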

Thanks!

[...]

Laszlo
