qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/2] block: posix: Always allocate the first


From: Max Reitz
Subject: Re: [Qemu-devel] [PATCH v2 1/2] block: posix: Always allocate the first block
Date: Mon, 26 Aug 2019 18:20:26 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 26.08.19 17:41, Nir Soffer wrote:
> On Mon, Aug 26, 2019 at 3:31 PM Max Reitz <address@hidden> wrote:
>>
>> On 26.08.19 00:03, Nir Soffer wrote:
> ...
>>> +/*
>>> + * Help alignment probing by allocating the first block.
>>> + *
>>> + * When reading with direct I/O from unallocated area on Gluster backed by 
>>> XFS,
>>> + * reading succeeds regardless of request length. In this case we fallback 
>>> to
>>> + * safe alignment which is not optimal. Allocating the first block avoids 
>>> this
>>> + * fallback.
>>> + *
>>> + * fd may be opened with O_DIRECT, but we don't know the buffer alignment 
>>> or
>>> + * request alignment, so we use safe values.
>>> + *
>>> + * Returns: 0 on success, -errno on failure. Since this is an optimization,
>>> + * caller may ignore failures.
>>> + */
>>> +static int allocate_first_block(int fd, size_t max_size)
>>> +{
>>> +    size_t write_size = MIN(MAX_BLOCKSIZE, max_size);
>>
>> Hm, well, there was a reason why I proposed rounding this down to the
>> next power of two.  If max_size is not a power of two but below
>> MAX_BLOCKSIZE, write_size will not be a power of two, and thus the write
>> below may fail even if write_size exceeds the physical block size.
>>
>> You can see that in the test case you add by using e.g. 768 as the
>> destination size (provided your test filesystem has a block size of 512).
>>
>> Now I would like to say that it’s stupid to resize an O_DIRECT file to a
>> size that is not a multiple of the block size; but I’ve had a bug
>> assigned to me before because that didn’t work.
>>
>> But maybe it’s actually better if it doesn’t work.  I don’t know.
> 
> I tried to avoid complexity that is unlikely to help anyone, but we
> can make the (typical)
> case of 512 bytes sector size work with this:
> 
>     size_t write_size = (max_size < MAX_BLOCKSIZE)
>         ? BDRV_SECTOR_SIZE
>         : MAX_BLOCKSIZE;
> 
> Unfortunately testing max_size < 4096 will not be reliable since we don't know
> that underlying storage sector size.

Hm, well, why not, actually.  That’s simple enough and it should work in
all common configurations.

Max

Attachment: signature.asc
Description: OpenPGP digital signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]