qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v2 1/2] block: posix: Always allocate the first


From: Nir Soffer
Subject: Re: [Qemu-devel] [PATCH v2 1/2] block: posix: Always allocate the first block
Date: Mon, 26 Aug 2019 18:41:59 +0300

On Mon, Aug 26, 2019 at 3:31 PM Max Reitz <address@hidden> wrote:
>
> On 26.08.19 00:03, Nir Soffer wrote:
...
> > +/*
> > + * Help alignment probing by allocating the first block.
> > + *
> > + * When reading with direct I/O from unallocated area on Gluster backed by 
> > XFS,
> > + * reading succeeds regardless of request length. In this case we fallback 
> > to
> > + * safe alignment which is not optimal. Allocating the first block avoids 
> > this
> > + * fallback.
> > + *
> > + * fd may be opened with O_DIRECT, but we don't know the buffer alignment 
> > or
> > + * request alignment, so we use safe values.
> > + *
> > + * Returns: 0 on success, -errno on failure. Since this is an optimization,
> > + * caller may ignore failures.
> > + */
> > +static int allocate_first_block(int fd, size_t max_size)
> > +{
> > +    size_t write_size = MIN(MAX_BLOCKSIZE, max_size);
>
> Hm, well, there was a reason why I proposed rounding this down to the
> next power of two.  If max_size is not a power of two but below
> MAX_BLOCKSIZE, write_size will not be a power of two, and thus the write
> below may fail even if write_size exceeds the physical block size.
>
> You can see that in the test case you add by using e.g. 768 as the
> destination size (provided your test filesystem has a block size of 512).
>
> Now I would like to say that it’s stupid to resize an O_DIRECT file to a
> size that is not a multiple of the block size; but I’ve had a bug
> assigned to me before because that didn’t work.
>
> But maybe it’s actually better if it doesn’t work.  I don’t know.

I tried to avoid complexity that is unlikely to help anyone, but we
can make the (typical)
case of 512 bytes sector size work with this:

    size_t write_size = (max_size < MAX_BLOCKSIZE)
        ? BDRV_SECTOR_SIZE
        : MAX_BLOCKSIZE;

Unfortunately testing max_size < 4096 will not be reliable since we don't know
that underlying storage sector size.

...



reply via email to

[Prev in Thread] Current Thread [Next in Thread]