qemu-block
Re: [PATCH] block/blklogwrites: Fix a bug when logging "write zeroes" operations.


From: Kevin Wolf
Subject: Re: [PATCH] block/blklogwrites: Fix a bug when logging "write zeroes" operations.
Date: Wed, 10 Jan 2024 18:21:07 +0100

On 10.01.2024 at 16:21, Ari Sundholm wrote:
> On 1/10/24 15:39, Kevin Wolf wrote:
> > On 09.01.2024 at 19:46, megari@gmx.com wrote:
> > > From: Ari Sundholm <ari@tuxera.com>
> > > 
> > > There is a bug in the blklogwrites driver pertaining to logging "write
> > > zeroes" operations, causing log corruption. This can be easily observed
> > > by setting detect-zeroes to something other than "off" for the driver.
> > > 
> > > The issue is caused by a concurrency bug pertaining to the fact that
> > > "write zeroes" operations have to be logged in two parts: first the log
> > > entry metadata, then the zeroed-out region. While the log entry
> > > metadata is being written by bdrv_co_pwritev(), another operation may
> > > begin in the meanwhile and modify the state of the blklogwrites driver.
> > > This is as intended by the coroutine-driven I/O model in QEMU, of
> > > course.
> > > 
> > > Unfortunately, this specific scenario is mishandled. A short example:
> > >      1. Initially, in the current operation (#1), the current log sector
> > > number in the driver state is only incremented by the number of sectors
> > > taken by the log entry metadata, after which the log entry metadata is
> > > written. The current operation yields.
> > >      2. Another operation (#2) may start while the log entry metadata is
> > > being written. It uses the current log position as the start offset for
> > > its log entry. This is in the sector right after the operation #1 log
> > > entry metadata, which is bad!
> > >      3. After bdrv_co_pwritev() returns (#1), the current log sector
> > > number is reread from the driver state in order to find out the start
> > > offset for bdrv_co_pwrite_zeroes(). This is an obvious blunder, as the
> > > offset will be the sector right after the (misplaced) operation #2 log
> > > entry, which means that the zeroed-out region begins at the wrong
> > > offset.
> > >      4. As a result of the above, the log is corrupt.
> > > 
> > > Fix this by only reading the driver metadata once, computing the
> > > offsets and sizes in one go (including the optional zeroed-out region)
> > > and setting the log sector number to the appropriate value for the next
> > > operation in line.
> > > 
> > > Signed-off-by: Ari Sundholm <ari@tuxera.com>
> > > Cc: qemu-stable@nongnu.org
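
For readers skimming the archive, a minimal sketch of the approach the commit message describes: read the shared log position once, compute all offsets up front, and reserve space for both parts of the entry before any I/O is issued. This is not the actual patch; apart from cur_log_sector, bdrv_co_pwritev() and bdrv_co_pwrite_zeroes(), which are mentioned in this thread, the struct, field and function names are assumptions for illustration.

/*
 * Sketch only, not the actual patch. Names other than cur_log_sector,
 * bdrv_co_pwritev() and bdrv_co_pwrite_zeroes() are illustrative.
 */
static int coroutine_fn
sketch_log_write_zeroes(BDRVBlkLogWritesState *s, BdrvChild *log,
                        struct log_write_entry *entry, uint64_t zero_bytes)
{
    QEMUIOVector qiov;
    int ret;

    /* Read the shared log position exactly once, before any I/O. */
    uint64_t entry_sector = s->cur_log_sector;
    uint64_t entry_sectors = DIV_ROUND_UP(sizeof(*entry), s->sectorsize);
    uint64_t zero_sector = entry_sector + entry_sectors;
    uint64_t zero_sectors = DIV_ROUND_UP(zero_bytes, s->sectorsize);

    /*
     * Reserve space for both the entry metadata and the zeroed-out
     * region before yielding, so a request that starts while the
     * metadata is still being written places its own entry after ours
     * rather than right after the metadata sector.
     */
    s->cur_log_sector = zero_sector + zero_sectors;
    s->nr_entries++;

    qemu_iovec_init_buf(&qiov, entry, sizeof(*entry));
    ret = bdrv_co_pwritev(log, entry_sector * s->sectorsize,
                          sizeof(*entry), &qiov, 0);    /* may yield */
    if (ret < 0) {
        return ret;
    }

    /* Offsets were computed up front; nothing is reread from s here. */
    return bdrv_co_pwrite_zeroes(log, zero_sector * s->sectorsize,
                                 zero_bytes, 0);
}

Because the state is updated before the first yield, a request that begins while the metadata write is in flight already sees a log position that accounts for both parts of this entry.
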
> > 
> > Thanks, applied to the block branch.
> > 
> 
> Thank you.
> 
> > Note that while this fixes the single-threaded case, it is not thread
> > safe and will still break with multiqueue enabled (see the new
> > iothread-vq-mapping option added to virtio-blk very recently). Most
> > block drivers already take a lock when modifying their global state, and
> > it looks like we missed blklogwrites when checking what needs to be
> > prepared for the block layer to be thread safe.
> > 
> 
> I see. Thanks for the heads up. I had missed this new development.
> 
> > So I think we'll want another patch to add a QemuMutex that can be taken
> > while you do the calculations on s->cur_log_sector. But this patch is
> > a good first step because it means that we don't need to keep a lock
> > across an I/O request (just for the sake of completeness, this would
> > have had to be a CoMutex rather than a QemuMutex).
> 
> OK. I guess I have a bit of additional work to do, then.

Yes, a bit, but I honestly don't expect the locking to be a big effort
for a small driver like blklogwrites.
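
As a rough illustration of what that follow-up could look like (a sketch under assumptions, not a real patch: the "mutex" field and the helper name are made up), the lock only needs to cover the arithmetic on s->cur_log_sector and is dropped again before any I/O, so a plain QemuMutex suffices:

/*
 * Illustrative only: a QemuMutex protecting the log-position update.
 * The "mutex" field and this helper are assumptions, not from a patch.
 */
static uint64_t blk_log_writes_reserve_sectors(BDRVBlkLogWritesState *s,
                                               uint64_t nr_sectors)
{
    uint64_t start;

    qemu_mutex_lock(&s->mutex);
    start = s->cur_log_sector;
    s->cur_log_sector += nr_sectors;
    s->nr_entries++;
    qemu_mutex_unlock(&s->mutex);   /* released before any yield or I/O */

    return start;
}

The bdrv_co_pwritev()/bdrv_co_pwrite_zeroes() calls would then use the returned start sector and stay outside the critical section, which is why the lock never has to be held across a request.
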

> What release would these fixes realistically make it to? Just trying
> to gauge the urgency of the multi-threaded fix for prioritization
> purposes.

Well, this one is queued for 9.0. If we can make sure to get the
thread-safety fix into 9.0, too (I expect the soft freeze to be somewhere
around the beginning of March and the release in April), that would be
good, because that would also be the first release to be affected.

I'm not sure if and when an 8.2.* stable release is planned before that.

> And yes, holding a CoMutex while doing I/O would have fixed this
> issue, with the tiny drawback of killing any concurrency in the
> driver. ;)

Ah, that's not true, only any concurrency in the log writes. ;)
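
(For contrast, the alternative mentioned above would look roughly like the sketch below: a CoMutex held across both log writes. It would also have prevented the corruption, but the log writes themselves would then run strictly one after another; the guest writes being logged are unaffected. The "log_lock" field and the helper are, again, illustrative.)

/*
 * Sketch of the alternative: hold a CoMutex across both parts of the
 * log entry. Correct, but serializes the log writes.
 * "log_lock" and the helper are illustrative, not from a patch.
 */
static int coroutine_fn
sketch_log_write_zeroes_locked(BDRVBlkLogWritesState *s, BdrvChild *log,
                               QEMUIOVector *qiov, uint64_t entry_off,
                               uint64_t zero_off, uint64_t zero_bytes)
{
    int ret;

    qemu_co_mutex_lock(&s->log_lock);
    ret = bdrv_co_pwritev(log, entry_off, qiov->size, qiov, 0);
    if (ret >= 0) {
        ret = bdrv_co_pwrite_zeroes(log, zero_off, zero_bytes, 0);
    }
    qemu_co_mutex_unlock(&s->log_lock);
    return ret;
}
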

Kevin



