Re: iotest 030 SIGSEGV


From: Paolo Bonzini
Subject: Re: iotest 030 SIGSEGV
Date: Fri, 15 Oct 2021 11:38:22 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.1.0

On 14/10/21 18:14, Vladimir Sementsov-Ogievskiy wrote:

> iotest 030 failing is a long story. As I understand it, the main source of all these crashes is that we do different graph modifications simultaneously from parallel block jobs.

> In the past I sent an RFC series with a global mutex to fix a subset of the problem: https://patchew.org/QEMU/20201120161622.1537-1-vsementsov@virtuozzo.com/ [just look at patch 5: https://patchew.org/QEMU/20201120161622.1537-1-vsementsov@virtuozzo.com/20201120161622.1537-6-vsementsov@virtuozzo.com/]

Can you explain the way they interleave, and where the job callbacks are yielding in the middle of graph manipulations?

The problem with a CoMutex is that plenty of graph manipulations happen outside coroutines, and if coroutines such as stream_co_clean yield, the monitor can do graph manipulations of its own in the meantime.

So if the solution could be "no yielding in the middle of graph manipulations", that would be much better. In fact, maybe the coroutine API should have a way to assert "no-yield regions" (similar to how Linux croaks if you call a sleeping function while preemption is disabled). More assertions = more bugs found early.
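A minimal sketch of what such a "no-yield region" assertion might look like, in plain C. All names here (no_yield_begin, no_yield_end, yield_allowed, toy_yield) are hypothetical and are not part of QEMU's coroutine API; the single global counter is also an assumption, since a real implementation would presumably track the state per coroutine or per thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical nesting depth of active "no-yield" regions.
 * Assumption: one global counter; real code would likely keep
 * this per coroutine or per thread. */
static int no_yield_depth;

/* Enter a region in which yielding is a bug. Regions may nest. */
static void no_yield_begin(void)
{
    no_yield_depth++;
}

/* Leave the innermost no-yield region. */
static void no_yield_end(void)
{
    assert(no_yield_depth > 0);
    no_yield_depth--;
}

/* True when no no-yield region is active. */
static bool yield_allowed(void)
{
    return no_yield_depth == 0;
}

/* Stand-in for a coroutine yield point: it only performs the check.
 * A real yield would switch to another coroutine after the assert. */
static void toy_yield(void)
{
    assert(yield_allowed());
}
```

A graph manipulation would be bracketed by no_yield_begin() and no_yield_end(), so any toy_yield() reached in between aborts immediately instead of letting another graph change slip in unnoticed.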

> I'm not sure whether it was good enough to be worth reviving. I didn't look closely at Emanuele's "block layer: split block APIs in global state and I/O". Wasn't there something on protecting graph operations?

In his series, graph operations are supposed to run in the main thread (which they do), but he didn't cover the case of coroutines that yield.

Paolo