Re: [Qemu-block] [PATCH v2 2/4] block: Pause block jobs in bdrv_drain_all
From: Stefan Hajnoczi
Subject: Re: [Qemu-block] [PATCH v2 2/4] block: Pause block jobs in bdrv_drain_all
Date: Thu, 9 Apr 2015 11:34:04 +0100
User-agent: Mutt/1.5.23 (2014-03-12)

On Wed, Apr 08, 2015 at 04:56:14PM +0200, Alberto Garcia wrote:
> On Wed, Apr 08, 2015 at 11:37:52AM +0100, Stefan Hajnoczi wrote:
>
> > > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > > +
> > > +        aio_context_acquire(aio_context);
> > > +        if (bs->job) {
> > > +            block_job_pause(bs->job);
> > > +        }
> > > +        aio_context_release(aio_context);
> > > +    }
> > > +
> > >      while (busy) {
> > >          busy = false;
> > >
> > > @@ -2044,6 +2054,16 @@ void bdrv_drain_all(void)
> > >              aio_context_release(aio_context);
> > >          }
> > >      }
> > > +
> > > +    QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
> > > +        AioContext *aio_context = bdrv_get_aio_context(bs);
> > > +
> > > +        aio_context_acquire(aio_context);
> > > +        if (bs->job) {
> > > +            block_job_resume(bs->job);
> > > +        }
> > > +        aio_context_release(aio_context);
> > > +    }
> > >  }
> >
> > There is a tiny chance that we pause a job (which actually just sets
> > job->paused = true but there's no guarantee the job's coroutine
> > reacts to this) right before it terminates. Then aio_poll() enters
> > the coroutine one last time and the job terminates.
> >
> > We then reach the resume portion of bdrv_drain_all() and the job no
> > longer exists. Hopefully nothing started a new job in the meantime.
> > bs->job should either be the original block job or NULL.
> >
> > > This code seems safe under current assumptions, but I just wanted to
> > raise these issues in case someone sees problems that I've missed.
>
> Is it possible that a new job is started in the meantime? If that's
> the case this will hit the assertion in block_job_resume().

That is currently not possible since the QEMU monitor does not run while
we're waiting in aio_poll().  Therefore no block job monitor commands
could spawn a new job.

If code is added that spawns a job based on an AioContext timer or due
to some other event, then this assumption no longer holds and there is a
problem because block_job_resume() would be called on a job that was
never paused.

But for now there is no problem.

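To make the three cases concrete, here is a toy model of the
pause/poll/resume sequence.  It is only a sketch: ToyJob, ToyBDS and the
toy_* functions are made-up stand-ins, not QEMU's real types or API, and
the pause counter is an assumption about how "paused" is tracked.
toy_job_resume() returns false at the point where the real
block_job_resume() would hit its assertion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for a block job and its BlockDriverState (hypothetical). */
typedef struct ToyJob {
    int pause_count;            /* > 0 while a pause is outstanding */
} ToyJob;

typedef struct ToyBDS {
    ToyJob *job;                /* NULL when no job is attached */
} ToyBDS;

static void toy_job_pause(ToyJob *job)
{
    job->pause_count++;
}

/* Returns false where a real implementation would assert. */
static bool toy_job_resume(ToyJob *job)
{
    if (job->pause_count == 0) {
        return false;           /* resuming a job that was never paused */
    }
    job->pause_count--;
    return true;
}

/* Callback standing in for whatever happens inside the aio_poll() loop. */
typedef void (*ToyPollFn)(ToyBDS *bs);

/* Case 1: nothing changes while we poll. */
static void toy_poll_idle(ToyBDS *bs)
{
    (void)bs;
}

/* Case 2: the paused job terminates during polling, so bs->job is NULL. */
static void toy_poll_job_ends(ToyBDS *bs)
{
    bs->job = NULL;
}

/* Case 3: something (e.g. a timer) starts a new job during polling. */
static ToyJob toy_new_job;
static void toy_poll_job_replaced(ToyBDS *bs)
{
    toy_new_job.pause_count = 0;
    bs->job = &toy_new_job;
}

/* Mirrors the shape of the patch: pause, drain, then resume bs->job. */
static bool toy_drain(ToyBDS *bs, ToyPollFn poll)
{
    if (bs->job) {
        toy_job_pause(bs->job);
    }
    poll(bs);
    if (bs->job) {
        return toy_job_resume(bs->job);
    }
    return true;
}
```

Calling toy_drain() with toy_poll_idle or toy_poll_job_ends succeeds;
only toy_poll_job_replaced makes it return false, which corresponds to
the case that cannot happen today because the monitor does not run
during aio_poll().
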
Stefan