[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v2] monitor/qmp: resume monitor when clearing its queue
From: |
Wolfgang Bumiller |
Subject: |
Re: [PATCH v2] monitor/qmp: resume monitor when clearing its queue |
Date: |
Fri, 15 Nov 2019 09:25:01 +0100 |
User-agent: |
NeoMutt/20180716 |
On Wed, Nov 13, 2019 at 05:45:57PM +0100, Markus Armbruster wrote:
> Wolfgang Bumiller <address@hidden> writes:
>
> > When a monitor's queue is filled up in handle_qmp_command()
> > it gets suspended. It's the dispatcher bh's job currently to
> > resume the monitor, which it does after processing an event
> > from the queue. However, it is possible for a
> > CHR_EVENT_CLOSED event to be processed before before the bh
> > is scheduled, which will clear the queue without resuming
> > the monitor, thereby preventing the dispatcher from reaching
> > the resume() call.
> > Any new connections to the qmp socket will be accept()ed and
> > show the greeting, but will not respond to any messages sent
> > afterwards (as they will not be read from the
> > still-suspended socket).
> > Fix this by resuming the monitor when clearing a queue which
> > was filled up.
> >
> > Signed-off-by: Wolfgang Bumiller <address@hidden>
> > ---
> > Changes to v1:
> > * Update commit message to include the resulting symptoms.
> > * Moved the resume code from `monitor_qmp_cleanup_req_queue_locked` to
> > `monitor_qmp_cleanup_queues` to avoid an unnecessary resume when
> > destroying the monitor (as the `_locked` version is also used by
> > `monitor_data_destroy()`.
> > * Renamed `monitor_qmp_cleanup_queues` to
> > `monitor_qmp_cleanup_queues_and_resume` to reflect the change and be
> > verbose about it for potential future users of the function.
> > Currently the only user is `monitor_qmp_event()` in the
> > `CHR_EVENT_CLOSED` case, which is exactly the problematic case
> > currently.
> >
> > Sorry for the deleay :|
>
> Same to you (my sorry excuse is KVM Forum). Now we need to hurry to get
> this fix into 4.2. Let's try.
Uh, that's already in Rcs. *hurries*
> > monitor/qmp.c | 24 ++++++++++++++++++++++--
> > 1 file changed, 22 insertions(+), 2 deletions(-)
> >
> > diff --git a/monitor/qmp.c b/monitor/qmp.c
> > index 9d9e5d8b27..df689aa95e 100644
> > --- a/monitor/qmp.c
> > +++ b/monitor/qmp.c
> > @@ -75,10 +75,30 @@ static void
> > monitor_qmp_cleanup_req_queue_locked(MonitorQMP *mon)
> > }
> > }
> >
> > -static void monitor_qmp_cleanup_queues(MonitorQMP *mon)
> > +static void monitor_qmp_cleanup_queues_and_resume(MonitorQMP *mon)
>
> Let's rename to _cleanup_queue_and resume(). The plural is a remnant
> from when we also had a response queue. Gone since commit 27656018d86.
>
> > {
> > qemu_mutex_lock(&mon->qmp_queue_lock);
> > +
> > + /*
> > + * Same condition as in monitor_qmp_bh_dispatcher(), but before
> > removing an
> > + * element from the queue (hence no `- 1`), also, the queue should not
> > be
> > + * empty either, otherwise the monitor hasn't been suspended yet (or
> > was
> > + * already resumed).
> > + */
>
> Comment lines should be wrapped around colum 70.
>
> > + bool need_resume = (!qmp_oob_enabled(mon) && mon->qmp_requests->length
> > > 0)
> > + || mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX;
>
> Now let me digest the comment. Here's condition in
> monitor_qmp_bh_dispatcher():
>
> need_resume = !qmp_oob_enabled(mon) ||
> mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX - 1;
>
> "Same but before removing" is
>
> bool need_resume = !qmp_oob_enabled(mon)
> || mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX;
>
> That leaves the "also" part. It's been too long since v1, I don't
> remember a thing, and I'm too dense today to understand without more
> help. Can you help me some more?
Not empty: `&& length > 0`. The 2nd part of the disjunction already
implies it (as it is `length == max`) so I only added it to the first
part. A pointless optimization on my part given that it's supposed to be
more easily verified against the comment, so I'll change it to be
clearer: ($condition_without_'-1') && (!g_queue_is_empty()), so:
bool need_resume = (!qmp_oob_enabled(mon) ||
mon->qmp_requests->length == QMP_REQ_QUEUE_LEN_MAX)
&& !g_queue_is_empty(mon->qmp_requests);
>
> > +
> > monitor_qmp_cleanup_req_queue_locked(mon);
> > +
> > + if (need_resume) {
> > + /*
> > + * Pairs with the monitor_suspend() in handle_qmp_command() in
> > case the
> > + * queue gets cleared from a CH_EVENT_CLOSED event before the
> > dispatch
> > + * bh got scheduled.
> > + */
>
> CHR_EVENT_CLOSED
>
> Suggest:
>
> /*
> * handle_qmp_command() suspened the monitor because the
> * request queue filled up, to be resumed when the queue has
> * space again. We just emptied it; resume the monitor.
> */
>
> If we want to record the issue that made us fix the missing resume, we
> can add:
>
> * Without this, the monitor would remain suspended forever
> * when we get here while the monitor is suspended. An
> * unfortunately times CHR_EVENT_CLOSED can do the trick.
>
> Also update handle_qmp_command()'s comment:
>
> /*
> * Suspend the monitor when we can't queue more requests after
> * this one. Dequeuing in monitor_qmp_bh_dispatcher() or
> * monitor_qmp_cleanup_queue_and_resume() will resume it.
> * Note that when OOB is disabled, we queue at most one command,
> * for backward compatibility.
> */
will do.