[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore
From: |
Elena Ufimtseva |
Subject: |
Re: [RFC PATCH 1/3] migration/multifd: Move channels_ready semaphore |
Date: |
Fri, 22 Sep 2023 15:33:10 -0700 |
On Fri, Sep 22, 2023 at 11:53:17AM -0300, Fabiano Rosas wrote:
> Commit d2026ee117 ("multifd: Fix the number of channels ready") moved
> the "post" of channels_ready to the start of the multifd_send_thread()
> loop and added a missing "wait" at multifd_send_sync_main(). While it
> does work, the placement of the wait goes against what the rest of the
> code does.
>
> The sequence at multifd_send_thread() is:
>
> qemu_sem_post(&multifd_send_state->channels_ready);
> qemu_sem_wait(&p->sem);
> <work>
> if (flags & MULTIFD_FLAG_SYNC) {
> qemu_sem_post(&p->sem_sync);
> }
>
> Which means that the sending thread makes itself available
> (channels_ready) and waits for more work (sem). So the sequence in the
> migration thread should be to check if any channel is available
> (channels_ready), give it some work and set it off (sem):
>
> qemu_sem_wait(&multifd_send_state->channels_ready);
> <enqueue work>
> qemu_sem_post(&p->sem);
> if (flags & MULTIFD_FLAG_SYNC) {
> qemu_sem_wait(&p->sem_sync);
> }
>
> The reason there's no deadlock today is that the migration thread
> enqueues the SYNC packet right before the wait on channels_ready and
> we end up taking advantage of the out-of-order post to sem:
>
> ...
> qemu_sem_post(&p->sem);
> }
> for (i = 0; i < migrate_multifd_channels(); i++) {
> MultiFDSendParams *p = &multifd_send_state->params[i];
>
> qemu_sem_wait(&multifd_send_state->channels_ready);
> trace_multifd_send_sync_main_wait(p->id);
> qemu_sem_wait(&p->sem_sync);
> ...
>
> Move the channels_ready wait before the sem post to keep the sequence
> consistent. Also fix the error path to post to channels_ready and
> sem_sync in the correct order.
>
Thank you Fabiano,
Your solution is more complete. I also had in mind getting rid of
sem_sync.
With your second patch, this one could be merged with it?
> Signed-off-by: Fabiano Rosas <farosas@suse.de>
> ---
> migration/multifd.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/migration/multifd.c b/migration/multifd.c
> index a7c7a947e3..d626740f2f 100644
> --- a/migration/multifd.c
> +++ b/migration/multifd.c
> @@ -618,6 +618,7 @@ int multifd_send_sync_main(QEMUFile *f)
>
> trace_multifd_send_sync_main_signal(p->id);
>
> + qemu_sem_wait(&multifd_send_state->channels_ready);
> qemu_mutex_lock(&p->mutex);
>
> if (p->quit) {
> @@ -635,7 +636,6 @@ int multifd_send_sync_main(QEMUFile *f)
> for (i = 0; i < migrate_multifd_channels(); i++) {
> MultiFDSendParams *p = &multifd_send_state->params[i];
>
> - qemu_sem_wait(&multifd_send_state->channels_ready);
> trace_multifd_send_sync_main_wait(p->id);
> qemu_sem_wait(&p->sem_sync);
>
> @@ -763,8 +763,8 @@ out:
> * who pay attention to me.
> */
> if (ret != 0) {
> - qemu_sem_post(&p->sem_sync);
> qemu_sem_post(&multifd_send_state->channels_ready);
> + qemu_sem_post(&p->sem_sync);
Can this thread in this error case be woken up again between
these two qemu_sem_posts?
I see in other places p->quit is set to true before it.
Or maybe it should one more patch to make these consistent
as well.
Elena U.
> }
>
> qemu_mutex_lock(&p->mutex);
> --
> 2.35.3
>