qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH V1 2/3] migration: fix suspended runstate


From: Peter Xu
Subject: Re: [PATCH V1 2/3] migration: fix suspended runstate
Date: Tue, 20 Jun 2023 17:46:47 -0400

On Thu, Jun 15, 2023 at 01:26:39PM -0700, Steve Sistare wrote:
> Migration of a guest in the suspended state is broken.  The incoming
> migration code automatically tries to wake the guest, which IMO is
> wrong -- the guest should end migration in the same state it started.
> Further, the wakeup is done by calling qemu_system_wakeup_request(), which
> bypasses vm_start().  The guest appears to be in the running state, but
> it is not.
> 
> To fix, leave the guest in the suspended state, but call
> qemu_system_start_on_wakeup_request() so the guest is properly resumed
> later, when the client sends a system_wakeup command.
> 
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> ---
>  migration/migration.c | 11 ++++-------
>  softmmu/runstate.c    |  1 +
>  2 files changed, 5 insertions(+), 7 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 17b4b47..851fe6d 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -496,6 +496,10 @@ static void process_incoming_migration_bh(void *opaque)
>          vm_start();
>      } else {
>          runstate_set(global_state_get_runstate());
> +        if (runstate_check(RUN_STATE_SUSPENDED)) {
> +            /* Force vm_start to be called later. */
> +            qemu_system_start_on_wakeup_request();
> +        }

Is this really needed, along with patch 1?

I have a very limited knowledge on suspension, so I'm prone to making
mistakes..

But from what I read this, qemu_system_wakeup_request() (existing one, not
after patch 1 applied) will setup wakeup_reason and kick the main thread
using qemu_notify_event().  Then IIUC the e.g. vcpu wakeups will be done in
the main thread later on after qemu_wakeup_requested() returns true.

>      }
>      /*
>       * This must happen after any state changes since as soon as an external
> @@ -2101,7 +2105,6 @@ static int postcopy_start(MigrationState *ms)
>      qemu_mutex_lock_iothread();
>      trace_postcopy_start_set_run();
>  
> -    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>      global_state_store();
>      ret = vm_stop_force_state(RUN_STATE_FINISH_MIGRATE);
>      if (ret < 0) {
> @@ -2307,7 +2310,6 @@ static void migration_completion(MigrationState *s)
>      if (s->state == MIGRATION_STATUS_ACTIVE) {
>          qemu_mutex_lock_iothread();
>          s->downtime_start = qemu_clock_get_ms(QEMU_CLOCK_REALTIME);
> -        qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);
>  
>          s->vm_old_state = runstate_get();
>          global_state_store();
> @@ -3102,11 +3104,6 @@ static void *bg_migration_thread(void *opaque)
>  
>      qemu_mutex_lock_iothread();
>  
> -    /*
> -     * If VM is currently in suspended state, then, to make a valid runstate
> -     * transition in vm_stop_force_state() we need to wakeup it up.
> -     */
> -    qemu_system_wakeup_request(QEMU_WAKEUP_REASON_OTHER, NULL);

Removal of these three places seems reasonable to me, or we won't persist
the SUSPEND state.

Above comment was the major reason I used to have thought it was needed
(again, based on zero knowledge around this..), but perhaps it was just
wrong?  I would assume vm_stop_force_state() will still just work with
suepended, am I right?

>      s->vm_old_state = runstate_get();
>  
>      global_state_store();
> diff --git a/softmmu/runstate.c b/softmmu/runstate.c
> index e127b21..771896c 100644
> --- a/softmmu/runstate.c
> +++ b/softmmu/runstate.c
> @@ -159,6 +159,7 @@ static const RunStateTransition 
> runstate_transitions_def[] = {
>      { RUN_STATE_RUNNING, RUN_STATE_SUSPENDED },
>      { RUN_STATE_SUSPENDED, RUN_STATE_RUNNING },
>      { RUN_STATE_SUSPENDED, RUN_STATE_FINISH_MIGRATE },
> +    { RUN_STATE_SUSPENDED, RUN_STATE_PAUSED },
>      { RUN_STATE_SUSPENDED, RUN_STATE_PRELAUNCH },
>      { RUN_STATE_SUSPENDED, RUN_STATE_COLO},
>  
> -- 
> 1.8.3.1
> 

-- 
Peter Xu




reply via email to

[Prev in Thread] Current Thread [Next in Thread]