qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/5] multifd: Only sync once each full round of memory


From: Juan Quintela
Subject: Re: [PATCH 5/5] multifd: Only sync once each full round of memory
Date: Mon, 04 Jul 2022 18:18:53 +0200
User-agent: Gnus/5.13 (Gnus v5.13) Emacs/27.2 (gnu/linux)

Leonardo Brás <leobras@redhat.com> wrote:
> Hello Juan,
>
> On Tue, 2022-06-21 at 16:05 +0200, Juan Quintela wrote:
>> We need to add a new flag to mean to sync at that point.
>> Notice that we still synchronize at the end of setup and at the end of
>> complete stages.
>> 
>> Signed-off-by: Juan Quintela <quintela@redhat.com>
>> ---
>>  migration/migration.c |  2 +-
>>  migration/ram.c       | 42 ++++++++++++++++++++++++++++++------------
>>  2 files changed, 31 insertions(+), 13 deletions(-)
>> 




>> @@ -2943,11 +2953,13 @@ static int ram_save_setup(QEMUFile *f, void *opaque)
>>      ram_control_before_iterate(f, RAM_CONTROL_SETUP);
>>      ram_control_after_iterate(f, RAM_CONTROL_SETUP);
>>  
>> -    if (migrate_multifd_sync_each_iteration()) {
>> -        ret =  multifd_send_sync_main(f);
>> -        if (ret < 0) {
>> -            return ret;
>> -        }
>
> (1) IIUC, the above lines were changed in 2/5 to be reverted now.
> Is that correct? was it expected?

Yeap.  The problem here is that (withouth this patchset) we synchrconize
in three places:

- ram_save_setup()
- ram_save_iterate()
- ram_save_complete()

And we want to change it to:

- ram_save_setup()
- ram_save_complete()
- And each time that we pass through the end of memory. (much less times
  than calls to ram_save_iterate).

In the three places we send:

   RAM_SAVE_FLAG_EOS

And that is what cause the synchronization.

As we can't piggyback on RAM_SAVE_FLAG_EOS anymore, I added a new flag
to synchronize it.

The problem is that now (on setup and complete)  we need to synchronize
independently if we do the older way or the new one.  On iterate we only
synchronize on the old code, and on new code only when we reach the end
of the memory.

I *thought* it was clear this way, but I can do without the change if
people think it is easier.


>> +    ret =  multifd_send_sync_main(f);
>> +    if (ret < 0) {
>> +        return ret;
>> +    }
>> +
>> +    if (!migrate_multifd_sync_each_iteration()) {
>> +        qemu_put_be64(f, RAM_SAVE_FLAG_MULTIFD_SYNC);
>
> (2) I have done some testing with this patchset (because of MSG_ZEROCOPY) and 
> it
> seems this part here is breaking migration from this build to 'older' builds
> (same commits except for this patchset):
>
> qemu-system-x86_64: Unknown combination of migration flags: 0x200
> qemu-system-x86_64: error while loading state section id 2(ram)
> qemu-system-x86_64: load of migration failed: Invalid argument

You can't do that O:-) (TM), that is the whole point that I added the
multifd-sync-each-iteration property.  It is true for old machine types,
it is false for new machine types.  If you try to play with that
property, it is better than you know what you are doing (TM).

> Which makes sense, since there is no RAM_SAVE_FLAG_MULTIFD_SYNC in older
> versions. Is this expected / desired ?

See previous paragraph.

> Strange enough, it seems to be breaking even with this set in the sending 
> part: 
> --global migration.multifd-sync-each-iteration=on
>
> Was the idea of this config to allow migration to older qemu builds?

If you set it to on, it should work against and old qemu.  By default it
is set to on for old machine types, and only to on for new machine
types.  So you should have it right if you use new machine types.  (If
you use "pc" or "q35" machine types, you should now what you are doing
for migrating machines.  We do this kind of change all the time).

>>      }
>>      qemu_put_be64(f, RAM_SAVE_FLAG_EOS);
>>      qemu_fflush(f);
>> @@ -3127,13 +3139,14 @@ static int ram_save_complete(QEMUFile *f, void
>> *opaque)
>>          return ret;
>>      }
>>  
>> -    if (migrate_multifd_sync_each_iteration()) {
>> -        ret = multifd_send_sync_main(rs->f);
>> -        if (ret < 0) {
>> -            return ret;
>> -        }
>> +    ret = multifd_send_sync_main(rs->f);
>> +    if (ret < 0) {
>> +        return ret;
>>      }
>
> (3) Same as (1)

Yeap, this is on purpose.  If you feel that it is clearer the other way,
I can change the patchset.

Later, Juan.




reply via email to

[Prev in Thread] Current Thread [Next in Thread]