qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH] vfio/migration: Fix return value of vfio_migration_realize()


From: Duan, Zhenzhong
Subject: RE: [PATCH] vfio/migration: Fix return value of vfio_migration_realize()
Date: Fri, 16 Jun 2023 09:53:38 +0000


>-----Original Message-----
>From: Joao Martins <joao.m.martins@oracle.com>
>Sent: Friday, June 16, 2023 5:06 PM
>To: Duan, Zhenzhong <zhenzhong.duan@intel.com>
>Cc: alex.williamson@redhat.com; clg@redhat.com; qemu-devel@nongnu.org;
>Peng, Chao P <chao.p.peng@intel.com>
>Subject: Re: [PATCH] vfio/migration: Fix return value of 
>vfio_migration_realize()
>
>On 16/06/2023 03:42, Duan, Zhenzhong wrote:
>>> -----Original Message-----
>>> From: Joao Martins <joao.m.martins@oracle.com>
>>> Sent: Thursday, June 15, 2023 6:23 PM
>>> To: Duan, Zhenzhong <zhenzhong.duan@intel.com>
>>> Cc: alex.williamson@redhat.com; clg@redhat.com;
>>> qemu-devel@nongnu.org; Peng, Chao P <chao.p.peng@intel.com>
>>> Subject: Re: [PATCH] vfio/migration: Fix return value of
>>> vfio_migration_realize()
>>>
>>> On 15/06/2023 10:19, Duan, Zhenzhong wrote:
>>>>> -----Original Message-----
>>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>> Sent: Thursday, June 15, 2023 4:54 PM
>>>>> To: Duan, Zhenzhong <zhenzhong.duan@intel.com>
>>>>> Cc: alex.williamson@redhat.com; clg@redhat.com;
>>>>> qemu-devel@nongnu.org; Peng, Chao P <chao.p.peng@intel.com>
>>>>> Subject: Re: [PATCH] vfio/migration: Fix return value of
>>>>> vfio_migration_realize()
>>>>>
>>>>>
>>>>>
>>>>> On 15/06/2023 09:18, Zhenzhong Duan wrote:
>>>>>> We should print "Migration disabled" when migration is blocked in
>>>>>> vfio_migration_realize().
>>>>>>
>>>>>> Fix it by reverting return value of migrate_add_blocker(),
>>>>>> meanwhile error out directly once migrate_add_blocker() failed.
>>>>>>
>>>>>
>>>>> It wasn't immediately obvious from commit message that this is
>>>>> mainly about just printing an error message when blockers get added
>>>>> and that well
>>>>> migrate_add_blocker() returns 0 on success and caller of
>>>>> vfio_migration_realize expects the opposite when blockers are added.
>>>>>
>>>>> Perhaps better wording would be:
>>>>>
>>>>> migrate_add_blocker() returns 0 when successfully adding the
>>>>> migration blocker. However, the caller of vfio_migration_realize()
>>>>> considers that migration was blocked when the latter returned an
>>>>> error. Fix it by negating the return value obtained by
>>>>> migrate_add_blocker(). What matters for migration is that the
>>>>> blocker is added in core migration, so this cleans up usability
>>>>> such that user sees "Migrate disabled" when any of the vfio
>>>>> migration blockers are
>>> active.
>>>>
>>>> Great, I'll use your words.
>>>>
>>>>>
>>>>>> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
>>>>>> ---
>>>>>>  hw/vfio/common.c    | 4 ++--
>>>>>>  hw/vfio/migration.c | 6 +++---
>>>>>>  2 files changed, 5 insertions(+), 5 deletions(-)
>>>>>>
>>>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c index
>>>>>> fa8fd949b1cf..8505385798f3 100644
>>>>>> --- a/hw/vfio/common.c
>>>>>> +++ b/hw/vfio/common.c
>>>>>> @@ -399,7 +399,7 @@ int
>>>>>> vfio_block_multiple_devices_migration(Error
>>>>> **errp)
>>>>>>          multiple_devices_migration_blocker = NULL;
>>>>>>      }
>>>>>>
>>>>>> -    return ret;
>>>>>> +    return !ret;
>>>>>>  }
>>>>>>
>>>>>>  void vfio_unblock_multiple_devices_migration(void)
>>>>>> @@ -444,7 +444,7 @@ int vfio_block_giommu_migration(Error **errp)
>>>>>>          giommu_migration_blocker = NULL;
>>>>>>      }
>>>>>>
>>>>>> -    return ret;
>>>>>> +    return !ret;
>>>>>>  }
>>>>>>
>>>>>>  void vfio_migration_finalize(void) diff --git
>>>>>> a/hw/vfio/migration.c b/hw/vfio/migration.c index
>>>>>> 6b58dddb8859..0146521d129a 100644
>>>>>> --- a/hw/vfio/migration.c
>>>>>> +++ b/hw/vfio/migration.c
>>>>>> @@ -646,12 +646,12 @@ int vfio_migration_realize(VFIODevice
>>>>>> *vbasedev,
>>>>> Error **errp)
>>>>>>      }
>>>>>>
>>>>>>      ret = vfio_block_multiple_devices_migration(errp);
>>>>>> -    if (ret) {
>>>>>> +    if (ret || (errp && *errp)) {
>>>>>
>>>>> Why do you need this extra clause?
>>>>
>>>> Now that error happens, no need to add other blockers which will
>>>> fail for same reason.
>>>>
>>>
>>> But you don't need the (errp && *errp) for that as that's the whole
>>> point of negating the result.
>>>
>>> And if there's an error set it means migrate_add_blocker failed to
>>> add the blocker (e.g. snapshotting IIUC), and you would be failing here
>unnecessarily?
>>
>> If there is an error qdev_device_add() will fail, continue execution is
>meaningless here?
>> There is ERRP_GUARD in this path, so it looks (*errp) is enough.
>>
>> If I removed (errp && *errp) to continue, need below change to bypass
>> trace_vfio_migration_probe Do you prefer this way?
>>
>>     if (!*errp) {
>>         trace_vfio_migration_probe(vbasedev->name);
>>     }
>>
>
>I am mainly questioning that the error testing is correct to test here.
>
>IIUC, the only one that can propagate any *new* error in
>vfio_migration_realize is the calls to migrate_add_blocker failing within the
>vfio_block* (migration code suggests that this happens on snapshotting).
>Failing to add migration blocker just means we haven't installed any blockers.
>And the current code presents that as a "Migration disabled" case. If we want
>to preserve that behaviour on migration_add_blocker() failures (which seems
>like that's what you are doing here) then instead of this:

Current behavior(without my patch):
"Migration disabled" isn't printed if migrate_add_blocker succeed.
"Migration disabled" is printed if migrate_add_blocker fail.

I think this behavior isn't correct, I want to revert it not preserve it, so I 
used !ret.
Imagine we hotplug a vfio device when snapshotting, migrate_add_blocker failure
will lead to hotplug fail, then the guest is still migratable as no vfio 
plugged.
But we see "Migration disabled" which will confuse us.

>
>       return !ret;
>
>you would do this:
>
>       ret = migration_add_blocker(...);
>       if (ret < 0) {
>               error_free(...);
>               blocker = NULL;
>+              return ret;
>       }
>
>+      return 1;
>
>Or something like that. And then if you return early as you intended?

Yes, this change make sense, I also want to add below:

     if (!pdev->failover_pair_id) {
         ret = vfio_migration_realize(vbasedev, errp);
-        if (ret) {
+        if (ret > 0) {
             error_report("%s: Migration disabled", vbasedev->name);
         }

Thanks
Zhenzhong

reply via email to

[Prev in Thread] Current Thread [Next in Thread]