qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 21/52] migration/rdma: Fix QEMUFileHooks method return values


From: Zhijian Li (Fujitsu)
Subject: Re: [PATCH 21/52] migration/rdma: Fix QEMUFileHooks method return values
Date: Mon, 25 Sep 2023 07:03:30 +0000
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0


On 25/09/2023 14:36, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes:
> 
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> The QEMUFileHooks methods don't come with a written contract.  Digging
>>> through the code calling them, we find:
>>>
>>> * save_page():
>>
>> I'm fine with this
>>
>>>
>>>     Negative values RAM_SAVE_CONTROL_DELAYED and
>>>     RAM_SAVE_CONTROL_NOT_SUPP are special.  Any other negative value is
>>>     an unspecified error.
>>>
>>>     qemu_rdma_save_page() returns -EIO or rdma->error_state on error.  I
>>>     believe the latter is always negative.  Nothing stops either of them
>>>     to clash with the special values, though.  Feels unlikely, but fix
>>>     it anyway to return only the special values and -1.
>>>
>>> * before_ram_iterate(), before_ram_iterate():
>>
>> error code returned by before_ram_iterate() and before_ram_iterate() will be
>> assigned to QEMUFile for upper layer.
>> But it seems that no callers take care about the error ?  Shouldn't let the 
>> callers
>> to check the error instead ?
> 
> The error values returned by qemu_rdma_registration_start() and
> qemu_rdma_registration_stop() carry no additional information a caller
> could check.


I think qemu_file_get_error(f) can be used for callers to check the error code.



> 
> Both return either -EIO or rdma->error_state on error.  The latter is
> *not* a negative errno code.  Evidence: qio_channel_rdma_shutdown()
> assigns -1, qemu_rdma_block_for_wrid() assigns the error value of
> qemu_rdma_poll(), which can be the error value of ibv_poll_cq(), which
> is an unspecified negative value, ...
> 
you are right.


> I decided not to investigate what qemu-file.c does with the error values
> after one quick glance at the code.  It's confusing, and quite possibly
> confused.  But I'm already at 50+ patches, and am neither inclined nor
> able to take on more migration cleanup at this time.

Yeah, it's already big enough patch set.

very thanks

Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>


> 
>>>     Negative value means error.  qemu_rdma_registration_start() and
>>>     qemu_rdma_registration_stop() comply as far as I can tell.  Make
>>>     them comply *obviously*, by returning -1 on error.
>>>
>>> * hook_ram_load:
>>>
>>>     Negative value means error.  rdma_load_hook() already returns -1 on
>>>     error.  Leave it alone.
>>
>> see inline reply below,
>>
>>>
>>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>>> ---
>>>    migration/rdma.c | 79 +++++++++++++++++++++++-------------------------
>>>    1 file changed, 37 insertions(+), 42 deletions(-)
>>>
>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>> index cc59155a50..46b5859268 100644
>>> --- a/migration/rdma.c
>>> +++ b/migration/rdma.c
>>> @@ -3219,12 +3219,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f,
>>>        rdma = qatomic_rcu_read(&rioc->rdmaout);
>>>    
>>>        if (!rdma) {
>>> -        return -EIO;
>>> +        return -1;
>>>        }
>>>    
>>> -    ret = check_error_state(rdma);
>>> -    if (ret) {
>>> -        return ret;
>>> +    if (check_error_state(rdma)) {
>>> +        return -1;
>>>        }
>>>    
>>>        qemu_fflush(f);
>>> @@ -3290,9 +3289,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f,
>>>        }
>>>    
>>>        return RAM_SAVE_CONTROL_DELAYED;
>>> +
>>>    err:
>>>        rdma->error_state = ret;
>>> -    return ret;
>>> +    return -1;
>>>    }
>>>    
>>>    static void rdma_accept_incoming_migration(void *opaque);
>>> @@ -3538,12 +3538,11 @@ static int qemu_rdma_registration_handle(QEMUFile 
>>> *f)
>>>        rdma = qatomic_rcu_read(&rioc->rdmain);
>>>    
>>>        if (!rdma) {
>>> -        return -EIO;
>>> +        return -1;
>>
>> that's because EIO is not accurate here ?
> 
> It's because the function does not return a negative errno code, and
> returning -EIO is misleading readers into assuming it does
> 
>>>        }
>>>    
>>> -    ret = check_error_state(rdma);
>>> -    if (ret) {
>>> -        return ret;
>>
>> Ditto
> 
> Likewise.
> 
> [...]
> 

reply via email to

[Prev in Thread] Current Thread [Next in Thread]