[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 21/52] migration/rdma: Fix QEMUFileHooks method return values
From: |
Zhijian Li (Fujitsu) |
Subject: |
Re: [PATCH 21/52] migration/rdma: Fix QEMUFileHooks method return values |
Date: |
Mon, 25 Sep 2023 07:03:30 +0000 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.11.0 |
On 25/09/2023 14:36, Markus Armbruster wrote:
> "Zhijian Li (Fujitsu)" <lizhijian@fujitsu.com> writes:
>
>> On 18/09/2023 22:41, Markus Armbruster wrote:
>>> The QEMUFileHooks methods don't come with a written contract. Digging
>>> through the code calling them, we find:
>>>
>>> * save_page():
>>
>> I'm fine with this
>>
>>>
>>> Negative values RAM_SAVE_CONTROL_DELAYED and
>>> RAM_SAVE_CONTROL_NOT_SUPP are special. Any other negative value is
>>> an unspecified error.
>>>
>>> qemu_rdma_save_page() returns -EIO or rdma->error_state on error. I
>>> believe the latter is always negative. Nothing stops either of them
>>> to clash with the special values, though. Feels unlikely, but fix
>>> it anyway to return only the special values and -1.
>>>
>>> * before_ram_iterate(), before_ram_iterate():
>>
>> error code returned by before_ram_iterate() and before_ram_iterate() will be
>> assigned to QEMUFile for upper layer.
>> But it seems that no callers take care about the error ? Shouldn't let the
>> callers
>> to check the error instead ?
>
> The error values returned by qemu_rdma_registration_start() and
> qemu_rdma_registration_stop() carry no additional information a caller
> could check.
I think qemu_file_get_error(f) can be used for callers to check the error code.
>
> Both return either -EIO or rdma->error_state on error. The latter is
> *not* a negative errno code. Evidence: qio_channel_rdma_shutdown()
> assigns -1, qemu_rdma_block_for_wrid() assigns the error value of
> qemu_rdma_poll(), which can be the error value of ibv_poll_cq(), which
> is an unspecified negative value, ...
>
you are right.
> I decided not to investigate what qemu-file.c does with the error values
> after one quick glance at the code. It's confusing, and quite possibly
> confused. But I'm already at 50+ patches, and am neither inclined nor
> able to take on more migration cleanup at this time.
Yeah, it's already big enough patch set.
very thanks
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
>
>>> Negative value means error. qemu_rdma_registration_start() and
>>> qemu_rdma_registration_stop() comply as far as I can tell. Make
>>> them comply *obviously*, by returning -1 on error.
>>>
>>> * hook_ram_load:
>>>
>>> Negative value means error. rdma_load_hook() already returns -1 on
>>> error. Leave it alone.
>>
>> see inline reply below,
>>
>>>
>>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>>> ---
>>> migration/rdma.c | 79 +++++++++++++++++++++++-------------------------
>>> 1 file changed, 37 insertions(+), 42 deletions(-)
>>>
>>> diff --git a/migration/rdma.c b/migration/rdma.c
>>> index cc59155a50..46b5859268 100644
>>> --- a/migration/rdma.c
>>> +++ b/migration/rdma.c
>>> @@ -3219,12 +3219,11 @@ static size_t qemu_rdma_save_page(QEMUFile *f,
>>> rdma = qatomic_rcu_read(&rioc->rdmaout);
>>>
>>> if (!rdma) {
>>> - return -EIO;
>>> + return -1;
>>> }
>>>
>>> - ret = check_error_state(rdma);
>>> - if (ret) {
>>> - return ret;
>>> + if (check_error_state(rdma)) {
>>> + return -1;
>>> }
>>>
>>> qemu_fflush(f);
>>> @@ -3290,9 +3289,10 @@ static size_t qemu_rdma_save_page(QEMUFile *f,
>>> }
>>>
>>> return RAM_SAVE_CONTROL_DELAYED;
>>> +
>>> err:
>>> rdma->error_state = ret;
>>> - return ret;
>>> + return -1;
>>> }
>>>
>>> static void rdma_accept_incoming_migration(void *opaque);
>>> @@ -3538,12 +3538,11 @@ static int qemu_rdma_registration_handle(QEMUFile
>>> *f)
>>> rdma = qatomic_rcu_read(&rioc->rdmain);
>>>
>>> if (!rdma) {
>>> - return -EIO;
>>> + return -1;
>>
>> that's because EIO is not accurate here ?
>
> It's because the function does not return a negative errno code, and
> returning -EIO is misleading readers into assuming it does
>
>>> }
>>>
>>> - ret = check_error_state(rdma);
>>> - if (ret) {
>>> - return ret;
>>
>> Ditto
>
> Likewise.
>
> [...]
>
- Re: [PATCH 41/52] migration/rdma: Convert qemu_rdma_post_send_control() to Error, (continued)
- [PATCH 16/52] migration/rdma: Fix io_writev(), io_readv() methods to obey contract, Markus Armbruster, 2023/09/18
- [PATCH 22/52] migration/rdma: Fix rdma_getaddrinfo() error checking, Markus Armbruster, 2023/09/18
- [PATCH 45/52] migration/rdma: Silence qemu_rdma_connect(), Markus Armbruster, 2023/09/18
- [PATCH 48/52] migration/rdma: Silence qemu_rdma_block_for_wrid(), Markus Armbruster, 2023/09/18
- [PATCH 49/52] migration/rdma: Silence qemu_rdma_register_and_get_keys(), Markus Armbruster, 2023/09/18