qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH] Fix Guest VM crash due to iSCSI Sense Key error


From: John Snow
Subject: Re: [Qemu-devel] [PATCH] Fix Guest VM crash due to iSCSI Sense Key error
Date: Fri, 26 Jul 2019 16:18:46 -0400
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

Paolo, Stefan and Kevin: can I loop you in here? I'm quite uncertain
about this and I'd like to clear this up quickly if it's possible:

On 7/25/19 8:58 PM, John Snow wrote:
> 
> 
> On 7/7/19 10:55 PM, address@hidden wrote:
>> From: Shaju Abraham <address@hidden>
>>
>> During the  IDE DMA transfer for a ISCSI target,when libiscsi encounters
>> a SENSE KEY error, it sets the task->sense to  the value "COMMAND ABORTED".
>> The function iscsi_translate_sense() later translaters this error to 
>> -ECANCELED
>> and this value is passed to the callback function. In the case of  IDE DMA 
>> read
>> or write, the callback function returns immediately if the value of the ret
>> argument is -ECANCELED.
>> Later when ide_cancel_dma_sync() function is invoked  the assertion
>> "s->bus->dma->aiocb == ((void *)0)" fails and the qemu process gets 
>> terminated.
>> Fix the issue by making the value of s->bus->dma->aiocb = NULL when
>> -ECANCELED is passed to the callback.
>>
>> Signed-off-by: Shaju Abraham <address@hidden>
>> ---
>>  hw/ide/core.c | 1 +
>>  1 file changed, 1 insertion(+)
>>
>> diff --git a/hw/ide/core.c b/hw/ide/core.c
>> index 6afadf8..78ea357 100644
>> --- a/hw/ide/core.c
>> +++ b/hw/ide/core.c
>> @@ -841,6 +841,7 @@ static void ide_dma_cb(void *opaque, int ret)
>>      bool stay_active = false;
>>  
>>      if (ret == -ECANCELED) {
>> +        s->bus->dma->aiocb = NULL;
>>          return;
>>      }
>>  
>>
> 
> The part that makes me nervous here is that I can't remember why we do
> NO cleanup whatsoever for the ECANCELED case.
> 
> commit 0d910cfeaf2076b116b4517166d5deb0fea76394
> Author: Fam Zheng <address@hidden>
> Date:   Thu Sep 11 13:41:07 2014 +0800
> 
>     ide/ahci: Check for -ECANCELED in aio callbacks
> 
> 
> ... This looks like we never expected the aio callbacks to ever get
> called with ECANCELED, so we treat this as a QEMU-internal signal.
> 
> It looks like we expect these callbacks to do NOTHING in this case; but
> I'm not sure where the IDE state machine does its cleanup otherwise.
> (The DMA might have been canceled, but the DMA and IDE state machines
> still need to exit their loop.)
> 
> If you take a look at this patch from 2014 though, there are many other
> spots where we have littered ECANCELED checks that might also cause
> problems if we're receiving error codes we thought we couldn't get normally.
> 
> I am worried this patch papers over something worse.
> 
I'm not clear why Fam's patch adds a do-nothing return to the ide_dma_cb
if it's invoked with ECANCELED: shouldn't it be the case that the IDE
state machine needs to know that a transfer it was relying on to service
an ATA command was canceled and treat it like an error?

Why was it ever correct to ignore these? Is it because we only ever
canceled DMA during reset/shutdown/etc?

It appears as if iscsi requests can actually genuinely return an
ECANCELED errno, so there are likely several places in the IDE code that
need to accommodate this from happening.

The easiest fix LOOKS like just deleting the special-casing of ECANCELED
altogether and letting the error pathways handle things as normal.

Am I mistaken?

--js



reply via email to

[Prev in Thread] Current Thread [Next in Thread]