[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device s
From: |
Ilya Maximets |
Subject: |
Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop |
Date: |
Thu, 14 Dec 2017 17:31:20 +0300 |
User-agent: |
Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.3.0 |
One update for the testing scenario:
No need to kill OVS. The issue reproducible with simple 'del-port'
and 'add-port'. virtio driver in guest could crash on both operations.
Most times it crashes in my case on 'add-port' after deletion.
Hi Maxime,
I already saw below patches and original linux kernel virtio issue.
Just had no enough time to test them.
Now I tested below patches and they fixes virtio driver crash.
Thanks for suggestion.
Michael,
I tested "[PATCH] virtio_error: don't invoke status callbacks "
and it fixes the QEMU crash in case of broken guest index.
Thanks.
Best regards, Ilya Maximets.
P.S. Previously I mentioned that I can not reproduce virtio driver
crash with "[PATCH] virtio_error: don't invoke status callbacks"
applied. I was wrong. I can reproduce now. System was misconfigured.
Sorry.
On 14.12.2017 12:01, Maxime Coquelin wrote:
> Hi Ilya,
>
> On 12/14/2017 08:06 AM, Ilya Maximets wrote:
>> On 13.12.2017 22:48, Michael S. Tsirkin wrote:
>>> On Wed, Dec 13, 2017 at 04:45:20PM +0300, Ilya Maximets wrote:
>>>>>> That
>>>>>> looks very strange. Some of the functions gets 'old_status', others
>>>>>> the 'new_status'. I'm a bit confused.
>>>>>
>>>>> OK, fair enough. Fixed - let's pass old status everywhere,
>>>>> users that need the new one can get it from the vdev.
>>>>>
>>>>>> And it's not functional in current state:
>>>>>>
>>>>>> hw/net/virtio-net.c:264:28: error: ‘status’ undeclared
>>>>>
>>>>> Fixed too. new version below.
>>>>
>>>> This doesn't fix the segmentation fault.
>>>
>>> Hmm you are right. Looking into it.
>>>
>>>> I have exactly same crash stacktrace:
>>>>
>>>> #0 vhost_memory_unmap hw/virtio/vhost.c:446
>>>> #1 vhost_virtqueue_stop hw/virtio/vhost.c:1155
>>>> #2 vhost_dev_stop hw/virtio/vhost.c:1594
>>>> #3 vhost_net_stop_one hw/net/vhost_net.c:289
>>>> #4 vhost_net_stop hw/net/vhost_net.c:368
>>>> #5 virtio_net_vhost_status (old_status=15 '\017', n=0x5625f3901100) at
>>>> hw/net/virtio-net.c:180
>>>> #6 virtio_net_set_status (vdev=0x5625f3901100, old_status=<optimized
>>>> out>) at hw/net/virtio-net.c:254
>>>> #7 virtio_set_status (address@hidden, val=<optimized out>) at
>>>> hw/virtio/virtio.c:1152
>>>> #8 virtio_error (vdev=0x5625f3901100, address@hidden "Guest says index %u
>>>> is available") at hw/virtio/virtio.c:2460
>>>
>>> BTW what is causing this? Why is guest avail index corrupted?
>>
>> My testing environment for the issue:
>>
>> * QEMU 2.10.1
>
> Could you try to backport below patch and try again killing OVS?
>
> commit 2ae39a113af311cb56a0c35b7f212dafcef15303
> Author: Maxime Coquelin <address@hidden>
> Date: Thu Nov 16 19:48:35 2017 +0100
>
> vhost: restore avail index from vring used index on disconnection
>
> vhost_virtqueue_stop() gets avail index value from the backend,
> except if the backend is not responding.
>
> It happens when the backend crashes, and in this case, internal
> state of the virtio queue is inconsistent, making packets
> to corrupt the vring state.
>
> With a Linux guest, it results in following error message on
> backend reconnection:
>
> [ 22.444905] virtio_net virtio0: output.0:id 0 is not a head!
> [ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
> [ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
>
> Fixes: 283e2c2adcb8 ("net: virtio-net discards TX data after link down")
> Cc: address@hidden
> Signed-off-by: Maxime Coquelin <address@hidden>
> Reviewed-by: Michael S. Tsirkin <address@hidden>
> Signed-off-by: Michael S. Tsirkin <address@hidden>
>
> commit 2d4ba6cc741df15df6fbb4feaa706a02e103083a
> Author: Maxime Coquelin <address@hidden>
> Date: Thu Nov 16 19:48:34 2017 +0100
>
> virtio: Add queue interface to restore avail index from vring used index
>
> In case of backend crash, it is not possible to restore internal
> avail index from the backend value as vhost_get_vring_base
> callback fails.
>
> This patch provides a new interface to restore internal avail index
> from the vring used index, as done by some vhost-user backend on
> reconnection.
>
> Signed-off-by: Maxime Coquelin <address@hidden>
> Reviewed-by: Michael S. Tsirkin <address@hidden>
> Signed-off-by: Michael S. Tsirkin <address@hidden>
>
>
> Cheers,
> Maxime
>
>
>
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, (continued)
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Michael S. Tsirkin, 2017/12/06
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Ilya Maximets, 2017/12/07
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Michael S. Tsirkin, 2017/12/07
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Michael S. Tsirkin, 2017/12/07
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Ilya Maximets, 2017/12/08
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Michael S. Tsirkin, 2017/12/10
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Ilya Maximets, 2017/12/13
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Michael S. Tsirkin, 2017/12/13
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Ilya Maximets, 2017/12/14
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Maxime Coquelin, 2017/12/14
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop,
Ilya Maximets <=
- Re: [Qemu-devel] [PATCH] vhost: fix crash on virtio_error while device stop, Ilya Maximets, 2017/12/14