[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PULL 52/53] vhost_net: add an assertion for TAP client backends
From: |
Michael S. Tsirkin |
Subject: |
Re: [PULL 52/53] vhost_net: add an assertion for TAP client backends |
Date: |
Wed, 28 Jun 2023 06:50:47 -0400 |
On Wed, Jun 28, 2023 at 04:03:44PM +0530, Ani Sinha wrote:
>
>
> > On 28-Jun-2023, at 1:00 PM, Cédric Le Goater <clg@redhat.com> wrote:
> >
> > On 6/28/23 08:45, Ani Sinha wrote:
> >>> On 28-Jun-2023, at 11:58 AM, Cédric Le Goater <clg@redhat.com> wrote:
> >>>
> >>> Hello,
> >>>
> >>> On 6/26/23 14:30, Michael S. Tsirkin wrote:
> >>>> From: Ani Sinha <anisinha@redhat.com>
> >>>> An assertion was missing for tap vhost backends that enforces a non-null
> >>>> reference from get_vhost_net(). Both vhost-net-user and vhost-net-vdpa
> >>>> enforces this. Enforce the same for tap. Unit tests pass with this
> >>>> change.
> >>>> Signed-off-by: Ani Sinha <anisinha@redhat.com>
> >>>> Message-Id: <20230619041501.111655-1-anisinha@redhat.com>
> >>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >>>> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
> >>>> Reviewed-by: Laurent Vivier <lvivier@redhat.com>
> >>>> ---
> >>>> hw/net/vhost_net.c | 1 +
> >>>> 1 file changed, 1 insertion(+)
> >>>> diff --git a/hw/net/vhost_net.c b/hw/net/vhost_net.c
> >>>> index c4eecc6f36..6db23ca323 100644
> >>>> --- a/hw/net/vhost_net.c
> >>>> +++ b/hw/net/vhost_net.c
> >>>> @@ -507,6 +507,7 @@ VHostNetState *get_vhost_net(NetClientState *nc)
> >>>> switch (nc->info->type) {
> >>>> case NET_CLIENT_DRIVER_TAP:
> >>>> vhost_net = tap_get_vhost_net(nc);
> >>>> + assert(vhost_net);
> >>>> break;
> >>>> #ifdef CONFIG_VHOST_NET_USER
> >>>> case NET_CLIENT_DRIVER_VHOST_USER:
> >>>
> >>> A system of mine without vhost_net (old host kernel) is reaching this
> >>> assert
> >> We need to understand why this assertion is being hit. It could be a bug
> >> somewhere else.
> >
> > sure.
> >
> >> What is the backtrace?
> >
> >
> > #0 __pthread_kill_implementation (threadid=549621125152,
> > signo=signo@entry=6, no_tid=no_tid@entry=0) at ./nptl/pthread_kill.c:44
> > #1 0x0000007ff71df254 in __pthread_kill_internal (signo=6,
> > threadid=<optimized out>) at ./nptl/pthread_kill.c:78
> > #2 0x0000007ff719a67c in __GI_raise (sig=sig@entry=6) at
> > ../sysdeps/posix/raise.c:26
> > #3 0x0000007ff7187130 in __GI_abort () at ./stdlib/abort.c:79
> > #4 0x0000007ff7193fd0 in __assert_fail_base
> > (fmt=0x7ff72ad3f8 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n",
> > assertion=assertion@entry=0x5556208c00 "vhost_net",
> > file=file@entry=0x5556208ae0 "../hw/net/vhost_net.c", line=line@entry=510,
> > function=function@entry=0x5556208de8 <__PRETTY_FUNCTION__.2>
> > "get_vhost_net") at ./assert/assert.c:92
> > #5 0x0000007ff7194040 in __GI___assert_fail
> > (assertion=assertion@entry=0x5556208c00 "vhost_net",
> > file=file@entry=0x5556208ae0 "../hw/net/vhost_net.c", line=line@entry=510,
> > function=function@entry=0x5556208de8 <__PRETTY_FUNCTION__.2>
> > "get_vhost_net") at ./assert/assert.c:101
> > #6 0x0000005555a767ac in get_vhost_net (nc=<optimized out>) at
> > ../hw/net/vhost_net.c:510
> > #7 0x0000005555e6aea8 in virtio_net_get_features (vdev=0x55578cc770,
> > features=1105849221095, errp=<optimized out>) at ../hw/net/virtio-net.c:807
> > #8 0x0000005555b69da4 in virtio_bus_device_plugged
> > (vdev=vdev@entry=0x55578cc770, errp=errp@entry=0x7fffffe640) at
> > ../hw/virtio/virtio-bus.c:66
> > #9 0x0000005555e8b6a0 in virtio_device_realize (dev=0x55578cc770,
> > errp=0x7fffffe6c0) at ../hw/virtio/virtio.c:3609
> > #10 0x0000005555f1ead8 in device_set_realized (obj=0x55578cc770,
> > value=<optimized out>, errp=0x7fffffe898) at ../hw/core/qdev.c:510
> > #11 0x0000005555f22a14 in property_set_bool (obj=0x55578cc770, v=<optimized
> > out>, name=<optimized out>, opaque=0x5556be9ad0, errp=0x7fffffe898) at
> > ../qom/object.c:2285
> > #12 0x0000005555f260fc in object_property_set (obj=obj@entry=0x55578cc770,
> > name=name@entry=0x555621c280 "realized", v=v@entry=0x55578d9310,
> > errp=errp@entry=0x7fffffe898)
> > at ../qom/object.c:1420
> > #13 0x0000005555f2a85c in object_property_set_qobject
> > (obj=obj@entry=0x55578cc770, name=name@entry=0x555621c280 "realized",
> > value=value@entry=0x55578d9250, errp=errp@entry=0x7fffffe898) at
> > ../qom/qom-qobject.c:28
> > #14 0x0000005555f26820 in object_property_set_bool (obj=0x55578cc770,
> > name=0x555621c280 "realized", value=<optimized out>, errp=0x7fffffe898) at
> > ../qom/object.c:1489
> > #15 0x0000005555aafd24 in pci_qdev_realize (qdev=<optimized out>,
> > errp=<optimized out>) at ../hw/pci/pci.c:2116
> > #16 0x0000005555f1ead8 in device_set_realized (obj=0x55578c43a0,
> > value=<optimized out>, errp=0x7fffffeae8) at ../hw/core/qdev.c:510
> > #17 0x0000005555f22a14 in property_set_bool (obj=0x55578c43a0, v=<optimized
> > out>, name=<optimized out>, opaque=0x5556be9ad0, errp=0x7fffffeae8) at
> > ../qom/object.c:2285
> > #18 0x0000005555f260fc in object_property_set (obj=obj@entry=0x55578c43a0,
> > name=name@entry=0x555621c280 "realized", v=v@entry=0x55578d34c0,
> > errp=errp@entry=0x7fffffeae8)
> > at ../qom/object.c:1420
> > #19 0x0000005555f2a85c in object_property_set_qobject
> > (obj=obj@entry=0x55578c43a0, name=name@entry=0x555621c280 "realized",
> > value=value@entry=0x55578d3450, errp=errp@entry=0x7fffffeae8) at
> > ../qom/qom-qobject.c:28
> > #20 0x0000005555f26820 in object_property_set_bool (obj=0x55578c43a0,
> > name=0x555621c280 "realized", value=<optimized out>, errp=0x7fffffeae8) at
> > ../qom/object.c:1489
> > #21 0x0000005555ba4f7c in qdev_device_add_from_qdict
> > (opts=opts@entry=0x55578c3110, from_json=from_json@entry=false,
> > errp=0x7fffffeae8, errp@entry=0x5556b24a28 <error_fatal>)
> > at ../softmmu/qdev-monitor.c:714
> > #22 0x0000005555ba5170 in qdev_device_add (opts=0x5556be6060,
> > errp=errp@entry=0x5556b24a28 <error_fatal>) at ../softmmu/qdev-monitor.c:733
> > #23 0x0000005555baa09c in device_init_func (opaque=<optimized out>,
> > opts=<optimized out>, errp=0x5556b24a28 <error_fatal>) at
> > ../softmmu/vl.c:1152
> > #24 0x00000055560b2124 in qemu_opts_foreach
> > (list=<optimized out>, func=func@entry=0x5555baa080 <device_init_func>,
> > opaque=opaque@entry=0x0, errp=errp@entry=0x5556b24a28 <error_fatal>) at
> > ../util/qemu-option.c:1135
> > #25 0x0000005555bac88c in qemu_create_cli_devices () at ../softmmu/vl.c:2573
> > #26 qmp_x_exit_preconfig (errp=<optimized out>) at ../softmmu/vl.c:2641
> > #27 0x0000005555bafee4 in qmp_x_exit_preconfig (errp=<optimized out>) at
> > ../softmmu/vl.c:2635
> > #28 qemu_init (argc=<optimized out>, argv=<optimized out>) at
> > ../softmmu/vl.c:3648
> > #29 0x00000055558e839c in main (argc=<optimized out>, argv=<optimized out>)
> > at ../softmmu/main.c:47
> >
> >> What is the repro case?
> >
>
> I have been able to repro this even on x86, using the following command line:
>
> ./qemu-system-x86_64 \
> -accel kvm \
> -machine pc-q35-8.0 \
> -m 8192 \
> -nodefaults \
> -boot strict=on \
> -nographic \
> -device virtio-net,netdev=net0,mac=C0:FF:EE:00:00:12,bus=pcie.0,addr=0x3 \
> -netdev
> tap,id=net0,helper=/usr/libexec/qemu-bridge-helper,br=virbr0,vhost=off \
> -monitor stdio \
> -qmp tcp:0:5555,server,nowait \
> -vnc :0
>
>
> > Nothing special :
> >
> > qemu-system-aarch64 -M virt -accel kvm -cpu host -nographic -m 2G \
> > -drive if=pflash,format=raw,file=./EFI/efi.img,readonly=on \
> > -drive if=pflash,format=raw,file=fedora-varstore.img \
> > -device virtio-net,netdev=net0,mac=C0:FF:EE:00:00:12,bus=pcie.0,addr=0x3 \
> > -netdev
> > tap,id=net0,helper=/usr/lib/qemu/qemu-bridge-helper,br=virbr0,vhost=off \
>
> ^^^^^^^^^^^
> This is what is causing the crash. With vhost=off or without that option (and
> no vhostfd provided or forced) , within net_init_tap_one(), tap->vhost is
> False (when vhost=off) or tap->has_vhost is False (when no vhost option is
> provided). Thus vhost_net_init() will never be called since the following
> conditional evaluates to false :
>
> if (tap->has_vhost ? tap->vhost :
> vhostfdname || (tap->has_vhostforce && tap->vhostforce)) {
>
> and s->vhost_net pointer will remain NULL.
>
> The crash does not happen when passing vhost=on in the commandline.
>
> Its sad we did not have a test for it and did not catch this scenario.
>
> I will replace the assert() with a comment as to why the assert() is absent
> in tap network case.
>
> > -drive file=fedora-arm64.qcow2,if=none,id=disk,format=qcow2,cache=none \
> > -device virtio-blk-device,drive=disk \
> > -serial mon:stdio
> > qemu-system-aarch64: ../hw/net/vhost_net.c:510: get_vhost_net: Assertion
> > `vhost_net' failed.
>
Thanks! Pls post quickly and I'll do a pull with just this fixup, so
people don't suffer from the regression.
--
MST
- [PULL 51/53] intel_iommu: Fix address space unmap, (continued)
- [PULL 51/53] intel_iommu: Fix address space unmap, Michael S. Tsirkin, 2023/06/26
- [PULL 34/53] vdpa: do not block migration if device has cvq and x-svq=on, Michael S. Tsirkin, 2023/06/26
- [PULL 39/53] hw/net/virtio-net: make some VirtIONet const, Michael S. Tsirkin, 2023/06/26
- [PULL 40/53] virtio-net: expose virtio_net_supported_guest_offloads(), Michael S. Tsirkin, 2023/06/26
- [PULL 46/53] hw/i386/pc: Clean up pc_machine_initfn, Michael S. Tsirkin, 2023/06/26
- [PULL 52/53] vhost_net: add an assertion for TAP client backends, Michael S. Tsirkin, 2023/06/26
[PULL 49/53] intel_iommu: Fix a potential issue in VFIO dirty page sync, Michael S. Tsirkin, 2023/06/26
[PULL 22/53] hw/virtio: Build various target-agnostic objects just once, Michael S. Tsirkin, 2023/06/26
[PULL 32/53] tests/data/acpi: update after SMBIOS 2.0 change, Michael S. Tsirkin, 2023/06/26
[PULL 35/53] vdpa: reorder vhost_vdpa_net_cvq_cmd_page_len function, Michael S. Tsirkin, 2023/06/26
[PULL 36/53] vdpa: map shadow vrings with MAP_SHARED, Michael S. Tsirkin, 2023/06/26
[PULL 43/53] vhost: fix vhost_dev_enable_notifiers() error case, Michael S. Tsirkin, 2023/06/26
[PULL 45/53] vdpa: fix not using CVQ buffer in case of error, Michael S. Tsirkin, 2023/06/26
[PULL 44/53] vdpa: mask _F_CTRL_GUEST_OFFLOADS for vhost vdpa devices, Michael S. Tsirkin, 2023/06/26
[PULL 47/53] virtio-scsi: avoid dangling host notifier in ->ioeventfd_stop(), Michael S. Tsirkin, 2023/06/26
[PULL 50/53] intel_iommu: Fix flag check in replay, Michael S. Tsirkin, 2023/06/26