From: David Hildenbrand
Subject: Re: [PATCH][RESEND v5 3/3] Add a Hyper-V Dynamic Memory Protocol driver (hv-balloon)
Date: Fri, 23 Jun 2023 14:23:58 +0200
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.12.0

On 22.06.23 20:45, Maciej S. Szmigiero wrote:
On 22.06.2023 14:52, David Hildenbrand wrote:
On 22.06.23 14:14, Maciej S. Szmigiero wrote:
On 22.06.2023 14:06, David Hildenbrand wrote:
On 22.06.23 13:17, Maciej S. Szmigiero wrote:
On 22.06.2023 13:15, David Hildenbrand wrote:
On 22.06.23 13:12, Maciej S. Szmigiero wrote:
On 22.06.2023 13:01, David Hildenbrand wrote:
[...]

We'd use a memory region container as device memory region (like [1]) and would
have to handle the !memdev case (I can help with that). Into that, you can
map the RAM memory region on demand (and eventually even using multiple slots like
[1]).

(2) Use a single virtual DIMM and (un)plug that on demand. Let the machine code 
handle (un)plugging of the device.


(1) feels cleanest to me, although it will require a bit more work.


I also think approach (1) makes more sense as it avoids memslot metadata
overhead for not-yet-hot-added parts of the memory backing device.

Not sure what you mean about the !memdev case being problematic here - it is
working in the current driver shape, so why would adding potential memory
subregions (used in the memdev case) change that?

I'm thinking about the case where you have a hv-balloon device without a memdev.

Without -m X,maxmem=y we don't currently expect to have memory devices around
(and especially not to have them get (un)plugged). But why should we "force"
the user to set the "maxmem" option?

I guess it's only a small change to QEMU to allow having hv-balloon
device (without a memdev) even in the case where there's no "maxmem"
option given on the QEMU command line.


I hope I'll find some time soonish to prototype what I have in mind, to see
if it could be made working.


Okay, so I'll wait for your prototype before commencing further work on
the next version of this driver.

I'm about to have something simplistic running -- I think. I want to test with a
Linux VM, but I don't seem to get it working (even without my changes).


#!/bin/bash

build/qemu-system-x86_64 \
         --enable-kvm \
         -m 4G,maxmem=36G \
         -cpu host,hv-syndbg=on,hv-synic,hv-relaxed,hv-vpindex \
         -smp 16 \
         -nographic \
         -nodefaults \
         -net nic -net user \
         -chardev stdio,nosignal,id=serial \
         -hda Fedora-Cloud-Base-37-1.7.x86_64.qcow2 \
         -cdrom /home/dhildenb/git/cloud-init/cloud-init.iso \
         -device isa-serial,chardev=serial \
         -chardev socket,id=monitor,path=/var/tmp/mon_src,server,nowait \
         -mon chardev=monitor,mode=readline \
         -device vmbus-bridge \
         -object memory-backend-ram,size=2G,id=mem0 \
         -device hv-balloon,id=hv1,memdev=mem0



[root@vm-0 ~]# uname -r
6.3.5-100.fc37.x86_64
[root@vm-0 ~]# modprobe hv_balloon
modprobe: ERROR: could not insert 'hv_balloon': No such device


Any magic flag I am missing? Or is there something preventing this from working
with Linux VMs?


I haven't tested the driver with Linux guests in a long time (as it is
targeting Windows), but I think you need to disable the KVM PV interface for
the Hyper-V one to be detected by Linux.

Something like adding "kvm=off" to "-cpu", then checking in dmesg whether
the detected hypervisor is now Hyper-V.

Also, you need to disable S4 in the guest for the hot-add capability to work
(I'm adding "-global ICH9-LPC.disable_s4=1" with the q35 machine for this).

Would also suggest adding "--trace 'hv_balloon_*' --trace 'memory_device_*'"
to the QEMU command line to see what's happening.
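
Putting those three suggestions together, the Linux-guest invocation above could look roughly like the sketch below (illustrative only, not a command line taken from this thread; it assumes switching to the q35 machine so that the ICH9-LPC property exists, and keeps the original disk, serial and monitor options unchanged):

#!/bin/bash
# Sketch: the earlier test script with kvm=off (hide the KVM PV interface),
# q35 + disable_s4 (needed for hot-add) and the suggested trace points added.
build/qemu-system-x86_64 \
         --enable-kvm \
         -machine q35 \
         -global ICH9-LPC.disable_s4=1 \
         -m 4G,maxmem=36G \
         -cpu host,kvm=off,hv-syndbg=on,hv-synic,hv-relaxed,hv-vpindex \
         -smp 16 \
         -nographic \
         -nodefaults \
         -net nic -net user \
         -chardev stdio,nosignal,id=serial \
         -hda Fedora-Cloud-Base-37-1.7.x86_64.qcow2 \
         -cdrom /home/dhildenb/git/cloud-init/cloud-init.iso \
         -device isa-serial,chardev=serial \
         -chardev socket,id=monitor,path=/var/tmp/mon_src,server,nowait \
         -mon chardev=monitor,mode=readline \
         -device vmbus-bridge \
         -object memory-backend-ram,size=2G,id=mem0 \
         -device hv-balloon,id=hv1,memdev=mem0 \
         --trace 'hv_balloon_*' --trace 'memory_device_*'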

VM is not happy:

[    1.908595] BUG: kernel NULL pointer dereference, address: 0000000000000007
[    1.908837] #PF: supervisor read access in kernel mode
[    1.908837] #PF: error_code(0x0000) - not-present page
[    1.908837] PGD 0 P4D 0
[    1.908837] Oops: 0000 [#1] PREEMPT SMP NOPTI
[    1.908837] CPU: 13 PID: 492 Comm: (udev-worker) Not tainted 
6.3.5-100.fc37.x86_64 #1
[    1.908837] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 
rel-1.16.2-0-gea1b7a073390-p4
[    1.908837] RIP: 0010:acpi_ns_lookup+0x8f/0x4c0
[    1.908837] Code: 8b 3d f5 eb 1c 03 83 05 52 ec 1c 03 01 48 85 ff 0f 84 51 
03 00 00 44 89 c3 4c 89 cb
[    1.908837] RSP: 0018:ffff95b680ad7950 EFLAGS: 00010286
[    1.908837] RAX: ffff95b680ad79e0 RBX: 0000000000000002 RCX: 0000000000000003
[    1.908837] RDX: 0000000000000000 RSI: ffff8a0283a3c558 RDI: ffffffffa4b376e0
[    1.908837] RBP: 0000000000000000 R08: 0000000000000002 R09: 0000000000000000
[    1.908837] R10: ffff8a02811034ec R11: 0000000000000000 R12: ffffffffffffffff
[    1.908837] R13: ffff8a02811034e8 R14: ffff8a02811034e8 R15: 0000000000000000
[    1.908837] FS:  00007f3bb2e7d0c0(0000) GS:ffff8a02bbd40000(0000) 
knlGS:0000000000000000
[    1.908837] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    1.908837] CR2: 0000000000000007 CR3: 0000000100a58002 CR4: 0000000000770ee0
[    1.908837] PKRU: 55555554
[    1.908837] Call Trace:
[    1.908837]  <TASK>
[    1.908837]  ? __die+0x23/0x70
[    1.908837]  ? page_fault_oops+0x171/0x4e0
[    1.908837]  ? prepare_alloc_pages.constprop.0+0xf6/0x1a0
[    1.908837]  ? exc_page_fault+0x74/0x170
[    1.908837]  ? asm_exc_page_fault+0x26/0x30
[    1.908837]  ? acpi_ns_lookup+0x8f/0x4c0
[    1.908837]  acpi_ns_get_node_unlocked+0xdd/0x110
[    1.908837]  ? down_timeout+0x3e/0x60
[    1.908837]  ? acpi_ns_get_node+0x3e/0x60
[    1.908837]  acpi_ns_get_node+0x3e/0x60
[    1.908837]  acpi_ns_evaluate+0x1cb/0x2d0
[    1.908837]  acpi_ut_evaluate_object+0x68/0x1c0
[    1.908837]  acpi_rs_get_method_data+0x37/0x80
[    1.908837]  ? __pfx_vmbus_walk_resources+0x10/0x10 [hv_vmbus]
[    1.908837]  acpi_walk_resources+0x91/0xe0
[    1.908837]  vmbus_acpi_add+0x87/0x170 [hv_vmbus]
[    1.908837]  acpi_device_probe+0x47/0x160
[    1.908837]  really_probe+0x19f/0x400
[    1.908837]  ? __pfx___driver_attach+0x10/0x10
[    1.908837]  __driver_probe_device+0x78/0x160
[    1.908837]  driver_probe_device+0x1f/0x90
[    1.908837]  __driver_attach+0xd2/0x1c0
[    1.908837]  bus_for_each_dev+0x85/0xd0
[    1.908837]  bus_add_driver+0x116/0x220
[    1.908837]  driver_register+0x59/0x100
[    1.908837]  ? __pfx_init_module+0x10/0x10 [hv_vmbus]
[    1.908837]  hv_acpi_init+0x39/0xff0 [hv_vmbus]
[    1.908837]  ? __pfx_init_module+0x10/0x10 [hv_vmbus]
[    1.908837]  do_one_initcall+0x5a/0x240
[    1.908837]  do_init_module+0x4a/0x210
[    1.908837]  __do_sys_init_module+0x17f/0x1b0
[    1.908837]  do_syscall_64+0x5c/0x90
[    1.908837]  ? handle_mm_fault+0x11e/0x310
[    1.908837]  ? do_user_addr_fault+0x1e0/0x720
[    1.908837]  ? exc_page_fault+0x74/0x170
[    1.908837]  entry_SYSCALL_64_after_hwframe+0x72/0xdc


I guess *few* people run Linux with QEMU's Hyper-V interface implementation...
Maybe I'll have to install a Windows guest :/

I think that makes more sense, since we're targeting Windows anyway.


Having installed fairly recent Win10 and running with master+your patches, I 
still can't get it to work.

Windows is stuck booting (before the little circle starts turning).

Removing the hv-balloon device makes it work again (well, at least the circle
spins again; my Windows installation now seems to be broken and I have to
reinstall ... Windows).

Do you have a working cmdline for Windows I can try?


Do you get any tracing output from the driver when it is loaded or does its 
mere presence
make your Windows installation fail to boot?

I was testing using "qemu-system-x86_64 -machine q35,accel=kvm \
             -cpu host,vmx,-mpx,-pmu,check,hv-crash,hv-relaxed,hv-time,hv-vapic,hv-spinlocks=0x1fff,hv-

^ That magic collection of flags seems to make Win10 happy.

I will try to determine the minimum set of Hyper-V flags required
for Windows guests since having to manually specify the "right"
set seems like a poor UX (and also try to determine why Linux
guest crashes in ACPI when booted with above flags).

But I think both of these problems are related more to VMBus support
itself rather than hv-balloon.

That's also my feeling. It would be great to extend

https://www.qemu.org/docs/master/system/i386/hyperv.html

and include some of the dirty details in your next submission, so it's easier to get running.
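
For reference, a sketch of a Windows-guest command line assembled from the flags quoted in this thread (hv-synic and hv-vpindex are carried over from the Linux invocation above, since vmbus-bridge appears to need a usable SynIC; the disk image path and sizes are placeholders, and the minimal enlightenment set is, as said, still to be determined):

# Sketch only: q35 + S4 disabled for hot-add, the Win10 "magic" enlightenments,
# a 2G hot-addable backend and tracing to watch the DM protocol negotiation.
qemu-system-x86_64 \
         -machine q35,accel=kvm \
         -global ICH9-LPC.disable_s4=1 \
         -m 4G,maxmem=36G \
         -smp 4 \
         -cpu host,hv-crash,hv-relaxed,hv-time,hv-vapic,hv-spinlocks=0x1fff,hv-vpindex,hv-synic \
         -device vmbus-bridge \
         -object memory-backend-ram,size=2G,id=mem0 \
         -device hv-balloon,id=hv1,memdev=mem0 \
         -hda win10.qcow2 \
         --trace 'hv_balloon_*' --trace 'memory_device_*'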

[...]

Maybe the VM needs some time after start-up to actually
process requests.

Yes, there's a waiting period immediately after guest boot before any
protocol operations are executed in the guest:
/*
 * Some Windows versions (at least Server 2019) will crash with various
 * error codes when receiving DM protocol requests (at least
 * DM_MEM_HOT_ADD_REQUEST) immediately after boot.
 *
 * It looks like Hyper-V from Server 2016 uses a 50-second after-boot
 * delay, probably to workaround this issue, so we'll use this value, too.
 */
#define HV_BALLOON_POST_INIT_WAIT (50 * 1000)


That's why I suggested running with tracing on, since then
you can observe the device waiting in the "POST_INIT_WAIT" state
(and also how things are proceeding in general).
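
As an aside, assuming the memdev-backed hv-balloon shows up in QEMU's memory device list, the monitor socket from the earlier script can also be used to poll the hot-add progress while the trace output scrolls by:

# connect to the HMP monitor socket defined in the test script above
socat - UNIX-CONNECT:/var/tmp/mon_src
(qemu) info memory-devices
(qemu) info memdev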

Ah, makes sense.

--
Cheers,

David / dhildenb



