From: Laszlo Ersek
Subject: [Qemu-devel] how Windows treats BARs of driver-less devices when other devices are hotplugged
Date: Thu, 25 Feb 2016 13:44:54 +0100
User-agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
Hi,
On 02/25/16 12:57, Michael S. Tsirkin wrote:
> ----- Forwarded message from Igor Mammedov <address@hidden> -----
>
> Date: Thu, 11 Feb 2016 16:16:05 +0100
> From: Igor Mammedov <address@hidden>
> To: "Michael S. Tsirkin" <address@hidden>
> To: address@hidden
> Subject: on pci rebalancing
> Message-ID: <address@hidden>
> In-Reply-To: <address@hidden>
>
>>>>> For PCI rebalance to work on Windows, one has to provide working PCI
>>>>> driver
>>>>> otherwise OS will ignore it when rebalancing happens and
>>>>> might map something else over ignored BAR.
>>>>
>>>> Does it disable the BAR then? Or just move it elsewhere?
>>> it doesn't, it just blindly ignores BARs existence and maps BAR of
>>> another device with driver over it.
>>
>> Interesting. On classical PCI this is a forbidden configuration.
>> Maybe we do something that confuses windows?
>> Could you tell me how to reproduce this behaviour?
> #cat > t << EOF
> pci_update_mappings_del
> pci_update_mappings_add
> EOF
>
> #./x86_64-softmmu/qemu-system-x86_64 -snapshot -enable-kvm -snapshot \
> -monitor unix:/tmp/m,server,nowait -device pci-bridge,chassis_nr=1 \
> -boot menu=on -m 4G -trace events=t ws2012r2x64dc.img \
> -device ivshmem,id=foo,size=2M,shm,bus=pci.1,addr=01
>
> wait till OS boots, note BARs programmed for ivshmem
> in my case it was
> 01:01.0 0,0xfe800000+0x100
> then execute script and watch pci_update_mappings* trace events
>
> # for i in $(seq 3 18); do printf -- "device_add e1000,bus=pci.1,addr=%x\n" $i | nc -U /tmp/m; sleep 5; done;
>
> hotplugging e1000,bus=pci.1,addr=12 triggers rebalancing where
> Windows unmaps all BARs of nics on bridge but doesn't touch ivshmem
> and then programs new BARs, where:
> pci_update_mappings_add d=0x7fa02ff0cf90 01:11.0 0,0xfe800000+0x20000
> creates overlapping BAR with ivshmem
Michael informed me of this on IRC (and forwarded this email to me). I hope to
start a new thread with my response. (I also rewrote the subject line completely.)
So, to summarize what I said on IRC first. The situation where firmware
recognizes and enables a PCI device, hands control to the OS, and then the OS
lacks a driver for the PCI device, is completely normal and expected. For UEFI
specifically, I can name a general argument and a specific argument.
The general argument is that actions that need to be taken in
ExitBootServices() callbacks do not include clearing IO or MMIO decode bits in
PCI device command registers. Command register manipulation happens when a PCI
device driver (that conforms to the UEFI driver model) *binds* or *unbinds* a
device. And unbinding a device is not possible in the ExitBootServices()
callback, minimally because such callbacks are forbidden from modifying the
memory map -- but unbinding would release allocated memory.
So what we use such callbacks for is aborting in-flight, outstanding DMA-like
transfers. Re-setting virtio devices is also an example (think outstanding
receive requests for virtio-net).
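For reference, the decode bits mentioned in the general argument live in the
PCI command register (config space offset 0x04): bit 0 enables I/O space
decoding, bit 1 enables memory (MMIO) space decoding, and bit 2 enables bus
mastering. Below is a minimal sketch that interprets a command register value.
The helper name decode_pci_command and the sample value 0x0103 are made up for
illustration; the actual value would have to be read by other means.

```shell
# Hypothetical helper: report the decode-enable bits of a PCI command
# register value (config space offset 0x04).
decode_pci_command() {
    local cmd=$(( $1 ))
    local io=$(( cmd & 1 ))          # bit 0: I/O space decoding
    local mem=$(( (cmd >> 1) & 1 ))  # bit 1: memory (MMIO) space decoding
    local bm=$(( (cmd >> 2) & 1 ))   # bit 2: bus master (DMA) enable
    printf 'io=%d mem=%d busmaster=%d\n' "$io" "$mem" "$bm"
}

# Sample value only: I/O and memory decoding on, bus mastering off.
decode_pci_command 0x0103
```

A device that the firmware enabled and never unbound would be expected to show
io=1 and/or mem=1 here even after ExitBootServices().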
Now let's move on to the specific argument I mentioned above. The Graphics
Output Protocol (GOP) is a UEFI abstraction that was specifically designed for
the case where the operating system doesn't have a display driver installed
yet, but the user obviously has to use the display somehow. The GOP is
most frequently provided on top of an EFI_PCI_IO_PROTOCOL instance; meaning
simply that the "GOP driver" is a UEFI driver that drives a PCI device. In
short, the driver provides the GOP on top of a PCI device.
Now, the GOP is supposed to communicate the pixel format and the frame buffer
base address for the currently active graphics mode to the software that
consumes the GOP. This includes UEFI applications of course (think a boot
loader putting up a splash screen or an animation), but importantly, the
runtime OS is *also* supposed to inherit these characteristics from boot
services time. The OS can then use simple unaccelerated MMIO writes to display
things on the screen, until the user installs an accelerated driver.
(Concrete example: this is why you can see *anything at all* on the screen,
when you run e.g. Windows Server 2012 R2 on top of OVMF and a QXL display,
before installing the QXL WDDM driver in the guest.)
Clearly, the frame buffer base address communicated through the GOP points into
one of the MMIO BARs of the PCI device. If, at ExitBootServices(), MMIO
decoding were disabled for the PCI device that underlies the GOP, that would
*completely* defeat the GOP design. The OS's attempt to poke at those MMIO
addresses would be futile -- and in fact the OS has no idea what PCI device (if
any) the framebuffer is supposed to be related to. This is the jurisdiction of
the OS-level display driver -- if one exists and is installed.
So, this is a Windows bug in my opinion. Just because there is no OS-level
driver, a PCI device is fully expected to be decoding resources, if the
firmware brought it up.
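To make the conflict from the quoted trace concrete: the ivshmem BAR at
0xfe800000+0x100 and the BAR later programmed for the NIC at 01:11.0,
0xfe800000+0x20000, cover intersecting address ranges. The following is only a
sketch of that interval check; the helper name bars_overlap is hypothetical,
and the numbers are the ones from the quoted report.

```shell
# Hypothetical helper: succeed (exit status 0) if the ranges
# [base1, base1+size1) and [base2, base2+size2) intersect.
bars_overlap() {
    local b1=$(( $1 )) s1=$(( $2 )) b2=$(( $3 )) s2=$(( $4 ))
    [ "$b1" -lt $(( b2 + s2 )) ] && [ "$b2" -lt $(( b1 + s1 )) ]
}

# ivshmem BAR 0 versus the BAR 0 programmed for the hotplugged NIC
if bars_overlap 0xfe800000 0x100 0xfe800000 0x20000; then
    echo "overlap"
else
    echo "no overlap"
fi
```

The same check could be run over all pci_update_mappings_add events in a trace
to spot any other pair of BARs that collide after a rebalance.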
--*--
Okay, so Michael asked me to try to reproduce the above with OVMF, and see what
happens. Unfortunately I'm not really knowledgeable about ivshmem, hotplug, et
cetera. Let me instead tell Igor about using OVMF.
(1) Please follow the instructions on Gerd's page
<https://www.kraxel.org/repos/>, and install the "edk2.git-ovmf-x64" package.
(2) Create a separate directory for testing. In this directory, run the
following command:
cp /usr/share/edk2.git/ovmf-x64/OVMF_VARS-pure-efi.fd myvars.fd
Also create a disk image for your new guest, etc.
(3) Use the following command line snippet to work with OVMF:
qemu-system-x86_64 \
-machine accel=kvm \
-smp cpus=2 \
-m 2048 \
\
-debugcon file:ovmf.debug.log \
-global isa-debugcon.iobase=0x402 \
\
-device qxl-vga \
\
-drive if=pflash,format=raw,unit=0,readonly,file=/usr/share/edk2.git/ovmf-x64/OVMF_CODE-pure-efi.fd \
-drive if=pflash,format=raw,unit=1,file=myvars.fd \
\
[your options here]
You can of course customize the # of VCPUs, memory size, disks, CD-ROMs,
network, and so on.
Recommended: when you use the -device option to add the disk and the CD-ROM(s)
to install the OS (and driver(s)) from, be sure to use the "bootindex"
property. OVMF will adhere to the boot order. It is recommended to set
bootindex=0 for your main disk, bootindex=1 for your OS installer CD-ROM, and
*no* bootindex for your virtio-win driver disk. This way at first boot (with no
OS installed) OVMF will boot the installer CD-ROM. Further boots (with the same
command line) will boot the installed OS.
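As an illustration of the bootindex recommendation, the options below could
stand in for "[your options here]". This is only a sketch: the file names
guest.img, installer.iso and virtio-win.iso, the drive IDs, and the choice of
virtio-blk-pci and ide-cd as device models are assumptions, not something the
setup above requires.

```shell
# Main disk: booted first once an OS is installed (bootindex=0).
-drive if=none,id=disk0,format=raw,file=guest.img \
-device virtio-blk-pci,drive=disk0,bootindex=0 \
\
# OS installer CD-ROM: booted while the disk is still empty (bootindex=1).
-drive if=none,id=cd0,media=cdrom,format=raw,readonly,file=installer.iso \
-device ide-cd,drive=cd0,bootindex=1 \
\
# virtio-win driver CD-ROM: deliberately *no* bootindex.
-drive if=none,id=cd1,media=cdrom,format=raw,readonly,file=virtio-win.iso \
-device ide-cd,drive=cd1
```

The comment lines are there for explanation only and must be dropped when the
options are pasted into a single continued command line.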
Caveat: I never used the -snapshot option with OVMF virtual machines; it might
or might not work.
Caveat #2: I tested simple PCI hotplug and hot-unplug with Windows running
on OVMF many months ago, but I can't tell off-hand if it will work right now.
Thanks
Laszlo