

From: Xiongfeng Wang
Subject: Re: [Qemu-devel] CC wangxiongfeng. : RE: [PATCH] pcie: fix device hotplug failure at the meantime of VM boot
Date: Thu, 25 Jul 2019 15:43:50 +0800
User-agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0


On 2019/7/25 11:19, Zhangbo (Oscar) wrote:
>>> If the PCI_EXP_LNKSTA_DLLLA capability is set by default, the Linux kernel
>>> will send a PDC event to detect whether there is a device in the PCIe slot.
>>> If a device is plugged into the pcie-root-port at the same time, the
>>> hot-plug will send ABP + PDC events to the kernel. The VM kernel will
>>> wrongly unplug the device if the two PDC events get too close. Thus we'd
>>> better set the PCI_EXP_LNKSTA_DLLLA capability only in the hotplug
>>> callback.
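
For context, a minimal sketch of what "set the bit only in the hotplug callback" amounts to, assuming it goes in the hotplug callback (pcie_cap_slot_plug_cb(), mentioned further down). This is illustrative only, not the actual QEMU implementation: the helper name is made up, while exp.exp_cap, pci_word_test_and_set_mask(), PCI_EXP_LNKSTA and QEMU_PCIE_LNKSTA_DLLLA are the identifiers used in the patch itself:

    /* Illustrative sketch, not the actual QEMU code: mark the Data Link
     * Layer Link Active bit only once a device has really been
     * hot-plugged, instead of hardwiring it when the capability is
     * initialized at boot.  Builds only inside the QEMU tree (roughly,
     * it needs "hw/pci/pci.h"). */
    static void sketch_set_link_active(PCIDevice *hotplug_pdev)
    {
        uint8_t *exp_cap = hotplug_pdev->config + hotplug_pdev->exp.exp_cap;

        if (hotplug_pdev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
            pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA,
                                       PCI_EXP_LNKSTA_DLLLA);
        }
    }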
>>
>> Could you please describe a reproducer in a bit more detail?
>>
> Step1: start a VM with qemu; the VM boots up within 500ms.
>   /path/to/qemu-2.8.1/aarch64-softmmu/qemu-system-aarch64 \
>   -name test-c65961652639ccf9ce0b8476a325421811d4fdc873e90c27168497bc9e204776 \
>   -uuid a8ed4a86-3f49-45a3-a8ce-28d61b2f2914 \
>   -machine virt,usb=off,accel=kvm,gic-version=3 \
>   -cpu host \
>   -m 2048M,slots=10,maxmem=239477M \
>   -qmp unix:/var/run/qmp.sock,server,nowait \
>   -device pcie-root-port,port=0x8,chassis=1,id=pci.1,bus=pcie.0,multifunction=on,addr=0x1 \
>   -device pcie-root-port,port=0x9,chassis=2,id=pci.2,bus=pcie.0,addr=0x1.0x1 \
>   -device pcie-root-port,port=0xa,chassis=3,id=pci.3,bus=pcie.0,addr=0x1.0x2 \
>   -device pcie-root-port,port=0xb,chassis=4,id=pci.4,bus=pcie.0,addr=0x1.0x3 \
>   -device pcie-root-port,port=0xc,chassis=5,id=pci.5,bus=pcie.0,addr=0x1.0x4 \
>   -device pcie-root-port,port=0xd,chassis=6,id=pci.6,bus=pcie.0,addr=0x1.0x5 \
>   -device pcie-root-port,port=0xe,chassis=7,id=pci.7,bus=pcie.0,addr=0x1.0x6 \
>   -device pcie-pci-bridge,id=pci.8,bus=pci.1,addr=0x0 \
>   -device pcie-root-port,port=0xf,chassis=9,id=pci.9,bus=pcie.0,addr=0x1.0x7 \
>   .......
>   
> Step2: Immediately hotplug a pcie device:
>   qmp_msg='{ "execute": "qmp_capabilities" }
> {"arguments":{"addr":"0x0","bus":"pci.4","driver":"virtio-net-pci","id":"virtio-e1356802-4b9f-44bb-b8f0-5f98bf765823","mac":"02:42:20:6e:a2:59"},"execute":"device_del"}
> {"arguments":{"id":"netport_test_1","ifname":"nfs_tap"},"execute":"netdev_del"}'
> 
>   echo $qmp_msg | nc -U /var/run/qmp.sock
> 
> Result expected: hotplug succeeds and the PCIe device can be seen inside
> the VM.
> 
> Result in fact: we saw a "hotplug" followed by an "unplug" message inside
> the VM; the hotplug failed.
> 

I think it's because the device is plugged into the system before the pciehp 
module is loaded.
When the pciehp module is loaded, it first checks whether a device is present, 
in pciehp_probe()->pciehp_check_presence().
In our situation the link is active, so the device is enumerated and added to 
the system, but the slot may not be powered on yet.
After the pciehp interrupt is initialized, an ABP event is received and the 
device is unplugged.
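
To make the ordering concrete, here is a condensed sketch of the occupancy test pciehp relies on. It is illustrative only, not the kernel's actual pciehp_card_present_or_link_active() implementation (its comment is quoted further down); the bit macros come from linux/pci_regs.h and slot_looks_occupied() is a made-up name:

    #include <stdbool.h>
    #include <stdint.h>
    #include <linux/pci_regs.h>  /* PCI_EXP_SLTSTA_PDS, PCI_EXP_LNKSTA_DLLLA */

    /* A slot counts as occupied if either Presence Detect State or Data
     * Link Layer Link Active is set.  With DLLLA hardwired at boot, this
     * already returns true at pciehp probe time, before the slot is
     * powered on, so the ABP/PDC events that arrive shortly afterwards
     * end up being handled as a removal. */
    static bool slot_looks_occupied(uint16_t sltsta, uint16_t lnksta)
    {
        return (sltsta & PCI_EXP_SLTSTA_PDS) ||
               (lnksta & PCI_EXP_LNKSTA_DLLLA);
    }

    int main(void)
    {
        /* Only Link Active set (the boot-time default under discussion):
         * the slot already looks occupied even though PDS is clear. */
        return slot_looks_occupied(0, PCI_EXP_LNKSTA_DLLLA) ? 0 : 1;
    }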

>>
>>>
>>> By the way, we should clean up the PCI_EXP_LNKSTA_DLLLA capability during
>>> unplug to avoid VM restart or migration failure which will enter the same
>>> abnormal scenario as above.
>>>
>>> Signed-off-by: address@hidden
>>> Signed-off-by: address@hidden
>>> Signed-off-by: address@hidden
>>
>> So looking at linux I see:
>>
>> * pciehp_card_present_or_link_active() - whether given slot is occupied
>> * @ctrl: PCIe hotplug controller
>> *
>> * Unlike pciehp_card_present(), which determines presence solely from the
>> * Presence Detect State bit, this helper also returns true if the Link Active
>> * bit is set.  This is a concession to broken hotplug ports which hardwire
>> * Presence Detect State to zero, such as Wilocity's [1ae9:0200].
>>
>> so it looks like linux actually looks at presence detect state,
>> but we have a bug just like Wilocity's and keeping
>> link active up fixes that. Can't we fix the bug instead?
>>
> @wangxiongfeng 
>>> ---
>>>  hw/pci/pcie.c | 9 +++++----
>>>  1 file changed, 5 insertions(+), 4 deletions(-)
>>>
>>> diff --git a/hw/pci/pcie.c b/hw/pci/pcie.c
>>> index a6beb56..174f392 100644
>>> --- a/hw/pci/pcie.c
>>> +++ b/hw/pci/pcie.c
>>> @@ -75,10 +75,6 @@ pcie_cap_v1_fill(PCIDevice *dev, uint8_t port, uint8_t type, uint8_t version)
>>>                  QEMU_PCI_EXP_LNKSTA_NLW(QEMU_PCI_EXP_LNK_X1) |
>>>                  QEMU_PCI_EXP_LNKSTA_CLS(QEMU_PCI_EXP_LNK_2_5GT));
>>>
>>> -    if (dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
>>> -        pci_word_test_and_set_mask(exp_cap + PCI_EXP_LNKSTA,
>>> -                                   PCI_EXP_LNKSTA_DLLLA);
>>> -    }
>>>
>>>      /* We changed link status bits over time, and changing them across
>>>       * migrations is generally fine as hardware changes them too.
>>
>> Does this actually change anything?
>>
>> I don't know why do we bother setting it here but we do
>> set it later in pcie_cap_slot_plug_cb, correct?
>>
>> I'd like to understand whether this is part of fix or
>> just a cleanup.
>>
>>
>>> @@ -484,6 +480,11 @@ void pcie_cap_slot_unplug_request_cb(HotplugHandler *hotplug_dev,
>>>          return;
>>>      }
>>>
>>> +    if (pci_dev->cap_present & QEMU_PCIE_LNKSTA_DLLLA) {
>>> +        pci_word_test_and_clear_mask(exp_cap + PCI_EXP_LNKSTA,
>>> +                                     PCI_EXP_LNKSTA_DLLLA);
>>> +    }
>>> +
>>>      pcie_cap_slot_push_attention_button(PCI_DEVICE(hotplug_dev));
>>>  }
>>
>> So this reports data link inactive immediately after
>> unplug request. Seems a bit questionable: guest did not
>> respond yet. I'd like to see a comment explaining why
>> this makes sense.
>>
>>
>>> --
>>> 1.8.3.1



