Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates
From: Jinhao Fan
Subject: Re: [PATCH v4] hw/nvme: Use ioeventfd to handle doorbell updates
Date: Thu, 7 Jul 2022 16:50:39 +0800
at 1:51 PM, Klaus Jensen <its@irrelevant.dk> wrote:
> On Jul 6 19:34, Jinhao Fan wrote:
>> at 2:43 AM, Keith Busch <kbusch@kernel.org> wrote:
>>
>>> On Tue, Jul 05, 2022 at 07:11:36PM +0200, Klaus Jensen wrote:
>>>> On Jul 5 22:24, Jinhao Fan wrote:
>>>>> @@ -1374,7 +1374,14 @@ static void nvme_enqueue_req_completion(NvmeCQueue *cq, NvmeRequest *req)
>>>>>
>>>>>      QTAILQ_REMOVE(&req->sq->out_req_list, req, entry);
>>>>>      QTAILQ_INSERT_TAIL(&cq->req_list, req, entry);
>>>>> -    timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>>>>> +
>>>>> +    if (req->sq->ioeventfd_enabled) {
>>>>> +        /* Post CQE directly since we are in main loop thread */
>>>>> +        nvme_post_cqes(cq);
>>>>> +    } else {
>>>>> +        /* Schedule the timer to post CQE later since we are in vcpu thread */
>>>>> +        timer_mod(cq->timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + 500);
>>>>> +    }
>>>>
>>>> Actually, we are only in the vcpu thread if we come here from
>>>> nvme_process_db that in very rare circumstances may enqueue the
>>>> completion of an AER due to an invalid doorbell write.
>>>>
>>>> In general, nvme_enqueue_req_completion is only ever called from the
>>>> main iothread. Which actually causes me to wonder why we defer this work
>>>> in the first place. It does have the benefit that we queue up several
>>>> completions before posting them in one go and raising the interrupt.
>>>> But I wonder if that could be handled better.
>>>
>>> I think the timer is used because of the cq_full condition. We need to
>>> restart completions when it becomes not full, which requires a doorbell
>>> write. Having everyone from the main iothread use the same timer as the
>>> doorbell handler just ensures single threaded list access.
>>
>> Could we let nvme_process_aers register another timer/BH to trigger
>> nvme_enqueue_req_completion in the iothread? In this way we won’t need the
>> timer_mod in nvme_enqueue_req_completion.
>
> Yes, we could have process_aers in a timer. Which would probably be
> preferable in order to limit the amount of work the mmio handler is
> doing in that rare case. However, its such a rare case (only misbehaving
> drivers) that it's probably not worth optimizing for.
I think putting nvme_process_aers in a timer can help make sure
nvme_enqueue_req_completion is only called in iothread context. Currently it
can be called from either the iothread or a vcpu thread, so
nvme_enqueue_req_completion has to infer its context, in my patch via the
cq->ioeventfd_enabled flag. This approach is probably error-prone. Honestly,
I am not sure whether cq->ioeventfd_enabled is enough to guarantee we are in
the iothread.
>> We can also avoid some potential concurrency problems because the CQ is
>> only modified in the iothread.
>
> There are currently no concurrency problems because of the Big QEMU
> Lock. When the mmio handler is running, the vcpu holds the BQL (and
> whenever the main iothread is running, it is holding the BQL).
Will this still hold when we move on to iothreads?
>
>> BTW, are there any reason that we must use timers (not BH) here? Also why do
>> we choose to delay for 500ns?
>
> No particular reason. I do not see any reason why this could not be bottom
> halves. This will likely change into BHs when we add iothread support
> anyway.
I will try BH when I start the iothread work.