qemu-devel

Re: [PATCH v2 12/19] vhost: add vhost_svq_poll


From: Jason Wang
Subject: Re: [PATCH v2 12/19] vhost: add vhost_svq_poll
Date: Tue, 19 Jul 2022 15:38:32 +0800
User-agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.15; rv:91.0) Gecko/20100101 Thunderbird/91.11.0


On 2022/7/16 01:05, Eugenio Perez Martin wrote:
On Fri, Jul 15, 2022 at 10:48 AM Jason Wang <jasowang@redhat.com> wrote:
On Fri, Jul 15, 2022 at 1:39 PM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
On Fri, Jul 15, 2022 at 5:59 AM Jason Wang <jasowang@redhat.com> wrote:
On Fri, Jul 15, 2022 at 12:32 AM Eugenio Pérez <eperezma@redhat.com> wrote:
It allows the Shadow Control VirtQueue to wait for the device to use the
available buffers.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
---
  hw/virtio/vhost-shadow-virtqueue.h |  1 +
  hw/virtio/vhost-shadow-virtqueue.c | 22 ++++++++++++++++++++++
  2 files changed, 23 insertions(+)

diff --git a/hw/virtio/vhost-shadow-virtqueue.h b/hw/virtio/vhost-shadow-virtqueue.h
index 1692541cbb..b5c6e3b3b4 100644
--- a/hw/virtio/vhost-shadow-virtqueue.h
+++ b/hw/virtio/vhost-shadow-virtqueue.h
@@ -89,6 +89,7 @@ void vhost_svq_push_elem(VhostShadowVirtqueue *svq, const SVQElement *elem,
  int vhost_svq_add(VhostShadowVirtqueue *svq, const struct iovec *out_sg,
                    size_t out_num, const struct iovec *in_sg, size_t in_num,
                    SVQElement *elem);
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq);

  void vhost_svq_set_svq_kick_fd(VhostShadowVirtqueue *svq, int svq_kick_fd);
  void vhost_svq_set_svq_call_fd(VhostShadowVirtqueue *svq, int call_fd);
diff --git a/hw/virtio/vhost-shadow-virtqueue.c b/hw/virtio/vhost-shadow-virtqueue.c
index 5244896358..31a267f721 100644
--- a/hw/virtio/vhost-shadow-virtqueue.c
+++ b/hw/virtio/vhost-shadow-virtqueue.c
@@ -486,6 +486,28 @@ static void vhost_svq_flush(VhostShadowVirtqueue *svq,
      } while (!vhost_svq_enable_notification(svq));
  }

+/**
+ * Poll the SVQ for one device used buffer.
+ *
+ * This function races with the main event loop SVQ polling, so extra
+ * synchronization is needed.
+ *
+ * Return the length written by the device.
+ */
+size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
+{
+    do {
+        uint32_t len;
+        SVQElement *elem = vhost_svq_get_buf(svq, &len);
+        if (elem) {
+            return len;
+        }
+
+        /* Make sure we read new used_idx */
+        smp_rmb();
There's already one smp_rmb() in vhost_svq_get_buf(). So this seems useless?

That rmb comes after checking for new entries with (vq->last_used_idx !=
svq->shadow_used_idx), to avoid reordering the used_idx read with the
actual used entry. So my understanding is that the compiler is free to
skip that check within the while loop.
What do you mean by "that check" here?

The first check of (presumably cached) last_used_idx !=
shadow_used_idx. Right before the SVQ's vring fetch of
svq->vring.used->idx.

Maybe the right solution is to add it in vhost_svq_more_used after the
condition (vq->last_used_idx != svq->shadow_used_idx) is false?
I'm not sure I get the goal of the smp_rmb() here. What barrier does it pair with?

It pairs with the actual device memory write.

Note that I'm worried about compiler optimizations or reordering
causing an infinite loop, not that the memory is updated properly.

Let's say we just returned from vhost_svq_add, and avail_idx -
used_idx > 0. The device has not yet updated the SVQ vring used idx, and
QEMU is fast enough to complete a full call of vhost_svq_get_buf
before the device is able to increment the used index. We can trace
the full vhost_svq_get_buf without hitting a memory barrier.

If the compiler inlines enough and we delete the new smp_rmb barrier,
this is what it sees:

size_t vhost_svq_poll(VhostShadowVirtqueue *svq)
{
    do {
        more_used = false;
        // The next conditional returns false
        if (svq->last_used_idx != svq->shadow_used_idx) {
            goto useful;
        }

        svq->shadow_used_idx = cpu_to_le16(svq->vring.used->idx);

        // next conditional returns false too
        if (!(svq->last_used_idx != svq->shadow_used_idx)) {
            continue;
        }

useful:
        // actual code to handle the new used buffer
        break;
    } while (true);
}

And QEMU does not need to read any of those variables again, since
nothing modifies them within the loop before the "useful" tag
(svq->vring.used->idx, svq->last_used_idx, svq->shadow_used_idx). So
the compiler could freely rewrite this as:

size_t vhost_svq_poll(VhostShadowVirtqueue *svq) {
     if (svq->last_used_idx == svq->shadow_used_idx &&
         svq->last_used_idx == svq->vring.used->idx) {
             for (;;);
     }
}

That's why I think the right place for the mb is right after the caller
sees (potentially cached) last_used_idx == shadow_used_idx, and it
needs to read a value paired with the "device's mb" in the SVQ vring.


I think you need "volatile" instead of the memory barriers. If I understand correctly, you want the load from memory instead of a register here.

Thanks



We didn't have that problem before, since we clear the event_notifier
right before the do{}while(), and the event loop should hit a memory
barrier in the next select / poll / read / whatever syscall when it
checks that the event notifier fd is set again.

Since we are in the busy loop, we will read the new used_idx for
sure,
I'm not so sure of that, but maybe I've missed something.

I'm sending v3 with this comment pending, so we can iterate faster.

Thanks!

and we can't forecast when the used_idx is committed to memory.




