qemu-devel

Re: [RFC PATCH v4 20/20] vdpa: Add custom IOTLB translations to SVQ


From: Eugenio Perez Martin
Subject: Re: [RFC PATCH v4 20/20] vdpa: Add custom IOTLB translations to SVQ
Date: Fri, 15 Oct 2021 11:14:24 +0200

On Fri, Oct 15, 2021 at 10:20 AM Eugenio Perez Martin
<eperezma@redhat.com> wrote:
>
> On Fri, Oct 15, 2021 at 9:37 AM Jason Wang <jasowang@redhat.com> wrote:
> >
> > On Fri, Oct 15, 2021 at 3:28 PM Eugenio Perez Martin
> > <eperezma@redhat.com> wrote:
> > >
> > > On Wed, Oct 13, 2021 at 7:34 AM Jason Wang <jasowang@redhat.com> wrote:
> > > >
> > > >
> > > > On 2021/10/1 3:06 PM, Eugenio Pérez wrote:
> > > > > Use translations added in VhostIOVATree in SVQ.
> > > > >
> > > > > Now every element needs to store the previous address also, so
> > > > > VirtQueue can consume the elements properly. This adds a little
> > > > > overhead per VQ element, having to allocate more memory to stash
> > > > > them. As a possible optimization, this allocation could be avoided
> > > > > if the descriptor is not a chain but a single one, but this is
> > > > > left undone.
> > > > >
> > > > > TODO: the iova range should be queried beforehand, and logic added
> > > > > to fail when a GPA is outside of that range and the memory listener
> > > > > or SVQ adds it.
> > > > >
> > > > > Signed-off-by: Eugenio Pérez<eperezma@redhat.com>
> > > > > ---
> > > > >   hw/virtio/vhost-shadow-virtqueue.h |   4 +-
> > > > >   hw/virtio/vhost-shadow-virtqueue.c | 130 ++++++++++++++++++++++++-----
> > > > >   hw/virtio/vhost-vdpa.c             |  40 ++++++++-
> > > > >   hw/virtio/trace-events             |   1 +
> > > > >   4 files changed, 152 insertions(+), 23 deletions(-)
> > > >
> > > >
> > > > Thinking hard about the whole logic, this is safe since the qemu
> > > > memory map will fail if the guest submits an invalid IOVA.
> > > >
> > >
> > > Can you expand on this? What you mean is that VirtQueue already
> > > protects the SVQ code if the guest sets an invalid buffer address
> > > (GPA), isn't it?
> >
> > Yes.
> >
> > >
> > > > Then I wonder if we could do something much simpler:
> > > >
> > > > 1) Use qemu VA as IOVA, but only map the VA that belongs to the guest
> > > > 2) Then we don't need any IOVA tree here; what we need is to just map
> > > > the vring and use qemu VA without any translation
> > > >
> > >
> > > That would be great, but in my test either qemu's SVQ vring or the
> > > guest's translated buffer addresses (in qemu VA form) were already at
> > > high addresses, outside of the device's iova range.
> >
> > You're right. I missed that, and that's why we need e.g. the iova tree
> > and allocator.
> >
> > What I proposed only makes sense when shared virtual memory (SVA) is
> > implemented. In the case of SVA, the valid iova range should be the
> > full VA range.
> >
> > >
> > > I didn't try remapping tricks to make them fit in the range, but I
> > > think that complicates the solution rather quickly if there was
> > > already memory in that range owned by qemu before enabling SVQ:
> > >
> > > * Guest memory must be contiguous in the VA address space, but it
> > > "must" support hotplug/unplug (although vDPA currently pins it).
> > > Hot-plugged memory could always overlap with the SVQ vrings, so we
> > > would need to move it.
> > > * Duplicating mapped memory for writing? (Not sure if guest memory is
> > > actually movable in qemu.)
> > > * Indirect descriptors will need to allocate and free memory more or
> > > less frequently, increasing the possibility of overlapping.
> >
> > I'm not sure I get the problem, but overlapping is not an issue since
> > we're using VA.
> >
>
> It's basically the same (potential) problem as DPDK's SVQ: the IOVA
> range goes from 0 to X, which means that both the GPA and the SVQ
> vrings must fit in that range. As an example, we could put GPA at the
> bottom of the range, growing upwards as memory is hot-plugged, and SVQ
> vrings at the top, growing downwards as devices are added or set in
> SVQ mode.
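>
> For illustration only, a minimal sketch of that two-ended layout (the
> types and names are made up for this sketch, not actual qemu code):
> guest memory is carved from the bottom of the device's IOVA range and
> SVQ vrings from the top, and the two cursors meeting means the range
> is exhausted (or we would have to fall back to a tree allocator):
>
> #include <stdbool.h>
> #include <stdint.h>
>
> typedef struct IovaSpace {
>     uint64_t low;    /* next free byte for guest memory, grows upwards */
>     uint64_t high;   /* one past the last free byte for SVQ vrings,
>                         grows downwards                                */
> } IovaSpace;
>
> /* Reserve @size bytes of IOVA for hot-plugged guest memory. */
> static bool iova_alloc_gpa(IovaSpace *s, uint64_t size, uint64_t *iova)
> {
>     if (s->high - s->low < size) {
>         return false;            /* would collide with the vring area */
>     }
>     *iova = s->low;
>     s->low += size;
>     return true;
> }
>
> /* Reserve @size bytes of IOVA for a new SVQ vring. */
> static bool iova_alloc_vring(IovaSpace *s, uint64_t size, uint64_t *iova)
> {
>     if (s->high - s->low < size) {
>         return false;            /* would collide with the GPA area   */
>     }
>     s->high -= size;
>     *iova = s->high;
>     return true;
> }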
>
> Even without the fragmentation problems of both spaces, we could reach
> a point where both would claim the same address, and we would need to
> go to the tree.
>
> But since we are able to detect those situations, I can see how we
> could work in two modes as an optimization: 1:1 when they don't
> overlap, and a fragmented tree when they do. However, I don't think
> it's a good idea to include that from the beginning, and I'm not sure
> it is worth it without measuring the tree translation cost first.
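>
> A minimal sketch of that two-mode translation (again, illustrative
> types and names only, not the actual SVQ code): the identity fast path
> is taken while nothing overlaps, and the lookup structure (modelled
> here as a flat array standing in for the tree) is consulted only when
> it does:
>
> #include <stdbool.h>
> #include <stddef.h>
> #include <stdint.h>
>
> /* One translated range: [iova, iova + size) -> host VA. */
> typedef struct SVQMap {
>     uint64_t iova;
>     uint64_t size;
>     void    *vaddr;
> } SVQMap;
>
> typedef struct SVQTranslator {
>     bool identity;       /* true while IOVA == qemu VA for every map */
>     const SVQMap *maps;  /* stand-in for the interval tree           */
>     size_t n_maps;
> } SVQTranslator;
>
> static void *svq_translate(const SVQTranslator *t, uint64_t iova)
> {
>     if (t->identity) {
>         /* Fast path: buffer forwarding pays no translation penalty. */
>         return (void *)(uintptr_t)iova;
>     }
>     /* Slow path: look the address up (a real tree instead of a scan). */
>     for (size_t i = 0; i < t->n_maps; i++) {
>         const SVQMap *m = &t->maps[i];
>         if (iova >= m->iova && iova - m->iova < m->size) {
>             return (char *)m->vaddr + (iova - m->iova);
>         }
>     }
>     return NULL; /* invalid IOVA */
> }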
>
> > >
> > > If we can move guest memory,
> >
> > I'm not sure we can do this; it looks very tricky.
> >
>
> Just thinking out loud here, but maybe we could map all memory and
> play with remap_file_pages [1] a little bit for that.
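>
> For reference, a small self-contained example of what remap_file_pages
> does (the man page marks the syscall as deprecated since Linux 3.16,
> so this is only to illustrate the idea): it rearranges which pages of
> the backing object of a MAP_SHARED mapping appear at which virtual
> addresses within the mapping.
>
> #define _GNU_SOURCE
> #include <stdio.h>
> #include <stdlib.h>
> #include <string.h>
> #include <sys/mman.h>
> #include <unistd.h>
>
> int main(void)
> {
>     long page = sysconf(_SC_PAGESIZE);
>     char path[] = "/tmp/remap-demo-XXXXXX";
>     int fd = mkstemp(path);
>
>     if (fd < 0 || ftruncate(fd, 2 * page) < 0) {
>         perror("setup");
>         return EXIT_FAILURE;
>     }
>     unlink(path);
>
>     /* A two-page shared file mapping to play with. */
>     char *mem = mmap(NULL, 2 * page, PROT_READ | PROT_WRITE,
>                      MAP_SHARED, fd, 0);
>     if (mem == MAP_FAILED) {
>         perror("mmap");
>         return EXIT_FAILURE;
>     }
>     strcpy(mem, "file page 0");
>     strcpy(mem + page, "file page 1");
>
>     /* Rearrange: let the first virtual page show file page 1 instead. */
>     if (remap_file_pages(mem, page, 0 /* prot must be 0 */, 1, 0) == -1) {
>         perror("remap_file_pages");
>         return EXIT_FAILURE;
>     }
>     printf("first virtual page now reads: \"%s\"\n", mem);
>     return EXIT_SUCCESS;
> }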
>
> > > however, I can see how we can track it in
> > > a tree *but* mark when the tree is 1:1 with qemu's VA, so buffer
> > > forwarding does not take the translation penalty. When guest memory
> > > cannot be mapped 1:1, we can resort to the tree, and come back to 1:1
> > > translation if the offending tree node(s) get deleted.
> > >
> > > However I think this puts the solution a little bit farther than
> > > "starting simple" :).
> > >
> > > Does it make sense?
> >
> > Yes. So I think I will review the IOVA tree code and get back to you.
> >
>
> Looking forward to it :).
>

PS: Actually, the IOVA code still uses the GArray solution from the
previous series. I'm currently migrating it to an actual tree and adding
allocation features to util/iova-tree, so maybe it is not worth
reviewing at this moment. The public interface to it is the same, but
there is little to review there.
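
For what it's worth, here is a standalone model of the kind of
allocation feature described above: given only the host address and
size of a new mapping, find a free hole in the usable IOVA range and
record the translation there. The types and function names are invented
for this sketch and are not the util/iova-tree interface.

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_MAPS 16

typedef struct Map {
    uint64_t iova;
    uint64_t size;            /* length in bytes */
    void    *vaddr;
} Map;

typedef struct IovaAllocator {
    uint64_t begin, end;      /* usable IOVA range [begin, end) */
    Map maps[MAX_MAPS];       /* kept sorted by iova            */
    int n;
} IovaAllocator;

/* Allocate a hole of @size bytes and record @vaddr there. */
static bool iova_alloc_map(IovaAllocator *a, void *vaddr, uint64_t size,
                           uint64_t *iova_out)
{
    uint64_t hole = a->begin;

    if (a->n == MAX_MAPS) {
        return false;
    }
    for (int i = 0; i <= a->n; i++) {
        uint64_t limit = (i == a->n) ? a->end : a->maps[i].iova;

        if (limit - hole >= size) {
            /* Shift the tail and insert, keeping the array sorted. */
            for (int j = a->n; j > i; j--) {
                a->maps[j] = a->maps[j - 1];
            }
            a->maps[i] = (Map){ .iova = hole, .size = size, .vaddr = vaddr };
            a->n++;
            *iova_out = hole;
            return true;
        }
        if (i < a->n) {
            hole = a->maps[i].iova + a->maps[i].size;
        }
    }
    return false;             /* range exhausted */
}

int main(void)
{
    IovaAllocator a = { .begin = 0x1000, .end = 0x100000 };
    char vring[64], buf[64];
    uint64_t iova;

    if (iova_alloc_map(&a, vring, sizeof(vring), &iova)) {
        printf("vring mapped at iova 0x%" PRIx64 "\n", iova);
    }
    if (iova_alloc_map(&a, buf, sizeof(buf), &iova)) {
        printf("buffer mapped at iova 0x%" PRIx64 "\n", iova);
    }
    return 0;
}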

Thanks!

> Thanks!
>
> [1] https://linux.die.net/man/2/remap_file_pages
>
> > Thanks
> >
> > >
> > > Thanks!
> > >
> > > > Thanks
> > > >
> > >
> >



