qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 21/31] util: Add iova_tree_alloc


From: Eugenio Perez Martin
Subject: Re: [PATCH 21/31] util: Add iova_tree_alloc
Date: Fri, 28 Jan 2022 08:48:31 +0100

On Fri, Jan 28, 2022 at 6:56 AM Jason Wang <jasowang@redhat.com> wrote:
>
>
> 在 2022/1/28 上午11:57, Peter Xu 写道:
> > On Thu, Jan 27, 2022 at 10:24:27AM +0100, Eugenio Perez Martin wrote:
> >> On Thu, Jan 27, 2022 at 9:06 AM Peter Xu <peterx@redhat.com> wrote:
> >>> On Tue, Jan 25, 2022 at 10:40:01AM +0100, Eugenio Perez Martin wrote:
> >>>> So I think that the first step to remove complexity from the old one
> >>>> is to remove iova_begin and iova_end.
> >>>>
> >>>> As Jason points out, removing iova_end is easier. It has the drawback
> >>>> of having to traverse all the list beyond iova_end, but a well formed
> >>>> iova tree should contain none. If the guest can manipulate it, it's
> >>>> only hurting itself adding nodes to it.
> >>>>
> >>>> It's possible to extract the check for hole_right (or this in Jason's
> >>>> proposal) as a special case too.
> >>>>
> >>>> But removing the iova_begin parameter is more complicated. We cannot
> >>>> know if it's a valid hole without knowing iova_begin, and we cannot
> >>>> resume traversing. Could we assume iova_begin will always be 0? I
> >>>> think not, the vdpa device can return anything through syscall.
> >>> Frankly I don't know what's the syscall you're talking about,
> >> I meant VHOST_VDPA_GET_IOVA_RANGE, which allows qemu to know the valid
> >> range of iova addresses. We get a pair of uint64_t from it, that
> >> indicates the minimum and maximum iova address the device (or iommu)
> >> supports.
> >>
> >> We must allocate iova ranges within that address range, which
> >> complicates this algorithm a little bit. Since the SVQ iova addresses
> >> are not GPA, qemu needs extra code to be able to allocate and free
> >> them, creating a new custom iova as.
> >>
> >> Please let me know if you want more details or if you prefer me to
> >> give more context in the patch message.
> > That's good enough, thanks.
> >
> >>> I mean this one:
> >>>
> >>> https://lore.kernel.org/qemu-devel/20211029183525.1776416-24-eperezma@redhat.com/
> >>>
> >>> Though this time I have some comments on the details.
> >>>
> >>> Personally I like that one (probably with some amendment upon the old 
> >>> version)
> >>> more than the current list-based approach.  But I'd like to know your 
> >>> thoughts
> >>> too (including Jason).  I'll further comment in that thread soon.
> >>>
> >> Sure, I'm fine with whatever solution we choose, but I'm just running
> >> out of ideas to simplify it. Reading your suggestions on old RFC now.
> >>
> >> Overall I feel list-based one is both more convenient and easy to
> >> delete when qemu raises the minimal glib version, but it adds a lot
> >> more code.
> >>
> >> It could add less code with this less elegant changes:
> >> * If we just put the list entry in the DMAMap itself, although it
> >> exposes unneeded implementation details.
> >> * We force the iova tree either to be an allocation-based or an
> >> insertion-based, but not both. In other words, you can only either use
> >> iova_tree_alloc or iova_tree_insert on the same tree.
>
>
> This seems an odd API I must say :(
>
>
> > Yeah, I just noticed it yesterday that there's no easy choice on it.  Let's 
> > go
> > with either way; it shouldn't block the rest of the code.  It'll be good if
> > Jason or Michael share their preferences too.
>
>
> (Havne't gone through the code deeply)
>
> I wonder how about just copy-paste gtree_node_first|last()? A quick
> google told me it's not complicated.
>

Both GTree and GTreeNode definitions are not part of the ABI of glib.
I think the ustream code has not changed its layout for a long time
but still it's allowed to do so in the future.

Having said that, they use a list internally to traverse the nodes,
with node->left and node->right instead of TAILQ entries. These
patches duplicate that intrusive list in DMAMap entries, and make them
private so other parts of qemu are not affected.

Thanks!

> Thanks
>
>
> >
> >> I have a few tests to check the algorithms, but they are not in the
> >> qemu test format. I will post them so we all can understand better
> >> what is expected from this.
> > Sure.  Thanks.
> >
>




reply via email to

[Prev in Thread] Current Thread [Next in Thread]