
Re: RFC: New device for zero-copy VM memory access


From: geoff
Subject: Re: RFC: New device for zero-copy VM memory access
Date: Fri, 01 Nov 2019 01:18:24 +1100
User-agent: Roundcube Webmail/1.2.3



On 2019-11-01 00:24, Dr. David Alan Gilbert wrote:
* address@hidden (address@hidden) wrote:
Hi Dave,

On 2019-10-31 05:52, Dr. David Alan Gilbert wrote:
> * address@hidden (address@hidden) wrote:
> > Hi All,
> >
> > Over the past week, I have been working to come up with a solution
> > to the memory transfer performance issues that hinder the Looking
> > Glass Project.
> >
> > Currently Looking Glass works by using the IVSHMEM shared memory
> > device, which is fed by an application that captures the guest's
> > video output. While this works, it is sub-optimal because we first
> > have to perform a CPU copy of the captured frame into shared RAM,
> > and then back out again for display. Because the destination
> > buffers are allocated by closed proprietary code (DirectX, or
> > NVidia NvFBC) there is no way to have the frame placed directly
> > into the IVSHMEM shared RAM.
> >
> > This new device, currently named `introspection` (which needs a
> > more suitable name, porthole perhaps?), provides a means of
> > translating guest physical addresses to host virtual addresses,
> > and finally to the host offsets in RAM for file-backed memory
> > guests. It does this by means of a simple protocol over a unix
> > socket (chardev) which is supplied the appropriate fd for the
> > VM's system RAM. The guest (in this case, Windows), when presented
> > with the address of a userspace buffer and size, will mlock the
> > appropriate pages into RAM and pass guest physical addresses to
> > the virtual device.
>
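For illustration, the translation described above boils down to a
region-table lookup of roughly this shape. The struct and function
names are a hypothetical sketch, not the device's actual code:

    #include <stddef.h>
    #include <stdint.h>

    /* One file-backed guest RAM region (illustrative layout). */
    struct ram_region {
        uint64_t gpa_base;    /* guest physical base address      */
        uint64_t size;        /* region length in bytes           */
        uint64_t file_offset; /* offset into the RAM backing file */
    };

    /* Translate a guest physical address to an offset in the
     * memory-backend file; UINT64_MAX means the GPA is unmapped. */
    static uint64_t gpa_to_file_offset(const struct ram_region *regions,
                                       size_t count, uint64_t gpa)
    {
        for (size_t i = 0; i < count; i++) {
            const struct ram_region *r = &regions[i];
            if (gpa >= r->gpa_base && gpa - r->gpa_base < r->size) {
                return r->file_offset + (gpa - r->gpa_base);
            }
        }
        return UINT64_MAX;
    }
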
> Hi Geoffrey,
>   I wonder if the same thing can be done by using the existing
> vhost-user mechanism.
>
>   vhost-user is intended for implementing a virtio device outside of
> the qemu process; so it has a character device through which qemu
> passes commands down to the other process, while most of the device
> traffic goes via the virtio queues.   To be able to read the virtio
> queues, the external process mmaps the same memory as the guest - it
> gets passed a 'set mem table' command by qemu that includes fds for
> the RAM, and includes base/offset pairs saying that a particular
> chunk of RAM is mapped at a particular guest physical address.
>
>   Whether or not you make use of virtio queues, I think the
> mechanism for the device to tell the external process the mappings
> might be what you're after.
>
> Dave
>
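For reference, the 'set mem table' message Dave describes carries one
descriptor per RAM region, plus one fd per region passed over the
socket. The field layout below follows QEMU's vhost-user
specification; the mapping helper is only a sketch of what an
external process typically does with it, not code from any particular
backend:

    #include <stddef.h>
    #include <stdint.h>
    #include <sys/mman.h>

    /* Per-region descriptor carried by VHOST_USER_SET_MEM_TABLE, as
     * documented in QEMU's vhost-user specification. */
    struct vhost_user_memory_region {
        uint64_t guest_phys_addr; /* base guest physical address     */
        uint64_t memory_size;     /* region length in bytes          */
        uint64_t userspace_addr;  /* QEMU's virtual address (a hint) */
        uint64_t mmap_offset;     /* offset into the region's fd     */
    };

    /* Sketch: map one region using the fd received alongside the
     * message. Mapping from offset 0 and adding mmap_offset mirrors
     * what common backends do, so the offset need not be page-aligned. */
    static void *map_region(const struct vhost_user_memory_region *r, int fd)
    {
        uint8_t *base = mmap(NULL, r->memory_size + r->mmap_offset,
                             PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        return base == MAP_FAILED ? NULL : base + r->mmap_offset;
    }
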

While normally I would be all for re-using such code, vhost-user,
while being very feature-complete from what I understand, is overkill
for our requirements. It will still allocate a communication ring and
an events system that we will not be using. The goal of this device
is to provide a dumb & simple method of sharing system RAM, both for
this project and for others that work on a simple polling mechanism;
it is not intended to be an end-to-end solution like vhost-user is.

If you still believe that vhost-user should be used, I will do what I
can to implement it, but for such a simple device I honestly believe
it is overkill.

It's certainly worth having a look at vhost-user even if you don't
use most of it; you can configure it down to 1 (maybe 0?) queues if
you're really desperate - and you might find it comes in useful! The
actual setup is pretty easy.

The process of synchronising with (potentially changing) host memory
mapping is a bit hairy; so if we can share it with vhost it's
probably worth it.

Thanks, I will have a deeper dive into it; however, the issues with
changing host memory, migration, and all that extra are of no concern
or use to us.

The audience that will be using this interface is not interested in
such features, as the primary reason for Looking Glass is to allow
for a high-performance Windows workstation for gaming and proprietary
Windows-only software. In these scenarios, features like RAM
ballooning are avoided like the plague as they hamper performance for
use cases that require consistent low latency for competitive
gameplay.

As the author of Looking Glass, I also have to consider the
maintenance and the complexity of implementing the vhost protocol
into the project. At this time a complete Porthole client can be
implemented in 150 lines of C without external dependencies, and most
of that is boilerplate socket code. This IMO is a major factor in
deciding to avoid vhost-user.

I also have to weigh up the cost of developing and maintaining the
Windows driver for this device. I am very green when it comes to
Windows driver programming; it took weeks to write the first IVSHMEM
driver, and several days to write the Porthole driver, which is far
simpler than the IVSHMEM driver. I'd hate to think of the time
investment in maintaining the vhost integration also (yes, I am aware
there is a library).

These drivers are not complex, and I am sure an experienced Windows
driver developer could have thrown them together in a few hours, but
since our requirements are so niche and of little commercial value,
those in our community that are using this project do not have the
time and/or ability to assist with the drivers.

From my point of view, avoiding vhost-user seems like a better path
to take as I am able (and willing) to maintain the Porthole device in
QEMU, the OS drivers, and the client interface. The Porthole device
also doesn't have any special or complex features, keeping both the
device and the client protocol very simple to maintain.

There is also an open-source audio driver for Windows called SCREAM
that was initially designed for broadcasting audio over a network;
however, its authors have also implemented transport via shared RAM.
While vhost-user would make much more sense here, as vring buffers
would be very useful, the barrier to entry is too high and as such
the developers have opted to use the simple IVSHMEM device instead.
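For a feel of what that polling looks like, here is a generic sketch
of a shared-RAM consumer; the header layout is invented for
illustration and is not SCREAM's actual format:

    #include <stdint.h>
    #include <unistd.h>

    /* Invented shared-memory header; real transports define their own. */
    struct shm_header {
        volatile uint32_t write_count; /* bumped by the guest producer */
        volatile uint32_t data_len;    /* bytes valid in data[]        */
        uint8_t data[];
    };

    /* Spin until the producer publishes new data, then consume it.
     * volatile keeps the reads fresh; production code would also use
     * proper atomics or memory barriers. */
    static void poll_loop(struct shm_header *hdr,
                          void (*consume)(const uint8_t *, uint32_t))
    {
        uint32_t seen = hdr->write_count;
        for (;;) {
            uint32_t now = hdr->write_count;
            if (now != seen) {
                seen = now;
                consume(hdr->data, hdr->data_len);
            } else {
                usleep(100); /* simple backoff while idle */
            }
        }
    }
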

That said, I will still have a deeper look into vhost-user but I hope
the above shows the merits of this simple method of guest ram access.

-Geoff



> > This device and the Windows driver have been designed in such a
> > way that it's a utility device for any project and/or application
> > that could make use of it. The PCI subsystem vendor and device ID
> > are used to provide a means of device identification in cases
> > where multiple devices may be in use for differing applications.
> > This also allows one common driver to be used for any other
> > projects wishing to build on this device.
> >
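As an illustration of that identification scheme, a Linux-side tool
could match instances by their subsystem IDs through sysfs; the PCI
address below is a placeholder, and real code would enumerate
/sys/bus/pci/devices:

    #include <stdio.h>

    /* Read one hex ID file such as sysfs' subsystem_vendor. */
    static int read_id(const char *path, unsigned *val)
    {
        FILE *f = fopen(path, "r");
        int ok;
        if (!f)
            return -1;
        ok = fscanf(f, "%x", val) == 1; /* files contain e.g. "0x1af4" */
        fclose(f);
        return ok ? 0 : -1;
    }

    int main(void)
    {
        unsigned svid, ssid;
        /* Placeholder device address. */
        if (read_id("/sys/bus/pci/devices/0000:00:05.0/subsystem_vendor",
                    &svid) ||
            read_id("/sys/bus/pci/devices/0000:00:05.0/subsystem_device",
                    &ssid))
            return 1;
        printf("subsystem %04x:%04x\n", svid, ssid);
        return 0;
    }
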
> > My ultimate goal is to get this to a state where it could be
> > accepted upstream into QEMU, at which point Looking Glass would be
> > modified to use it instead of the IVSHMEM device.
> >
> > My git repository with the new device can be found at:
> > https://github.com/gnif/qemu
> >
> > The new device is:
> > https://github.com/gnif/qemu/blob/master/hw/misc/introspection.c
> >
> > Looking Glass:
> > https://looking-glass.hostfission.com/
> >
> > The Windows driver, while working, needs some cleanup before the
> > source is published. I intend to maintain both this device and the
> > Windows driver, including producing a signed Windows 10 driver if
> > Red Hat are unwilling or unable.
> >
> > Kind Regards,
> > Geoffrey McRae
> >
> > HostFission
> > https://hostfission.com
> >
> --
> Dr. David Alan Gilbert / address@hidden / Manchester, UK
--
Dr. David Alan Gilbert / address@hidden / Manchester, UK


