Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type


From: Michael S. Tsirkin
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
Date: Fri, 26 Jul 2019 07:10:04 -0400

On Fri, Jul 26, 2019 at 09:57:51AM +0200, Paolo Bonzini wrote:
> On 25/07/19 22:30, Michael S. Tsirkin wrote:
> > On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote:
> >> On 25/07/19 16:46, Michael S. Tsirkin wrote:
> >>> Actually, I think I have a better idea.
> >>> At the moment we just get an exit on these reads and return all-ones.
> >>> Yes, in theory there could be a UR (Unsupported Request) bit set
> >>> in a bunch of registers, but in practice no one cares about these,
> >>> and I don't think we implement them.
> >>> So how about mapping a single page, read-only, and filling it
> >>> with all-ones?
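
A minimal sketch of the single-page idea, for illustration only: the
memfd_create() backing store and the function name are assumptions,
not code from QEMU.

    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    /* Build one 4K page filled with 0xFF and return a read-only,
     * shareable mapping of it.  Every absent device's config window
     * could then alias this same physical page. */
    static void *map_ones_page(void)
    {
        int fd = memfd_create("ecam-ones", 0);   /* hypothetical name */
        void *w;

        if (fd < 0 || ftruncate(fd, 4096) < 0)
            return NULL;

        /* Fill the backing page once through a temporary writable view. */
        w = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (w == MAP_FAILED)
            return NULL;
        memset(w, 0xff, 4096);
        munmap(w, 4096);

        /* Read-only alias; each further alias of the same page costs
         * one VMA, which is the overhead discussed below. */
        return mmap(NULL, 4096, PROT_READ, MAP_SHARED, fd, 0);
    }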
> >>
> >> Yes, that's nice indeed. :)  But it does have some cost, in terms of
> >> either the number of VMAs or QEMU RSS, since the MMCONFIG area is large.
> >>
> >> What breaks if we return all zeroes?  Zero is not a valid vendor ID.
> >>
> >> Paolo
> > 
> > I think I know what you are thinking of doing:
> > map /dev/zero so we get a single VMA but all mapped to
> > a single zero pte?
> 
> Yes, exactly.  You absolutely need to share the page, because the guest
> could easily touch 32*256 pages just to scan function 0 on every bus and
> device, even if the VM has just 4 or 5 devices, all of them on the
> root complex.  And that causes fragmentation, so you have to map bigger
> areas.
> 
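
For scale: scanning function 0 of 32 devices on each of 256 buses touches
32*256 = 8192 distinct 4K pages, i.e. 32M of RSS if every fault got its
own private copy.  A sketch of the /dev/zero variant, with an illustrative
size parameter:

    #include <fcntl.h>
    #include <sys/mman.h>

    /* One read-only private mapping over the whole MMCONFIG window:
     * a single VMA, and every read fault is resolved to the kernel's
     * shared zero page, so RSS stays near zero. */
    void *map_zero_ecam(size_t ecam_size)   /* e.g. 256M for 256 buses */
    {
        int fd = open("/dev/zero", O_RDONLY);

        if (fd < 0)
            return NULL;
        return mmap(NULL, ecam_size, PROT_READ, MAP_PRIVATE, fd, 0);
    }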
> > - We can implement /dev/ones. In fact, we can implement
> >   /dev/byteXX for each possible byte value; the cost would
> >   be only 1M on a 4k-page system (256 values * 4K).
> >   It might come in handy for e.g. free page hinting:
> >   at the moment, if guest memory is poisoned we cannot
> >   unmap it; with this trick we can map it to /dev/byteXX.
> 
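
/dev/byteXX does not exist; as a user-space analogue of the idea, one
memfd can hold 256 pages, page N filled with byte N, aliased read-only
wherever that fill pattern is needed.  Names below are hypothetical.

    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static int byte_fd;   /* memfd: 256 pages, page n filled with byte n */

    int init_byte_pages(void)
    {
        unsigned char *p;
        int i;

        byte_fd = memfd_create("byte-pages", 0);
        if (byte_fd < 0 || ftruncate(byte_fd, 256 * 4096) < 0)
            return -1;
        p = mmap(NULL, 256 * 4096, PROT_READ | PROT_WRITE,
                 MAP_SHARED, byte_fd, 0);
        if (p == MAP_FAILED)
            return -1;
        for (i = 0; i < 256; i++)
            memset(p + i * 4096, i, 4096);   /* total backing: 1M */
        munmap(p, 256 * 4096);
        return 0;
    }

    /* Alias the page holding value 'val' over [addr, addr+len).
     * Each 4K alias is its own VMA: adjacent mappings that repeat
     * the same file offset cannot be merged. */
    int map_byte(void *addr, size_t len, unsigned char val)
    {
        size_t off;

        for (off = 0; off < len; off += 4096)
            if (mmap((char *)addr + off, 4096, PROT_READ,
                     MAP_SHARED | MAP_FIXED, byte_fd,
                     (off_t)val * 4096) == MAP_FAILED)
                return -1;
        return 0;
    }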
> I also thought of /dev/ones; I'm not sure how it would be received. :)
> Also, you cannot map lazily on page fault, otherwise you get a vmexit
> and it's slow again.  So /dev/ones might need to be written to use a
> huge page.
> 
> Paolo
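
A sketch of the huge-page suggestion: one prefaulted 2M hugetlb page
filled with 0xFF, so guest reads never hit the slow fault path.  This
assumes hugetlb pages have been reserved on the host.

    #define _GNU_SOURCE
    #include <string.h>
    #include <sys/mman.h>

    void *map_ones_huge(void)
    {
        size_t sz = 2 * 1024 * 1024;
        /* MAP_POPULATE prefaults the mapping up front;
         * MAP_HUGETLB requires reserved huge pages. */
        void *p = mmap(NULL, sz, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB |
                       MAP_POPULATE, -1, 0);

        if (p == MAP_FAILED)
            return NULL;
        memset(p, 0xff, sz);
        mprotect(p, sz, PROT_READ);   /* drop write access once filled */
        return p;
    }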

It's not easy to do that - each device gets 4K within MCFG.

So what we need, then, is a KVM option to create an address range - or
maybe even a group of address ranges - and aggressively map every page
in a group to the same backing host page when any one page in the group
faults.
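
That option does not exist in KVM today; the closest current
approximation is to alias the same host page into many guest-physical
slots with KVM_SET_USER_MEMORY_REGION, at the cost of one memslot per
4K window.  Slot numbering and layout below are illustrative.

    #include <linux/kvm.h>
    #include <stdint.h>
    #include <sys/ioctl.h>

    /* Back 'ndev' absent 4K config windows, starting at ecam_base,
     * with the same host page (e.g. the all-ones page from above). */
    int alias_ones_page(int vm_fd, void *ones_page,
                        uint64_t ecam_base, int ndev, int first_slot)
    {
        int i;

        for (i = 0; i < ndev; i++) {
            struct kvm_userspace_memory_region r = {
                .slot            = first_slot + i,
                .flags           = KVM_MEM_READONLY,
                .guest_phys_addr = ecam_base + (uint64_t)i * 4096,
                .memory_size     = 4096,
                .userspace_addr  = (uintptr_t)ones_page,
            };
            if (ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &r) < 0)
                return -1;
        }
        return 0;
    }

Guest writes to a KVM_MEM_READONLY slot still exit to userspace, which
is fine for config-space probing; whether the memslot count scales to a
full bus scan is exactly why a range-group primitive would help.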

-- 
MST


