Re: [PATCH v4 0/6] UFFD write-tracking migration/snapshots


From: Peter Xu
Subject: Re: [PATCH v4 0/6] UFFD write-tracking migration/snapshots
Date: Tue, 1 Dec 2020 13:54:38 -0500

On Tue, Dec 01, 2020 at 02:24:12PM +0300, Andrey Gruzdev wrote:
> On 01.12.2020 13:53, Peter Krempa wrote:
> > On Tue, Dec 01, 2020 at 11:42:18 +0300, Andrey Gruzdev wrote:
> > > On 01.12.2020 10:08, Peter Krempa wrote:
> > > > On Thu, Nov 26, 2020 at 18:17:28 +0300, Andrey Gruzdev via wrote:
> > > > > This patch series is a kind of 'rethinking' of Denis Plotnikov's 
> > > > > ideas he's
> > 
> > [...]
> > 
> > > > Note that in cases when qemu can't guarantee that the
> > > > background_snapshot feature will work, it should not advertise it. We
> > > > need a way to check whether it's possible to use it, so we can replace
> > > > the existing --live flag with it rather than adding a new one and
> > > > shifting the burden of checking whether the feature works onto the user.

Would it be fine if libvirt just tried the new way first anyway?  If it's going
to fail, it will fail right away on any unsupported memory type, so logically
the libvirt user may not even notice that we've retried.

Previously I thought that would be enough, because so far the kernel does not
have a specific flag showing whether a given type of memory is supported.  But
I don't know whether retrying like that would be non-trivial for libvirt.

Another solution is to let qemu test the uffd ioctls right after QEMU memory
setup, so we know whether background/live snapshot is supported with the
current memory backends.  We would need to try this for every ramblock, because
I think the memory types can differ across the qemu ramblocks.
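
To make that a bit more concrete, a per-ramblock probe could look roughly like
the sketch below.  This is written against the raw kernel UAPI rather than
anything in the series; the helper name and the idea of calling it on each
ramblock's host address/length at setup time are my assumptions, and the WP
definitions need linux/userfaultfd.h from 5.7 or newer:

#include <fcntl.h>
#include <stdbool.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

/* Hypothetical helper: would be called once per ramblock at memory setup. */
static bool uffd_range_supports_wp(void *host_addr, uint64_t length)
{
    bool ret = false;
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);

    if (uffd < 0) {
        return false;                   /* no userfaultfd at all */
    }

    struct uffdio_api api = {
        .api = UFFD_API,
        .features = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
    };

    if (ioctl(uffd, UFFDIO_API, &api) == 0) {
        /* host_addr/length assumed page-aligned, as ramblocks are */
        struct uffdio_register reg = {
            .range = {
                .start = (uint64_t)(uintptr_t)host_addr,
                .len   = length,
            },
            .mode = UFFDIO_REGISTER_MODE_WP,
        };

        /*
         * Registration fails, or doesn't report _UFFDIO_WRITEPROTECT, for
         * backends the kernel can't write-protect yet (e.g. hugetlbfs or
         * shmem today).
         */
        if (ioctl(uffd, UFFDIO_REGISTER, &reg) == 0 &&
            (reg.ioctls & (1ULL << _UFFDIO_WRITEPROTECT))) {
            ret = true;
        }
    }

    close(uffd);                        /* probe only: drops the registration */
    return ret;
}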

> > > > 
> > > 
> > > Hi,
> > > 
> > > Maybe you are using hugetlbfs as a memory backend?
> > 
> > Not exactly hugepages, but I had:
> > 
> >    <memoryBacking>
> >      <access mode='shared'/>
> >    </memoryBacking>
> > 
> > which resulted in the following command line to instantiate memory:
> > 
> > -object memory-backend-file,id=pc.ram,mem-path=/var/lib/libvirt/qemu/ram/6-upstream-bj/pc.ram,share=yes,size=33554432000,host-nodes=0,policy=bind \
> > 
> > When I removed it, I got:
> > 
> > -object memory-backend-ram,id=pc.ram,size=33554432000,host-nodes=0,policy=bind \
> > 
> > And the migration didn't fail in my quick test. I'll have a more
> > detailed look later, thanks for the pointer.
> > 
> 
> Yep, it seems that current userfaultfd supports hugetlbfs and shared memory
> for missing pages, but not for write-protect..

Correct.  Btw, I've been working on both of them recently.  I have a testing
kernel branch, but I don't think it should affect our qemu work, since qemu
should do the same thing regardless of the memory type.  We can just test with
anonymous memory, and as long as that works, it should work on all the rest of
the backends (maybe even for other file-backed memory; more below).
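
For reference, testing on anonymous memory can be as small as the standalone
snippet below (not from the series; it assumes a 5.7+ kernel and headers with
uffd-wp), which only checks that registering and write-protecting an anonymous
range succeeds:

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/userfaultfd.h>

int main(void)
{
    size_t len = 16 * getpagesize();
    /* Anonymous private memory: the case uffd-wp already handles today. */
    void *buf = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    int uffd = syscall(__NR_userfaultfd, O_CLOEXEC);

    if (buf == MAP_FAILED || uffd < 0) {
        perror("setup");
        return 1;
    }

    struct uffdio_api api = {
        .api = UFFD_API,
        .features = UFFD_FEATURE_PAGEFAULT_FLAG_WP,
    };
    if (ioctl(uffd, UFFDIO_API, &api) < 0) {
        perror("UFFDIO_API");           /* kernel without uffd-wp */
        return 1;
    }

    struct uffdio_register reg = {
        .range = { .start = (uintptr_t)buf, .len = len },
        .mode  = UFFDIO_REGISTER_MODE_WP,
    };
    if (ioctl(uffd, UFFDIO_REGISTER, &reg) < 0) {
        perror("UFFDIO_REGISTER");
        return 1;
    }

    /*
     * Mark the whole range write-protected; from here on, writes to it
     * would raise uffd events for a tracker thread to resolve.
     */
    struct uffdio_writeprotect wp = {
        .range = { .start = (uintptr_t)buf, .len = len },
        .mode  = UFFDIO_WRITEPROTECT_MODE_WP,
    };
    if (ioctl(uffd, UFFDIO_WRITEPROTECT, &wp) < 0) {
        perror("UFFDIO_WRITEPROTECT");
        return 1;
    }

    printf("uffd-wp looks usable on anonymous memory here\n");
    return 0;
}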

> 
> > > I totally agree that we need to somehow check that the kernel and the VM
> > > memory backend support the feature before one can enable the capability.
> > > Need to think about that..
> > 
> > Definitely. Also note that memory backed by memory-backend-file will be
> > more and more common, for cases such as virtiofs DAX sharing and
> > similar.
> > 
> 
> I see.. That needs support from the kernel side; so far 'background-snapshots'
> are incompatible with memory-backend-file sharing.

Yes.  So as mentioned, shmem/hugetlbfs support is WIP, but I haven't thought
about the rest yet.  Maybe... it's not hard to add uffd-wp for most of the
file-backed memory?  Afaict the kernel handles write-protect faults in quite a
straightforward way (do_wp_page() for whatever backend), so uffd-wp could be
the first to trap all of them.  I'm not sure whether Andrea has thought about
that, or about how to extend the missing-page usage to more types of backends
(maybe missing is more special in that it needs to consider page caches).  So
I'm copying Andrea too just in case there's further input.

Thanks,

-- 
Peter Xu



