Re: [Qemu-devel] [PATCH 7/7] numa: Allow empty nodes
From: Nishanth Aravamudan
Subject: Re: [Qemu-devel] [PATCH 7/7] numa: Allow empty nodes
Date: Mon, 16 Jun 2014 17:21:04 -0700
User-agent: Mutt/1.5.21 (2010-09-15)
On 16.06.2014 [17:31:24 -0300], Eduardo Habkost wrote:
> On Mon, Jun 16, 2014 at 05:11:40PM -0300, Eduardo Habkost wrote:
> [...]
> > Wait, is the node ID visible to the guest at all? I believe it is a
> > QEMU-internal thing, just to allow the NUMA nodes to be ordered in the
> > command-line. I would even claim that the parameter is useless and
> > shouldn't have been introduced in the first place.
> >
> > What I don't see is: why you need the command-line to look like:
> > -numa node,id=1,mem=X
> > when you can simply write it as:
> > -numa node,id=0 -numa node,id=1,mem=X
>
> Oh, I believe now I see it: the problem is not that you don't just need
> "memory-less nodes" (which could be simply defined explicitly like
> above), but that you need non-contiguous node IDs, which are visible to
> the guest.
Well, and for powerpc, at least, there are some changes needed
(the prior patches) to get a memory-less node 0 supported.
> In this case, my example above with two -numa options would work, but it
> would be confusing as the user just wants one node with ID=1 (instead of
> two nodes).
Yep, exactly. This is something we see on PowerVM, and it tends to trip
up the kernel (or lead to corner cases) and it would be nice to be able
to emulate these topologies with KVM.
> So, now your patch makes sense to me. But we first need something to
> make sure the following command-line:
> -numa node,id=3 -numa node,id=2
> be different from:
> -numa node,id=0 -numa node,id=1 -numa node,id=2 -numa node,id=3
>
> The former should divide the memory in half, between nodes 2 and 3. The
> latter should divide the memory in four, between nodes 0, 1, 2, and 3.
Agreed on both those counts, but I think there is another case to ensure
we get right as well:
-numa node,id=3,mem=0 -numa node,id=2
All of the memory for the guest should be on node 2, but node 3 should
exist and be memoryless. But how do you differentiate between the values
in node_mem that started out zero and those that are set to 0 by the
user's arguments?
My first guess is that node_mem needs to be turned into a signed array
and we stuff -1 in there (currently 0 is written). Then we know any
non-negative entries are those specified by the user? Dunno if we care
about the reduction in per-node maximum memory (down to 2^63 from 2^64)?
Perhaps the resolution is the same as for your case. Just making it
explicit which nodes were passed by the user in node_mem, and counting
the total correctly, should let us do the right thing?
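To make the idea concrete, here is a minimal sketch in C of the sentinel approach, under stated assumptions. None of these names (NodeSketch, MEM_UNSPECIFIED, nodes_assign_mem, the small MAX_NODES) come from QEMU; they are hypothetical and only illustrate how an explicit mem=0 could be told apart from "no mem= given", and how non-contiguous node IDs could be tracked with a separate "present" flag. The even split of the remaining memory is also an assumption and ignores any alignment a real implementation might want.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_NODES 8            /* hypothetical limit, kept small here */
#define MEM_UNSPECIFIED (-1)   /* sentinel: no mem= was given for this node */

typedef struct {
    bool present;  /* the node ID appeared in a -numa option */
    int64_t mem;   /* MEM_UNSPECIFIED, or the size the user asked for */
} NodeSketch;

/* Start with no nodes defined and every size marked as unspecified. */
void nodes_init(NodeSketch nodes[MAX_NODES])
{
    for (int i = 0; i < MAX_NODES; i++) {
        nodes[i].present = false;
        nodes[i].mem = MEM_UNSPECIFIED;
    }
}

/* Record one "-numa node,id=ID[,mem=MEM]" option. */
void node_add(NodeSketch nodes[MAX_NODES], int id, int64_t mem)
{
    assert(id >= 0 && id < MAX_NODES);
    nodes[id].present = true;
    nodes[id].mem = mem;
}

/*
 * Keep explicit sizes (including an explicit 0) as they are and split
 * whatever is left evenly among the present nodes with no mem= value.
 * Returns false if the explicit sizes cannot add up to ram_size.
 */
bool nodes_assign_mem(NodeSketch nodes[MAX_NODES], int64_t ram_size)
{
    int64_t used = 0;
    int unspecified = 0;
    int last = -1;

    for (int i = 0; i < MAX_NODES; i++) {
        if (!nodes[i].present) {
            continue;
        }
        if (nodes[i].mem == MEM_UNSPECIFIED) {
            unspecified++;
            last = i;
        } else {
            used += nodes[i].mem;
        }
    }
    if (used > ram_size || (unspecified == 0 && used != ram_size)) {
        return false;
    }

    int64_t rest = ram_size - used;
    for (int i = 0; i < MAX_NODES; i++) {
        if (nodes[i].present && nodes[i].mem == MEM_UNSPECIFIED) {
            nodes[i].mem = rest / unspecified;
        }
    }
    if (last >= 0) {
        nodes[last].mem += rest % unspecified;  /* remainder goes to the last one */
    }
    return true;
}
```

With this sketch, adding node 3 with mem 0 and node 2 with MEM_UNSPECIFIED puts all of the memory on node 2 while node 3 still exists and is memoryless, and nodes 0 and 1 are never created.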
Thanks,
Nish