qemu-trivial
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id a


From: Zengtao (B)
Subject: RE: [PATCH] hw/arm/acpi: Pack the SRAT processors structure by node_id ascending order
Date: Tue, 14 Jan 2020 02:07:11 +0000

> -----Original Message-----
> From: Igor Mammedov [mailto:address@hidden]
> Sent: Monday, January 13, 2020 5:06 PM
> To: Zengtao (B)
> Cc: Peter Maydell; Michael S. Tsirkin; address@hidden;
> address@hidden; Shannon Zhao; address@hidden
> Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors structure
> by node_id ascending order
> 
> On Fri, 10 Jan 2020 02:56:35 +0000
> "Zengtao (B)" <address@hidden> wrote:
> 
> > > -----Original Message-----
> > > From: Igor Mammedov [mailto:address@hidden]
> > > Sent: Thursday, January 09, 2020 5:54 PM
> > > To: Zengtao (B)
> > > Cc: Michael S. Tsirkin; address@hidden;
> address@hidden;
> > > Shannon Zhao; Peter Maydell; address@hidden
> > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> structure by
> > > node_id ascending order
> > >
> > > On Thu, 9 Jan 2020 02:45:58 +0000
> > > "Zengtao (B)" <address@hidden> wrote:
> > >
> > > > > -----Original Message-----
> > > > > From: Igor Mammedov [mailto:address@hidden]
> > > > > Sent: Thursday, January 09, 2020 12:39 AM
> > > > > To: Zengtao (B)
> > > > > Cc: Michael S. Tsirkin; address@hidden;
> > > address@hidden;
> > > > > Shannon Zhao; Peter Maydell; address@hidden
> > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> structure
> > > by
> > > > > node_id ascending order
> > > > >
> > > > > On Wed, 8 Jan 2020 04:02:10 +0000
> > > > > "Zengtao (B)" <address@hidden> wrote:
> > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Igor Mammedov [mailto:address@hidden]
> > > > > > > Sent: Tuesday, January 07, 2020 11:50 PM
> > > > > > > To: Zengtao (B)
> > > > > > > Cc: Michael S. Tsirkin; address@hidden;
> > > > > address@hidden;
> > > > > > > Shannon Zhao; Peter Maydell; address@hidden
> > > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT processors
> > > structure
> > > > > by
> > > > > > > node_id ascending order
> > > > > > >
> > > > > > > On Tue, 7 Jan 2020 10:29:22 +0000
> > > > > > > "Zengtao (B)" <address@hidden> wrote:
> > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Michael S. Tsirkin [mailto:address@hidden]
> > > > > > > > > Sent: Tuesday, January 07, 2020 5:33 PM
> > > > > > > > > To: Zengtao (B)
> > > > > > > > > Cc: address@hidden; address@hidden;
> > > Shannon
> > > > > > > Zhao;
> > > > > > > > > Peter Maydell; Igor Mammedov; address@hidden
> > > > > > > > > Subject: Re: [PATCH] hw/arm/acpi: Pack the SRAT
> processors
> > > > > structure
> > > > > > > by
> > > > > > > > > node_id ascending order
> > > > > > > > >
> > > > > > > > > On Tue, Jan 07, 2020 at 05:18:49PM +0800, Zeng Tao
> wrote:
> > > > > > > > > > When booting the guest linux with the following numa
> > > > > configuration:
> > > > > > > > > > -numa node,node_id=1,cpus=0-3
> > > > > > > > > > -numa node,node_id=0,cpus=4-7
> > > > > > > > > > We can get the following numa topology in the guest
> system:
> > > > > > > > > > Architecture:          aarch64
> > > > > > > > > > Byte Order:            Little Endian
> > > > > > > > > > CPU(s):                8
> > > > > > > > > > On-line CPU(s) list:   0-7
> > > > > > > > > > Thread(s) per core:    1
> > > > > > > > > > Core(s) per socket:    8
> > > > > > > > > > Socket(s):             1
> > > > > > > > > > NUMA node(s):          2
> > > > > > > > > > L1d cache:             unknown size
> > > > > > > > > > L1i cache:             unknown size
> > > > > > > > > > L2 cache:              unknown size
> > > > > > > > > > NUMA node0 CPU(s):     0-3
> > > > > > > > > > NUMA node1 CPU(s):     4-7
> > > > > > > > > > The Cpus 0-3 is assigned with NUMA node 1 in QEMU
> while it
> > > get
> > > > > > > NUMA
> > > > > > > > > node
> > > > > > > > > > 0 in the guest.
> > > > > > > > > >
> > > > > > > > > > In fact, In the linux kernel, numa_node_id is allocated
> per the
> > > > > ACPI
> > > > > > > > > > SRAT processors structure order,so the cpu 0 will be the
> first
> > > one
> > > > > to
> > > > > > > > > > allocate its NUMA node id, so it gets the NUMA node 0.
> > > > > > > > > >
> > > > > > > > > > To fix this issue, we pack the SRAT processors structure
> in
> > > numa
> > > > > node
> > > > > > > id
> > > > > > > > > > order but not the default cpu number order.
> > > > > > > > > >
> > > > > > > > > > Signed-off-by: Zeng Tao <address@hidden>
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Does this matter? If yes fixing linux to take node id from
> > > proximity
> > > > > > > > > field in ACPI seems cleaner ...
> > > > > > > > >
> > > > > > > >
> > > > > > > > In fact, I just want to align the node_id concept in QEMU
> and
> > > Linux.
> > > > > > > > If we fix the kernel side, we need to align with all platforms.
> > > > > > > > i think maybe not a good idea.
> > > > > > > if linux makes up node ID's on it's own, it would be hard for it
> to
> > > > > > > map SRAT entries to other tables that use proximity id as
> well.
> > > > > > >
> > > > > > > So it would need to maintain map of [proximity id] -> [node
> id] (and
> > > > > reverse)
> > > > > > > somewhere to resolve mappings on other tables.
> > > > > > > If it doesn't do this then it's broken and works just by
> accident,
> > > > > > > if it does the fix probably should be in that code and not in
> QEMU.
> > > > > > >
> > > > > >
> > > > > > Hmm, the problem is how to understand the concept of node
> id.
> > > > > > 1. In dts, there is node id. Both the QEMU and Linux can use it
> > > > > > directly, so no conflict.
> > > > > > 2. In ACPI, there is only proximity domain, but no node id, there
> > > > > >  should be a mapping between them, while kernel and QEMU
> > > maintain
> > > > > >  their own rule, and unfortunately they conflict with each
> other
> > > > > >  sometimes.
> > > > > >
> > > > > > There is no specification to indicate what we should do to
> maintain
> > > the
> > > > > > mapping, so it's hard to align the QEMU and Linux.
> > > > > >
> > > > > > Any suggestion, or we just accept it as a rule since it don't
> affect
> > > much?
> > > > >
> > > > > If node id generation is driven by SRAT content, it might be
> reasonable
> > > to
> > > > > ask for SRAT parser in kernel to create node ids using proximity
> value
> > > > > instead of the order ACPI_SRAT_PROCESSOR_GICC structures are
> > > > > enumerated.
> > > > > That way node id would match ACPI spec.
> > > > >
> > > >
> > > > I don't quite understand "That way node id would match ACPI
> spec."
> > > > I check the ACPI 6.3 spec, I didn't find any description that node id
> should
> > > > be equal to proximity value, in section 6.2.15, there is indeed an
> example
> > > > which node numbers equals to proximity value.
> > >
> > > There is only proximity id in spec (which conceptually is the same as
> node
> > > id).
> > > What I'm saying is that linux kernel ACPI parser could use it as node
> id
> > > instead of using counter for making up node ids.
> > >
> > > (issue with it is that proximity values could be sparse, so one
> > > would need to audit current node id users to make sure that using
> > > proximity id won't break anything)
> > >
> > I think that is really a problem, in linux kernel:
> > #if MAX_NUMNODEs > 256
> > #define MAX_PXM_DOMAINS MAX_NUMNODEs
> > #else
> > #define MAX_PXM_DOMAINS 256
> > #endif
> > It seems that using counter is a better choice.
> 
> In QEMU case nodeid isn't sparse. So even if kernel uses counter, it will
> work fine as far as it sorts SRAT by proximity value (same thing you are
> trying
> to do in this patch but on kernel side).
> That way it would match what HW vendors think about node numbering
> in most cases.
> 
> (We a comparing apples to oranges, so arguing for one case vs another is
> rather pointless)
> 
> Another question, does this patch fixes anything?
> (if it is not, it might be better to just drop this idea)
> 

:) I am not persist on this patch, but I think you have provide me an important
clue. "Most HW vendors number pxm and node in the same way, and so does QEMU", 
so i think it's reasonable for linux kernel to do the same.

So i think we can end the discussion and just ignore this patch.
Thanks very much.

Regards
Zengtao 



reply via email to

[Prev in Thread] Current Thread [Next in Thread]