qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v1 3/4] hw/arm/virt-acpi-build: patch guest SRAT for NUMA nod


From: Jonathan Cameron
Subject: Re: [PATCH v1 3/4] hw/arm/virt-acpi-build: patch guest SRAT for NUMA nodes
Date: Mon, 25 Sep 2023 14:54:40 +0100

On Fri, 22 Sep 2023 05:49:46 +0000
Ankit Agrawal <ankita@nvidia.com> wrote:

> Hi Jonathan

Hi Ankit,

> 
> > > +        if (pcidev->pdev.has_coherent_memory) {
> > > +            uint64_t start_node = object_property_get_uint(obj,
> > > +                                  "dev_mem_pxm_start", &error_abort);
> > > +            uint64_t node_count = object_property_get_uint(obj,
> > > +                                  "dev_mem_pxm_count", &error_abort);
> > > +            uint64_t node_index;
> > > +
> > > +            /*
> > > +             * Add the node_count PXM domains starting from start_node as
> > > +             * hot pluggable. The VM kernel parse the PXM domains and
> > > +             * creates NUMA nodes.
> > > +             */
> > > +            for (node_index = 0; node_index < node_count; node_index++)
> > > +                build_srat_memory(table_data, 0, 0, start_node + 
> > > node_index,
> > > +                    MEM_AFFINITY_ENABLED |
> > > + MEM_AFFINITY_HOTPLUGGABLE);  
> > 
> > 0 size SRAT entries for memory? That's not valid.  
> 
> Can you explain in what sense are these invalid? The Linux kernel accepts
> such setting and I had tested it.

ACPI specification doesn't define any means of 'updating' the memory range,
so whilst I guess they are not specifically disallowed without a spec definition
of what it means this is walking into a mine field. In particular the 
description of the hot pluggable bit worries me:
"The system hardware supports hot-add and hot-remove of this memory region."
So I think your definition is calling out that you can hot plug memory into
a region of zero size. To me that's nonsensical so a paranoid OS writer
might just spit out firmware error message and refuse to boot.

There is no guarantee other operating systems won't blow up if they see one
of these. To be able to do this safely I think you probably need an ACPI
spec update to say what such a zero length, zero base region means.

Possible the ASWG folk would say this is fine and I'm reading too much into
the spec, but I'd definitely suggest asking them via the appropriate path,
or throwing in a code first proposal for a comment on this special case and
see what response you get - my guess is it will be 'fix Linux' :(

> 
> > Seems like you've run into the same issue CXL has with dynamic addition of
> > nodes to the kernel and all you want to do here is make sure it thinks 
> > there are
> > enough nodes so initializes various structures large enough.
> >  
> Yes, exactly.
> 




reply via email to

[Prev in Thread] Current Thread [Next in Thread]