[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86
From: |
Zhao Liu |
Subject: |
Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State |
Date: |
Fri, 15 Sep 2023 15:50:14 +0800 |
Hi Philippe,
On Thu, Sep 14, 2023 at 09:38:46AM +0200, Philippe Mathieu-Daudé wrote:
> Date: Thu, 14 Sep 2023 09:38:46 +0200
> From: Philippe Mathieu-Daudé <philmd@linaro.org>
> Subject: Re: [PATCH v4 10/21] i386: Introduce module-level cpu topology to
> CPUX86State
>
> On 14/9/23 09:21, Zhao Liu wrote:
> > From: Zhuocheng Ding <zhuocheng.ding@intel.com>
> >
> > smp command has the "clusters" parameter but x86 hasn't supported that
> > level. "cluster" is a CPU topology level concept above cores, in which
> > the cores may share some resources (L2 cache or some others like L3
> > cache tags, depending on the Archs) [1][2]. For x86, the resource shared
> > by cores at the cluster level is mainly the L2 cache.
> >
> > However, using cluster to define x86's L2 cache topology will cause the
> > compatibility problem:
> >
> > Currently, x86 defaults that the L2 cache is shared in one core, which
> > actually implies a default setting "cores per L2 cache is 1" and
> > therefore implicitly defaults to having as many L2 caches as cores.
> >
> > For example (i386 PC machine):
> > -smp 16,sockets=2,dies=2,cores=2,threads=2,maxcpus=16 (*)
> >
> > Considering the topology of the L2 cache, this (*) implicitly means "1
> > core per L2 cache" and "2 L2 caches per die".
> >
> > If we use cluster to configure L2 cache topology with the new default
> > setting "clusters per L2 cache is 1", the above semantics will change
> > to "2 cores per cluster" and "1 cluster per L2 cache", that is, "2
> > cores per L2 cache".
> >
> > So the same command (*) will cause changes in the L2 cache topology,
> > further affecting the performance of the virtual machine.
> >
> > Therefore, x86 should only treat cluster as a cpu topology level and
> > avoid using it to change L2 cache by default for compatibility.
> >
> > "cluster" in smp is the CPU topology level which is between "core" and
> > die.
> >
> > For x86, the "cluster" in smp is corresponding to the module level [2],
> > which is above the core level. So use the "module" other than "cluster"
> > in i386 code.
> >
> > And please note that x86 already has a cpu topology level also named
> > "cluster" [3], this level is at the upper level of the package. Here,
> > the cluster in x86 cpu topology is completely different from the
> > "clusters" as the smp parameter. After the module level is introduced,
> > the cluster as the smp parameter will actually refer to the module level
> > of x86.
> >
> > [1]: 864c3b5c32f0 ("hw/core/machine: Introduce CPU cluster topology
> > support")
> > [2]: Yanan's comment about "cluster",
> > https://lists.gnu.org/archive/html/qemu-devel/2023-02/msg04051.html
> > [3]: SDM, vol.3, ch.9, 9.9.1 Hierarchical Mapping of Shared Resources.
> >
> > Signed-off-by: Zhuocheng Ding <zhuocheng.ding@intel.com>
> > Co-developed-by: Zhao Liu <zhao1.liu@intel.com>
> > Signed-off-by: Zhao Liu <zhao1.liu@intel.com>
> > Acked-by: Michael S. Tsirkin <mst@redhat.com>
> > ---
> > Changes since v1:
> > * The background of the introduction of the "cluster" parameter and its
> > exact meaning were revised according to Yanan's explanation. (Yanan)
> > ---
> > hw/i386/x86.c | 1 +
> > target/i386/cpu.c | 1 +
> > target/i386/cpu.h | 5 +++++
> > 3 files changed, 7 insertions(+)
>
>
> > diff --git a/target/i386/cpu.h b/target/i386/cpu.h
> > index 470257b92240..556e80f29764 100644
> > --- a/target/i386/cpu.h
> > +++ b/target/i386/cpu.h
> > @@ -1903,6 +1903,11 @@ typedef struct CPUArchState {
> > /* Number of dies within this CPU package. */
> > unsigned nr_dies;
> > + /*
> > + * Number of modules within this CPU package.
> > + * Module level in x86 cpu topology is corresponding to smp.clusters.
> > + */
> > + unsigned nr_modules;
> > } CPUX86State;
>
> It would be really useful to have an ASCII art comment showing
> the architecture topology.
Good idea, I could consider how to show that.
> Also for clarity the topo fields from
> CPU[Arch]State could be moved into a 'topo' sub structure, or even
> clearer would be to re-use the X86CPUTopoIDs structure?
Yeah, I also have the plan to do further cleanup about these topology
structures [1]. X86CPUTopoInfo is not suitable for hybrid topology case,
(hybrid case needs another structure to store the max number elements
for each level), so I will try to move that X86CPUTopoInfo into
CPU[Arch]State.
[1]: https://lists.gnu.org/archive/html/qemu-devel/2023-08/msg01032.html
Thanks,
Zhao
- [PATCH v4 01/21] i386: Fix comment style in topology.h, (continued)
- [PATCH v4 01/21] i386: Fix comment style in topology.h, Zhao Liu, 2023/09/14
- [PATCH v4 04/21] hw/cpu: Update the comments of nr_cores and nr_dies, Zhao Liu, 2023/09/14
- [PATCH v4 05/21] i386/cpu: Fix i/d-cache topology to core level for Intel CPU, Zhao Liu, 2023/09/14
- [PATCH v4 07/21] i386/cpu: Consolidate the use of topo_info in cpu_x86_cpuid(), Zhao Liu, 2023/09/14
- [PATCH v4 08/21] i386: Split topology types of CPUID[0x1F] from the definitions of CPUID[0xB], Zhao Liu, 2023/09/14
- [PATCH v4 06/21] i386/cpu: Use APIC ID offset to encode cache topo in CPUID[4], Zhao Liu, 2023/09/14
- [PATCH v4 10/21] i386: Introduce module-level cpu topology to CPUX86State, Zhao Liu, 2023/09/14
- [PATCH v4 11/21] i386: Support modules_per_die in X86CPUTopoInfo, Zhao Liu, 2023/09/14
- [PATCH v4 09/21] i386: Decouple CPUID[0x1F] subleaf with specific topology level, Zhao Liu, 2023/09/14
- [PATCH v4 12/21] i386: Expose module level in CPUID[0x1F], Zhao Liu, 2023/09/14
- [PATCH v4 13/21] i386: Support module_id in X86CPUTopoIDs, Zhao Liu, 2023/09/14
- [PATCH v4 16/21] hw/i386/pc: Support smp.clusters for x86 PC machine, Zhao Liu, 2023/09/14
- [PATCH v4 14/21] i386/cpu: Introduce cluster-id to X86CPU, Zhao Liu, 2023/09/14
- [PATCH v4 15/21] tests: Add test case of APIC ID for module level parsing, Zhao Liu, 2023/09/14
- [PATCH v4 17/21] i386: Add cache topology info in CPUCacheInfo, Zhao Liu, 2023/09/14
- [PATCH v4 18/21] i386: Use CPUCacheInfo.share_level to encode CPUID[4], Zhao Liu, 2023/09/14
- [PATCH v4 19/21] i386: Use offsets get NumSharingCache for CPUID[0x8000001D].EAX[bits 25:14], Zhao Liu, 2023/09/14