Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory

qemu-devel

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory

From:	Tao Xu
Subject:	Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information
Date:	Wed, 14 Aug 2019 10:58:20 +0800
User-agent:	Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.8.0

On 8/13/2019 11:11 PM, Eric Blake wrote:

On 8/9/19 1:57 AM, Tao wrote:

From: Liu Jingqi <address@hidden>

Add -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in ACPI Heterogeneous Memory Attribute Table (HMAT).

Signed-off-by: Liu Jingqi <address@hidden>
Signed-off-by: Tao Xu <address@hidden>
---

Changes in v9:
     - change the CLI input way, make it more user firendly (Daniel Black)
     use latency=NUM[p|n|u]s and bandwidth=NUM[M|G|P](B/s) as input and drop
     the base-lat and base-bw input.


Why are you hand-rolling yet another scaling parser instead of reusing
one that's already in-tree?

Because there are no time scaling parser and QMP 'size' type will use kbas default. It is a tricky issue because the entry in HMAT is small(max0xffff) and we need to store the unit in HMAT.


But as you mentioned blew, 'str' is not a good choice for QMP.
Therefore, what about this solution:

For bandwidth, reuse the qemu_strtosz_MiB() (because the smllest unit isMB/s). For latency, write a time scaling parser named as"qemu_strtotime_ps()" and "qemu_strtotime_ns()" in util/cutils.c. Andthen use it to pre-convert them into the single scale (QMP interface canuse).

At last, in HMAT, we auto store the data, separate it into the same baseunit and entry, and show error if overflow. Then the HMAT can support aslarge as possible.


I am wondering if this solution is OK.

+++ b/hw/core/numa.c

+void parse_numa_hmat_lb(MachineState *ms, NumaHmatLBOptions *node,
+                        Error **errp)
+{

+    if (node->has_latency) {
+        hmat_lb = ms->numa_state->hmat_lb[node->hierarchy][node->data_type];
+
+        if (!hmat_lb) {
+            hmat_lb = g_malloc0(sizeof(*hmat_lb));
+            ms->numa_state->hmat_lb[node->hierarchy][node->data_type] = 
hmat_lb;
+        } else if (hmat_lb->latency[node->initiator][node->target]) {
+            error_setg(errp, "Duplicate configuration of the latency for "
+                       "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+                       node->initiator, node->target);
+            return;
+        }
+
+        ret = qemu_strtoui(node->latency, &endptr, 10, &latency);
+        if (ret < 0) {
+            error_setg(errp, "Invalid latency %s", node->latency);
+            return;
+        }
+
+        if (*endptr == '\0') {
+            base_lat = 1;
+        } else if (*(endptr + 1) == 's') {
+            switch (*endptr) {
+            case 'p':
+                base_lat = 1;
+                break;
+            case 'n':
+                base_lat = PICO_PER_NSEC;
+                break;
+            case 'u':
+                base_lat = PICO_PER_USEC;
+                break;


Hmm - this is a different scaling than any of our existing parsers
(which assume multiples k/M/G..., not subdivisions u/n/s)

+    if (node->has_bandwidth) {
+        hmat_lb = ms->numa_state->hmat_lb[node->hierarchy][node->data_type];
+
+        if (!hmat_lb) {
+            hmat_lb = g_malloc0(sizeof(*hmat_lb));
+            ms->numa_state->hmat_lb[node->hierarchy][node->data_type] = 
hmat_lb;
+        } else if (hmat_lb->bandwidth[node->initiator][node->target]) {
+            error_setg(errp, "Duplicate configuration of the bandwidth for "
+                       "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+                       node->initiator, node->target);
+            return;
+        }
+
+        ret = qemu_strtoui(node->bandwidth, &endptr, 10, &bandwidth);
+        if (ret < 0) {
+            error_setg(errp, "Invalid bandwidth %s", node->bandwidth);
+            return;
+        }
+
+        switch (toupper(*endptr)) {
+        case '\0':
+        case 'M':
+            base_bw = 1;
+            break;
+        case 'G':
+            base_bw = UINT64_C(1) << 10;
+            break;
+        case 'P':
+            base_bw = UINT64_C(1) << 20;
+            break;


But this one, in addition to being wrong (P is 1<<30, not 1<<20), should
definitely be reusing qemu_strtosz_metric() or similar (look in
util/cutils.c).

+++ b/qapi/machine.json
@@ -377,10 +377,12 @@
  #
  # @cpu: property based CPU(s) to node mapping (Since: 2.10)
  #
+# @hmat-lb: memory latency and bandwidth information (Since: 4.2)
+#
  # Since: 2.1
  ##
  { 'enum': 'NumaOptionsType',
-  'data': [ 'node', 'dist', 'cpu' ] }
+  'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }

+##
+# @HmatLBDataType:
+#
+# Data type in the System Locality Latency
+# and Bandwidth Information Structure of HMAT (Heterogeneous
+# Memory Attribute Table)
+#
+# For more information of @HmatLBDataType see
+# the chapter 5.2.27.4: Table 5-142:  Field "Data Type" of ACPI 6.3 spec.
+#
+# @access-latency: access latency (picoseconds)
+#
+# @read-latency: read latency (picoseconds)
+#
+# @write-latency: write latency (picoseconds)
+#
+# @access-bandwidth: access bandwidth (MB/s)
+#
+# @read-bandwidth: read bandwidth (MB/s)
+#
+# @write-bandwidth: write bandwidth (MB/s)


Are these really the best scales?

+
+##
+# @NumaHmatLBOptions:
+#
+# Set the system locality latency and bandwidth information
+# between Initiator and Target proximity Domains.
+#
+# For more information of @NumaHmatLBOptions see
+# the chapter 5.2.27.4: Table 5-142 of ACPI 6.3 spec.
+#
+# @initiator: the Initiator Proximity Domain.
+#
+# @target: the Target Proximity Domain.
+#
+# @hierarchy: the Memory Hierarchy. Indicates the performance
+#             of memory or side cache.
+#
+# @data-type: presents the type of data, access/read/write
+#             latency or hit latency.
+#
+# @latency: the value of latency from @initiator to @target proximity domain,
+#           the latency units are "ps(picosecond)", "ns(nanosecond)" or
+#           "us(microsecond)".
+#
+# @bandwidth: the value of bandwidth between @initiator and @target proximity
+#             domain, the bandwidth units are "MB(/s)","GB(/s)" or "PB(/s)".
+#
+# Since: 4.2
+##
+{ 'struct': 'NumaHmatLBOptions',
+    'data': {
+    'initiator': 'uint16',
+    'target': 'uint16',
+    'hierarchy': 'HmatLBMemoryHierarchy',
+    'data-type': 'HmatLBDataType',
+    '*latency': 'str',
+    '*bandwidth': 'str' }}


...and then parsing strings instead of taking raw integers?  Parsing
strings is okay for HMP, but for QMP, our goal should be a single
representation with no additional sugar on top.  Latency and bandwidth
should be int in a single scale.

+++ b/qemu-options.hx
@@ -164,16 +164,19 @@ DEF("numa", HAS_ARG, QEMU_OPTION_numa,
      "-numa 
node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
      "-numa 
node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node][,initiator=node]\n"
      "-numa dist,src=source,dst=destination,val=distance\n"
-    "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n",
+    "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
+    "-numa 
hmat-lb,initiator=node,target=node,hierarchy=memory|first-level|second-level|third-level,data-type=access-latency|read-latency|write-latency[,latency=lat][,bandwidth=bw]\n",


Command-line parsing can then take human-written scaled numbers, and
pre-convert them into the single scale accepted by the QMP interface.

[Prev in Thread]

Current Thread

[Next in Thread]

Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes, (continued)
- [Qemu-devel] [PATCH v9 06/11] hmat acpi: Build Memory Proximity Domain Attributes Structure(s), Tao, 2019/08/09
- [Qemu-devel] [PATCH v9 07/11] hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s), Tao, 2019/08/09
- [Qemu-devel] [PATCH v9 08/11] hmat acpi: Build Memory Side Cache Information Structure(s), Tao, 2019/08/09
- [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information, Tao, 2019/08/09
  - Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information, Daniel Black, 2019/08/12
    - Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information, Tao Xu, 2019/08/12
  - Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information, Eric Blake, 2019/08/13
    - Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information, Tao Xu <=
- [Qemu-devel] [PATCH v9 10/11] numa: Extend the CLI to provide memory side cache information, Tao, 2019/08/09
- [Qemu-devel] [PATCH v9 11/11] tests/bios-tables-test: add test cases for ACPI HMAT, Tao, 2019/08/09
- Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT), no-reply, 2019/08/09
- Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT), Tao Xu, 2019/08/13
  - Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT), Eduardo Habkost, 2019/08/14
    - Re: [Qemu-devel] [PATCH v9 00/11] Build ACPI Heterogeneous Memory Attribute Table (HMAT), Tao Xu, 2019/08/14

Prev by Date: Re: [Qemu-devel] [PATCH v9 05/11] numa: Extend CLI to provide initiator information for numa nodes
Next by Date: Re: [Qemu-devel] [PATCH] spapr/xive: Mask the EAS when allocating an IRQ
Previous by thread: Re: [Qemu-devel] [PATCH v9 09/11] numa: Extend the CLI to provide memory latency and bandwidth information
Next by thread: [Qemu-devel] [PATCH v9 10/11] numa: Extend the CLI to provide memory side cache information
Index(es):
- Date
- Thread