qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v5 09/10] x68: acpi: trigger SMI before sending hotplug Notif


From: Laszlo Ersek
Subject: Re: [PATCH v5 09/10] x68: acpi: trigger SMI before sending hotplug Notify event to OSPM
Date: Mon, 7 Sep 2020 17:17:52 +0200

Hi Igor,

On 09/07/20 13:23, Igor Mammedov wrote:
> In case firmware has negotiated CPU hotplug SMI feature, generate
> AML to describe SMI IO port region and send SMI to firmware
> on each CPU hotplug SCI in case new CPUs were hotplugged.
>
> Since new CPUs can be hotplugged while CPU_SCAN_METHOD is running
> we can't send SMI before new CPUs are fetched from QEMU as it
> could cause sending Notify to a CPU that firmware hasn't seen yet.
> So fetch new CPUs into local cache first, then send SMI and
> after that send Notify events to cached CPUs. This should ensure
> that Notify is sent only to CPUs which were processed by firmware
> first.
> Any CPUs that were hotplugged after caching will be processed
> by the next CPU_SCAN_METHOD, when pending SCI is handled.
>
> Signed-off-by: Igor Mammedov <imammedo@redhat.com>
> ---
> v5:
>  - fix hotplug on Windows when there is more than 256 possible CPUs
>    (Windows isn't able to handle VarPackage over 255 elements
>     so process CPUs in batches)
>  - (Laszlo Ersek <lersek@redhat.com>)
>    - fix off-by-one in package length
>    - fix not selecting CPU before clearing insert event
>    - use aml_lgreater() instead of aml_lnot(aml_equal(num_added_cpus,
>      zero)
>    - split off in separate patches:
>      - making smi_negotiated_features a property
>      - introduction of AcpiPmInfo.smi_on_cpuhp
>      - introduction of PCI0.SMI0 ACPI device
> v2:
>  - clear insert event after firmware has returned
>    control from SMI. (Laszlo Ersek <lersek@redhat.com>)
> v1:
>  - make sure that Notify is sent only to CPUs seen by FW
>  - (Laszlo Ersek <lersek@redhat.com>)
>     - use macro instead of x-smi-negotiated-features
>     - style fixes
>     - reserve whole SMI block (0xB2, 0xB3)
> v0:
>  - s/aml_string/aml_eisaid/
>    /fixes Windows BSOD, on nonsense value/ (Laszlo Ersek
>    <lersek@redhat.com>)
>  - put SMI device under PCI0 like the rest of hotplug IO port
>  - do not generate SMI AML if CPU hotplug SMI feature hasn't been
>    negotiated
> ---
>  hw/acpi/cpu.c | 156 +++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 129 insertions(+), 27 deletions(-)
>
> diff --git a/hw/acpi/cpu.c b/hw/acpi/cpu.c
> index 3d6a500fb7..1283972001 100644
> --- a/hw/acpi/cpu.c
> +++ b/hw/acpi/cpu.c
> @@ -14,6 +14,8 @@
>  #define ACPI_CPU_CMD_DATA_OFFSET_RW 8
>  #define ACPI_CPU_CMD_DATA2_OFFSET_R 0
>
> +#define OVMF_CPUHP_SMI_CMD 4
> +
>  enum {
>      CPHP_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
>      CPHP_OST_EVENT_CMD = 1,
> @@ -321,6 +323,7 @@ const VMStateDescription vmstate_cpu_hotplug = {
>  #define CPU_NOTIFY_METHOD "CTFY"
>  #define CPU_EJECT_METHOD  "CEJ0"
>  #define CPU_OST_METHOD    "COST"
> +#define CPU_ADDED_LIST    "CNEW"
>
>  #define CPU_ENABLED       "CPEN"
>  #define CPU_SELECTOR      "CSEL"
> @@ -465,42 +468,141 @@ void build_cpus_aml(Aml *table, MachineState *machine, 
> CPUHotplugFeatures opts,
>
>          method = aml_method(CPU_SCAN_METHOD, 0, AML_SERIALIZED);
>          {
> +            const uint8_t max_cpus_per_pass = 255;
>              Aml *else_ctx;
> -            Aml *while_ctx;
> +            Aml *while_ctx, *while_ctx2;
>              Aml *has_event = aml_local(0);
>              Aml *dev_chk = aml_int(1);
>              Aml *eject_req = aml_int(3);
>              Aml *next_cpu_cmd = aml_int(CPHP_GET_NEXT_CPU_WITH_EVENT_CMD);
> +            Aml *num_added_cpus = aml_local(1);
> +            Aml *cpu_idx = aml_local(2);
> +            Aml *uid = aml_local(3);
> +            Aml *has_job = aml_local(4);
> +            Aml *new_cpus = aml_name(CPU_ADDED_LIST);
>
>              aml_append(method, aml_acquire(ctrl_lock, 0xFFFF));
> -            aml_append(method, aml_store(one, has_event));
> -            while_ctx = aml_while(aml_equal(has_event, one));
> +
> +            /*
> +             * Windows versions newer than XP (including Windows 10/Windows
> +             * Server 2019), do support* VarPackageOp but, it is cripled to 
> hold
> +             * the same elements number as old PackageOp.
> +             * For compatibility with Windows XP (so it won't crash) use 
> ACPI1.0
> +             * PackageOp which can hold max 255 elements.
> +             *
> +             * use named package as old Windows don't support it in local var
> +             */
> +            aml_append(method, aml_name_decl(CPU_ADDED_LIST,
> +                                             
> aml_package(max_cpus_per_pass)));
> +
> +            aml_append(method, aml_store(zero, uid));
> +            aml_append(method, aml_store(one, has_job));
> +            /*
> +             * CPU_ADDED_LIST can hold limited number of elements, outer loop
> +             * allows to process CPUs in batches which let us to handle more
> +             * CPUs than CPU_ADDED_LIST can hold.
> +             */
> +            while_ctx2 = aml_while(aml_equal(has_job, one));
>              {
> -                 /* clear loop exit condition, ins_evt/rm_evt checks
> -                  * will set it to 1 while next_cpu_cmd returns a CPU
> -                  * with events */
> -                 aml_append(while_ctx, aml_store(zero, has_event));
> -                 aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd));
> -                 ifctx = aml_if(aml_equal(ins_evt, one));
> -                 {
> -                     aml_append(ifctx,
> -                         aml_call2(CPU_NOTIFY_METHOD, cpu_data, dev_chk));
> -                     aml_append(ifctx, aml_store(one, ins_evt));
> -                     aml_append(ifctx, aml_store(one, has_event));
> -                 }
> -                 aml_append(while_ctx, ifctx);
> -                 else_ctx = aml_else();
> -                 ifctx = aml_if(aml_equal(rm_evt, one));
> -                 {
> -                     aml_append(ifctx,
> -                         aml_call2(CPU_NOTIFY_METHOD, cpu_data, eject_req));
> -                     aml_append(ifctx, aml_store(one, rm_evt));
> -                     aml_append(ifctx, aml_store(one, has_event));
> -                 }
> -                 aml_append(else_ctx, ifctx);
> -                 aml_append(while_ctx, else_ctx);
> +                aml_append(while_ctx2, aml_store(zero, has_job));
> +
> +                aml_append(while_ctx2, aml_store(one, has_event));
> +                aml_append(while_ctx2, aml_store(zero, num_added_cpus));
> +
> +                /*
> +                 * Scan CPUs, till there are CPUs with events or
> +                 * CPU_ADDED_LIST capacity is exhausted
> +                 */
> +                while_ctx = aml_while(aml_land(aml_equal(has_event, one),
> +                                      aml_lless(uid, 
> aml_int(arch_ids->len))));
> +                {
> +                     /*
> +                      * clear loop exit condition, ins_evt/rm_evt checks will
> +                      * set it to 1 while next_cpu_cmd returns a CPU with 
> events
> +                      */
> +                     aml_append(while_ctx, aml_store(zero, has_event));
> +
> +                     aml_append(while_ctx, aml_store(uid, cpu_selector));
> +                     aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd));
> +
> +                     /*
> +                      * wrap around case, scan is complete, exit loop.
> +                      * It happens since events are not cleared in scan loop,
> +                      * so next_cpu_cmd continues to find already processed 
> CPUs
> +                      */
> +                     ifctx = aml_if(aml_lless(cpu_data, uid));
> +                     {
> +                         aml_append(ifctx, aml_break());
> +                     }
> +                     aml_append(while_ctx, ifctx);
> +
> +                     /*
> +                      * if CPU_ADDED_LIST is full, exit inner loop and 
> process
> +                      * collected CPUs
> +                      */
> +                     ifctx = aml_if(
> +                         aml_equal(num_added_cpus, 
> aml_int(max_cpus_per_pass)));
> +                     {
> +                         aml_append(ifctx, aml_store(one, has_job));
> +                         aml_append(ifctx, aml_break());
> +                     }
> +                     aml_append(while_ctx, ifctx);
> +
> +                     aml_append(while_ctx, aml_store(cpu_data, uid));
> +                     ifctx = aml_if(aml_equal(ins_evt, one));
> +                     {
> +                         /* cache added CPUs to Notify/Wakeup later */
> +                         aml_append(ifctx, aml_store(uid,
> +                             aml_index(new_cpus, num_added_cpus)));
> +                         aml_append(ifctx, aml_increment(num_added_cpus));
> +                         aml_append(ifctx, aml_store(one, has_event));
> +                     }
> +                     aml_append(while_ctx, ifctx);
> +                     else_ctx = aml_else();
> +                     ifctx = aml_if(aml_equal(rm_evt, one));
> +                     {
> +                         aml_append(ifctx,
> +                             aml_call2(CPU_NOTIFY_METHOD, uid, eject_req));
> +                         aml_append(ifctx, aml_store(one, rm_evt));
> +                         aml_append(ifctx, aml_store(one, has_event));
> +                     }
> +                     aml_append(else_ctx, ifctx);
> +                     aml_append(while_ctx, else_ctx);
> +                     aml_append(while_ctx, aml_increment(uid));
> +                }
> +                aml_append(while_ctx2, while_ctx);
> +
> +                /*
> +                 * in case FW negotiated ICH9_LPC_SMI_F_CPU_HOTPLUG_BIT,
> +                 * make upcall to FW, so it can pull in new CPUs before
> +                 * OS is notified and wakes them up
> +                 */
> +                if (opts.smi_path) {
> +                    ifctx = aml_if(aml_lgreater(num_added_cpus, zero));
> +                    {
> +                        aml_append(ifctx, 
> aml_store(aml_int(OVMF_CPUHP_SMI_CMD),
> +                            aml_name("%s", opts.smi_path)));
> +                    }
> +                    aml_append(while_ctx2, ifctx);
> +                }
> +
> +                /* Notify OSPM about new CPUs and clear insert events */
> +                aml_append(while_ctx2, aml_store(zero, cpu_idx));
> +                while_ctx = aml_while(aml_lless(cpu_idx, num_added_cpus));
> +                {
> +                    aml_append(while_ctx,
> +                        aml_store(aml_derefof(aml_index(new_cpus, cpu_idx)),
> +                                  uid));
> +                    aml_append(while_ctx,
> +                        aml_call2(CPU_NOTIFY_METHOD, uid, dev_chk));
> +                    aml_append(while_ctx, aml_store(uid, aml_debug()));
> +                    aml_append(while_ctx, aml_store(uid, cpu_selector));
> +                    aml_append(while_ctx, aml_store(one, ins_evt));
> +                    aml_append(while_ctx, aml_increment(cpu_idx));
> +                }
> +                aml_append(while_ctx2, while_ctx);
>              }
> -            aml_append(method, while_ctx);
> +            aml_append(method, while_ctx2);
>              aml_append(method, aml_release(ctrl_lock));
>          }
>          aml_append(cpus_dev, method);
>

Here's the ASL decompiled, and using the meaningful local variable names
(and 8 possible VCPUs):

   1              Method (CSCN, 0, Serialized)
   2              {
   3                  Acquire (\_SB.PCI0.PRES.CPLK, 0xFFFF)
   4                  Name (CNEW, Package (0xFF){})
   5                  Local_Uid = Zero
   6                  Local_HasJob = One
   7                  While ((Local_HasJob == One))
   8                  {
   9                      Local_HasJob = Zero
  10                      Local_HasEvent = One
  11                      Local_NumAddedCpus = Zero
  12                      While (((Local_HasEvent == One) && (Local_Uid < 
0x08)))
  13                      {
  14                          Local_HasEvent = Zero
  15                          \_SB.PCI0.PRES.CSEL = Local_Uid
  16                          \_SB.PCI0.PRES.CCMD = Zero
  17                          If ((\_SB.PCI0.PRES.CDAT < Local_Uid))
  18                          {
  19                              Break
  20                          }
  21
  22                          If ((Local_NumAddedCpus == 0xFF))
  23                          {
  24                              Local_HasJob = One
  25                              Break
  26                          }
  27
  28                          Local_Uid = \_SB.PCI0.PRES.CDAT
  29                          If ((\_SB.PCI0.PRES.CINS == One))
  30                          {
  31                              CNEW [Local_NumAddedCpus] = Local_Uid
  32                              Local_NumAddedCpus++
  33                              Local_HasEvent = One
  34                          }
  35                          ElseIf ((\_SB.PCI0.PRES.CRMV == One))
  36                          {
  37                              CTFY (Local_Uid, 0x03)
  38                              \_SB.PCI0.PRES.CRMV = One
  39                              Local_HasEvent = One
  40                          }
  41
  42                          Local_Uid++
  43                      }
  44
  45                      If ((Local_NumAddedCpus > Zero))
  46                      {
  47                          \_SB.PCI0.SMI0.SMIC = 0x04
  48                      }
  49
  50                      Local_CpuIdx = Zero
  51                      While ((Local_CpuIdx < Local_NumAddedCpus))
  52                      {
  53                          Local_Uid = DerefOf (CNEW [Local_CpuIdx])
  54                          CTFY (Local_Uid, One)
  55                          Debug = Local_Uid
  56                          \_SB.PCI0.PRES.CSEL = Local_Uid
  57                          \_SB.PCI0.PRES.CINS = One
  58                          Local_CpuIdx++
  59                      }
  60                  }
  61
  62                  Release (\_SB.PCI0.PRES.CPLK)
  63              }

When we take the Break on line 25, then:

(a) on line 25, the following equality holds:

  Local_Uid == CNEW[Local_NumAddedCpus - 1] + 1

(b) on line 60, the following equality holds:

  Local_Uid == CNEW[Local_NumAddedCpus - 1]

This means that, when we write Local_Uid to CSEL on line 15 again, then:

- we effectively re-investigate the last-cached CPU (with selector value
  CNEW[Local_NumAddedCpus - 1])

- rather than resuming the scanning right after it -- that is, with
  selector value (CNEW[Local_NumAddedCpus - 1] + 1) --, in the spirit of
  line 42.

My question is: is this "rewind" intentional?

Now, I don't see a functionality problem with this, as on line 57, we
clear the pending insert event for the last-cached CPU, so when we
re-check it, the "get pending" command will simply seek past it.

But I'd like to understand if this is *precisely* your intent, or if
it's an oversight and it just ends up working OK.

Basically my question is whether we should add, on top:

> --- cscn.v5     2020-09-07 15:02:13.401663487 +0200
> +++ cscn.v5.incr        2020-09-07 16:52:22.133843710 +0200
> @@ -57,6 +57,10 @@
>                          \_SB.PCI0.PRES.CINS = One
>                          Local_CpuIdx++
>                      }
> +
> +                    if ((Local_HasJob == One)) {
> +                        Local_Uid++
> +                    }
>                  }
>
>                  Release (\_SB.PCI0.PRES.CPLK)

If not -- that is, the currently proposed patch is intentional --, then
I think we should add a comment, about the effective "rewind" being
intentional & okay.

(Note: it's certainly valid and necessary to re-write CSEL on line 15
after raising the SMI on line 47; the question is not whether we should
or should not re-write CSEL (we must!), but the specific value that we
write to CSEL.)

So:

- If the outer loop body is entered only once, then the patch looks
  fine, from both the review side, and the testing side (I tested it
  with 4-8 possible VCPUs).

- If the outer loop body is entered twice or more, then from the review
  side, please my question above. From the testing side: do you have an
  environment where I could test this with OVMF?

(I expect it to work OK. Upon the first SMI, the firmware will likely
pick up more VCPUs than what's in CNEW. But edk2 commit 020bb4b46d6f
("OvmfPkg/CpuHotplugSmm: fix CPU hotplug race just before SMI
broadcast", 2020-08-27) should deal with that.)

Hmm, actually, there's no need for a special environment: I can patch
QEMU and lower "max_cpus_per_pass" to something small, such as "3" or
whatever, for testing the outer loop multiple times. But first I'd like
to know your thoughts on the "rewind".

Thanks!
Laszlo




reply via email to

[Prev in Thread] Current Thread [Next in Thread]