RE: [PATCH RFC 00/22] Support of Virtual CPU Hotplug for ARMv8 Arch
From: Salil Mehta
Subject: RE: [PATCH RFC 00/22] Support of Virtual CPU Hotplug for ARMv8 Arch
Date: Tue, 23 Jun 2020 09:56:54 +0000
> From: Andrew Jones [mailto:drjones@redhat.com]
> Sent: Tuesday, June 23, 2020 10:12 AM
> To: Salil Mehta <salil.mehta@huawei.com>
> Cc: qemu-devel@nongnu.org; qemu-arm@nongnu.org; peter.maydell@linaro.org;
> sudeep.holla@arm.com; gshan@redhat.com; mst@redhat.com; jiakernel2@gmail.com;
> maz@kernel.org; zhukeqian <zhukeqian1@huawei.com>; david@redhat.com;
> richard.henderson@linaro.org; Linuxarm <linuxarm@huawei.com>;
> eric.auger@redhat.com; james.morse@arm.com; catalin.marinas@arm.com;
> imammedo@redhat.com; pbonzini@redhat.com; mehta.salil.lnk@gmail.com;
> maran.wilson@oracle.com; will@kernel.org; wangxiongfeng (C)
> <wangxiongfeng2@huawei.com>
> Subject: Re: [PATCH RFC 00/22] Support of Virtual CPU Hotplug for ARMv8 Arch
>
> On Sat, Jun 13, 2020 at 10:36:07PM +0100, Salil Mehta wrote:
> > This patch-set introduces virtual cpu hotplug support for the ARMv8
> > architecture in QEMU. The idea is to be able to hotplug and hot-unplug vcpus
> > while the guest VM is running, with no reboot required. This does *not* make
> > any assumption about physical cpu hotplug availability within the host
> > system but rather tries to solve the problem at the virtualizer/QEMU layer
> > and by introducing cpu hotplug hooks and event handling within the guest
> > kernel. No changes are required within the host kernel/KVM.
> >
> > Motivation:
> > This allows scaling the guest VM compute capacity on demand, which would be
> > useful for the following example scenarios:
> > 1. Vertical Pod Autoscaling[3][4] in the cloud: part of the orchestration
> >    framework which could adjust resource requests (CPU and memory requests)
> >    for the containers in a pod, based on usage.
> > 2. Pay-as-you-grow business model: the infrastructure provider could allocate
> >    and restrict the total number of compute resources available to the guest
> >    VM according to the SLA (Service Level Agreement). The VM owner could
> >    request more compute to be hot-plugged for some cost.
> >
> > Terminology:
> >
> > (*) Present cpus: Total cpus with which the guest has booted/will boot and
> >     which are available to the guest for use and can be onlined.
> >     - Qemu parameter(-smp)
> > (*) Disabled cpus: Possible cpus which are not available for the guest to
> >     use. These can be hotplugged and made present. They can be thought of
> >     as un-plugged vcpus. They are included as part of sizing.
> > (*) Possible cpus: Total vcpus which could ever exist in the VM. This
> >     includes booted cpus plus any cpus which could be plugged later.
> >     - Qemu parameter(-maxcpus)
> >     - Possible vcpus = Present vcpus (+) Disabled vcpus
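The relation between the three terms can be sketched in a few lines of Python. This is an illustration only: `classify_vcpus` is a hypothetical helper, not part of QEMU, and it assumes that present vcpus occupy the lowest indexes, as the sample guest output in the cover letter suggests.

```python
def classify_vcpus(smp_cpus: int, maxcpus: int) -> dict:
    """Split vcpu indexes into present, disabled and possible lists.

    Hypothetical helper for illustration; assumes present vcpus take
    the lowest indexes (0 .. smp_cpus - 1).
    """
    if not 0 < smp_cpus <= maxcpus:
        raise ValueError("need 0 < smp_cpus <= maxcpus")
    possible = list(range(maxcpus))   # sized by -smp maxcpus=N
    present = possible[:smp_cpus]     # sized by -smp cpus=N
    disabled = possible[smp_cpus:]    # can be hot-plugged later
    return {"present": present, "disabled": disabled, "possible": possible}


layout = classify_vcpus(smp_cpus=4, maxcpus=6)
print(layout["present"])    # [0, 1, 2, 3]
print(layout["disabled"])   # [4, 5]
print(layout["possible"])   # [0, 1, 2, 3, 4, 5]
```

With `-smp cpus=4,maxcpus=6` this gives four present vcpus and two disabled ones, so possible = present + disabled = 6.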
> >
> >
> > Limitations of ARMv8 Architecture:
> >
> > A. Physical Limitation to CPU Hotplug:
> > 1. The ARMv8 architecture does not support the concept of physical cpu
> >    hotplug. The closest thing which is recommended to achieve cpu hotplug on
> >    ARM is to bring down the power state of the cpu using PSCI.
> > 2. Other ARM components like the GIC etc. have not been designed to realize
> >    physical cpu hotplug capability as of now.
> >
> > B. Limitations of GIC to Support Virtual CPU Hotplug:
> > 1. The GIC requires various resources (related to GICR/redistributor,
> >    GICC/cpu interface etc.) like memory regions to be fixed at VM init time;
> >    these cannot be changed later on, after the VM has been initialized.
> > 2. Associations between the GICC (GIC cpu interface) and vcpus get fixed at
> >    VM init time, and the GIC does not allow this association to be changed
> >    once the GIC has initialized.
> >
> > C. Known Limitation of the KVM:
> > 1. As of now KVM allows vcpus to be created but does not allow already
> >    created vcpus to be deleted. QEMU already provides an interface to manage
> >    created vcpus at the KVM level and then to re-use them.
> > 2. Inconsistency in interpretation of the MPIDR generated by KVM for vcpus
> >    vis-a-vis SMT/threads. This does not look to be compliant with the MPIDR
> >    format (SMT is present) as mentioned in the ARMv8 spec. (Please correct my
> >    understanding if I am wrong here.)
> >
> >
> > Workaround to the problems mentioned in Section B & C1:
> > 1. We pre-size the GIC with possible vcpus at VM init time
> > 2. Pre-create all possible vcpus at KVM and associate them with GICC
> > 3. Park the unplugged vcpus (similar to x86)
> >
> >
> > (*) For all of above please refer to Marc's suggestion here[1]
> >
> >
> > Overview of the Approach:
> > At the time of machvirt_init() we pre-create all of the possible ARMCPU
> > objects along with the corresponding KVM vcpus at the host. Disabled KVM
> > vcpus (which are *not* "present" vcpus but are part of the "possible" vcpu
> > list) are parked on the per-VM list "kvm_parked_vcpus" after their
> > initialization.
> >
> > We create the ARMCPU objects (but these are not *realized* in the QOM sense)
> > even for the disabled vcpus to facilitate the GIC initialization (pre-sized
> > with possible vcpus). After initialization of the machine is complete we
> > release the ARMCPU objects for the disabled vcpus. These ARMCPU objects shall
> > be re-created at the time when a vcpu is hot-plugged. The new object is then
> > re-attached to the earlier parked KVM vcpu, which also gets unparked. The
> > ARMCPU object now gets "realized" in QEMU, which means creation of the
> > corresponding threads, pre_plug/plug phases, and event notification to the
> > guest using ACPI GED etc. Similarly, the hot-unplug leg will lead to the
> > "unrealization" of the vcpus and to similar ACPI GED events to the guest for
> > unplug and cleanup; eventually the ARMCPU object shall be released and the
> > KVM vcpus shall be parked again.
> >
> > During machine init, ACPI MADT Table is sized with *possible* vcpus GICC
> > entries. The unplugged/disabled vcpus are presented as MADT GICC DISABLED
> > entries to the guest. This means the guest will have its resources pre-sized
> > with possible vcpus(=present+disabled)
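The park/unpark lifecycle described above can be modelled with a toy state machine. This is a sketch under stated assumptions, not QEMU code: the class `ToyVM` and its method names are hypothetical, and only the bookkeeping is modelled (no threads, no ACPI GED events, no GIC).

```python
class ToyVM:
    """Toy model of pre-created KVM vcpus that are parked and unparked.

    Hypothetical illustration only. All possible vcpus are created once
    at init and never destroyed; hot-(un)plug only moves a vcpu index
    between the 'realized' set and the 'parked' set.
    """

    def __init__(self, present: int, possible: int):
        if not 0 < present <= possible:
            raise ValueError("need 0 < present <= possible")
        self.kvm_vcpus = set(range(possible))          # fixed at VM init
        self.realized = set(range(present))            # visible as present
        self.parked = set(range(present, possible))    # disabled vcpus

    def hotplug(self, idx: int) -> None:
        # Re-attach a new CPU object to the earlier parked KVM vcpu.
        if idx not in self.parked:
            raise ValueError(f"vcpu {idx} is not parked")
        self.parked.remove(idx)
        self.realized.add(idx)

    def hotunplug(self, idx: int) -> None:
        # Unrealize the CPU object and park the KVM vcpu again.
        if idx not in self.realized:
            raise ValueError(f"vcpu {idx} is not realized")
        self.realized.remove(idx)
        self.parked.add(idx)


vm = ToyVM(present=4, possible=6)
vm.hotplug(4)
print(sorted(vm.realized), sorted(vm.parked))   # [0, 1, 2, 3, 4] [5]
vm.hotunplug(4)
print(sorted(vm.realized), sorted(vm.parked))   # [0, 1, 2, 3] [4, 5]
```

The point of the model is that `kvm_vcpus` never changes size after init, which is what lets the GIC and the MADT be sized once with the possible vcpus.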
> >
> > Other approaches to deal with ARMCPU object release(after machine init):
> > 1. The ARMCPU objects for the disabled vcpus are released in the context of
> >    the virt_machine_done() notifier (approach adopted in this patch-set).
> > 2. Defer the release of the current ARMCPU object till the new vcpu object is
> >    hot-plugged.
> > 3. Never release the objects; keep on reusing them and release once at VM
> >    exit. This solves many problems with the above 2 approaches but requires a
> >    change in the way qdev_device_add() fetches/creates the ARMCPU object for
> >    the new vcpus being hotplugged. For the arm cpu hotplug case we need to
> >    figure out how to get access to the old object and use it to "re-realize"
> >    instead of the new ARMCPU object.
> >
> > Concerns/Questions:
> > 1. In the ARM arch a cpu is uniquely represented in the hierarchy using
> >    various affinity levels which could represent thread, core, cluster,
> >    package. This is generally represented by a value in the MPIDR register
> >    as per the format mentioned in the specification. Now, the MPIDR value
> >    for vcpus is derived using the vcpu-index. The concept of a thread is not
> >    quite the same and rather gets lost in the derivation of the MPIDR for
> >    vcpus.
> > 2. The topology info used to specify the vcpu while hot-plugging might not
> >    match the MPIDR value given back by KVM for the vcpu at the time of
> >    init. The concept of the SMT bit in MPIDR gets lost as per the derivation
> >    being done in KVM. Hence, thread-id, core-id, socket-id, if used as
> >    topology info to derive the MPIDR value as per the ARM specification, will
> >    not match the MPIDR actually assigned by KVM?
> >    Perhaps we need to carry forward the work of Andrew? Please check here[2]
> > 3. Further, if this info is supplied to the guest using PPTT (once introduced
> >    in QEMU), or even derived using MPIDR, it shall be inconsistent with the
> >    host vcpu.
> > 4. Any possibility of interrupts (SGI/PPI/LPI/SPI) always remaining in
> >    *pending* state for the cpus which have been hot-unplugged? IMHO it looks
> >    okay but will need Marc's confirmation on this.
> > 5. If the ARMCPU object is released after the machine init, UEFI could call
> >    back virt_update_table() to re-build the ACPI tables, which might need an
> >    ARMCPU object. Please check the discussion here[5]
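Concerns 1 and 2 can be illustrated with a minimal sketch of an index-based MPIDR derivation. The field split used here (4 bits of the vcpu index into Aff0, the next 8 bits into Aff1, the next 8 into Aff2, with bit 31 set) is an assumption modelled on how the arm64 KVM reset path appears to pack the vcpu id; the function name is hypothetical and the exact masks should be checked against the host kernel source.

```python
MPIDR_RES1 = 1 << 31   # bit 31 reads as one in MPIDR_EL1
MPIDR_MT = 1 << 24     # MT bit: lowest affinity level is a thread (SMT)


def mpidr_from_vcpu_id(vcpu_id: int) -> int:
    """Pack a flat vcpu index into MPIDR affinity fields.

    Assumed layout (not taken from the cover letter): Aff0 is limited
    to 16 cpus, Aff1 and Aff2 take the next 8 bits each. The MT bit is
    never set, so any thread-level meaning is lost.
    """
    aff0 = vcpu_id & 0x0F
    aff1 = (vcpu_id >> 4) & 0xFF
    aff2 = (vcpu_id >> 12) & 0xFF
    return MPIDR_RES1 | (aff2 << 16) | (aff1 << 8) | aff0


for idx in (0, 4, 17):
    value = mpidr_from_vcpu_id(idx)
    print(idx, hex(value), bool(value & MPIDR_MT))
# 0 0x80000000 False
# 4 0x80000004 False
# 17 0x80000101 False
```

Under this assumption vcpu 17 lands in Aff1=1, Aff0=1 regardless of any thread-id/core-id/socket-id given on the command line, which is the mismatch the concerns describe.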
> >
> >
> > Commands Used:
> >
> > A. Qemu launch commands to init the machine
> >
> > $ qemu-system-aarch64 --enable-kvm -machine virt,gic-version=3 \
> > -cpu host -smp cpus=4,maxcpus=6 \
> > -m 300M \
> > -kernel Image \
> > -initrd rootfs.cpio.gz \
> > -append "console=ttyAMA0 root=/dev/ram rdinit=/init maxcpus=2 acpi=force" \
> > -nographic \
> > -bios QEMU_EFI.fd
> >
> > B. Hot-(un)plug related commands
> >
> > # Hotplug a host vcpu(accel=kvm)
> > $ device_add host-arm-cpu,id=core4,core-id=4
> >
> > # Hotplug a vcpu(accel=tcg)
> > $ device_add cortex-a57-arm-cpu,id=core4,core-id=4
> >
> > # Delete the vcpu
> > $ device_del core4
> >
> > NOTE: I have not tested the current solution with the '-device' interface.
> >       Its use was suggested by Igor here[6]. I will test this in due course,
> >       but it looks like it should work with the existing changes.
> >
> >
> > Sample output on guest after boot:
> >
> > $ cat /sys/devices/system/cpu/possible
> > 0-5
> > $ cat /sys/devices/system/cpu/present
> > 0-3
> > $ cat /sys/devices/system/cpu/online
> > 0-1
> > $ cat /sys/devices/system/cpu/offline
> > 2-5
> >
> >
> > Sample output on guest after hotplug of vcpu=4:
> >
> > $ cat /sys/devices/system/cpu/possible
> > 0-5
> > $ cat /sys/devices/system/cpu/present
> > 0-4
> > $ cat /sys/devices/system/cpu/online
> > 0-1,4
> > $ cat /sys/devices/system/cpu/offline
> > 2-3,5
> >
> > Note: vcpu=4 was explicitly 'onlined' after hot-plug
> > $ echo 1 > /sys/devices/system/cpu/cpu4/online
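The sysfs files above use the kernel's cpu-list format (comma-separated indexes and inclusive ranges). For scripting checks against them, a minimal Python sketch follows; `parse_cpu_list` is a hypothetical helper, not part of any tool mentioned in this thread, and it handles only plain `a-b` ranges and single indexes.

```python
def parse_cpu_list(text: str) -> set:
    """Parse a cpu list such as '0-1,4' into a set of cpu indexes."""
    cpus = set()
    for chunk in text.strip().split(","):
        if not chunk:
            continue                      # empty file means no cpus
        first, _, last = chunk.partition("-")
        lo = int(first)
        hi = int(last) if last else lo    # single index, e.g. '4'
        cpus.update(range(lo, hi + 1))    # ranges are inclusive
    return cpus


# Values copied from the sample output after hotplug of vcpu=4.
possible = parse_cpu_list("0-5")
present = parse_cpu_list("0-4")
online = parse_cpu_list("0-1,4")
offline = parse_cpu_list("2-3,5")

print(sorted(possible - present))          # [5]
print(online | offline == possible)        # True
```

On a guest, the same function can be fed the contents of `/sys/devices/system/cpu/online` and the other files shown above.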
> >
> >
> > Repository:
> > (*) QEMU changes for vcpu hotplug could be cloned from below site,
> > https://github.com/salil-mehta/qemu.git virt-cpuhp-armv8/rfc-v1
> >
> > (*) Guest kernel changes required to work with these QEMU changes shall be
> >     posted soon and the repo made available at the above site.
> >
> >
> > THINGS TO DO:
> > (*) Migration support
> > (*) TCG/emulation support is not proper right now. It works to a certain
> >     extent but is not complete, especially the unrealize part, in which
> >     there is an overflow of tcg contexts. This is because tcg maintains a
> >     count of the number of contexts (per thread instance), so as we hotplug
> >     vcpus this counter keeps on incrementing, but during hot-unplug the
> >     counter is not decremented.
> > (*) Support of hotplug with NUMA is not proper
> > (*) CPU topology right now is not specified using thread/core/socket but is
> >     rather flatly indexed using core-id. This needs consideration[2].
> > (*) Do we need PPTT support to specify the right topology info to the guest
> >     about hot-plugged or unplugged vcpus?
> > (*) Test cases
> > (*) Docs need to be updated.
> >
> >
>
> Hi Salil,
>
> I realize this is just a preliminary posting and the approach hasn't been
> finalized, but maybe in a future posting we can put a lot of this
> information into a doc patch. I think we'll need good documentation for
> this feature to ensure we get it right and keep it maintained correctly.
Sure, let us do it once we converge on the concept.
Thanks
Salil.