Re: [PATCH v3 6/7] spapr_drc.c: add hotunplug timeout for CPUs
From: David Gibson
Subject: Re: [PATCH v3 6/7] spapr_drc.c: add hotunplug timeout for CPUs
Date: Wed, 17 Feb 2021 12:23:30 +1100
On Thu, Feb 11, 2021 at 07:52:45PM -0300, Daniel Henrique Barboza wrote:
> There is a reliable way to make a CPU hotunplug fail in the pseries
> machine. Hotplug a CPU A, then offline all other CPUs inside the guest
> but A. When trying to hotunplug A the guest kernel will refuse to do
> it, because A is now the last online CPU of the guest. PAPR has no
> 'error callback' in this situation to report back to the platform,
> so the guest kernel will silently deny the unplug and QEMU will never
> know what happened. The unplug pending state of A will remain until
> the guest is shut down or rebooted.
>
> Previous attempts to fix it (see [1] and [2]) were aimed at trying to
> mitigate the effects of the problem. In [1] we were trying to guess which
> guest CPUs were online to forbid hotunplug of the last online CPU in the QEMU
> layer, avoiding the scenario described above because QEMU is now failing
> on behalf of the guest. This is not robust because the last online CPU of
> the guest can change while we're in the middle of the unplug process, and
> our initial assumptions are now invalid. In [2] we were accepting that our
> unplug process is uncertain and the user should be allowed to spam the IRQ
> hotunplug queue of the guest in case the CPU hotunplug fails.
>
> This patch presents another alternative, using the timeout infrastructure
> introduced in the previous patch. CPU hotunplugs in the pSeries machine will
> now time out after 15 seconds. This is a long time for a single CPU unplug
> to occur, regardless of guest load - although the user is *strongly*
> encouraged to *not* hotunplug devices from a guest under high load - and
> we can be sure that something went wrong if it takes longer than that for
> the guest to release the CPU (the same can't be said about memory hotunplug -
> more on that in the next patch).
>
> Timing out the unplug operation will reset the unplug state of the CPU and
> allow the user to try it again, regardless of the error situation that
> prevented the hotunplug from occurring. Of all the not so pretty fixes/mitigations
> for CPU hotunplug errors in pSeries, timing out the operation is an admission
> that we have no control over the process, and must assume the worst case if
> the operation doesn't succeed in a sensible time frame.
>
> [1] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg03353.html
> [2] https://lists.gnu.org/archive/html/qemu-devel/2021-01/msg04400.html
>
> Reported-by: Xujun Ma <xuma@redhat.com>
> Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1911414
> Signed-off-by: Daniel Henrique Barboza <danielhb413@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> ---
> hw/ppc/spapr.c | 4 ++++
> hw/ppc/spapr_drc.c | 17 +++++++++++++++++
> include/hw/ppc/spapr_drc.h | 3 +++
> 3 files changed, 24 insertions(+)
>
> diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
> index b066df68cb..ecce8abf14 100644
> --- a/hw/ppc/spapr.c
> +++ b/hw/ppc/spapr.c
> @@ -3724,6 +3724,10 @@ void spapr_core_unplug_request(HotplugHandler *hotplug_dev, DeviceState *dev,
> if (!spapr_drc_unplug_requested(drc)) {
> spapr_drc_unplug_request(drc);
> spapr_hotplug_req_remove_by_index(drc);
> + } else {
> + error_setg(errp, "core-id %d unplug is still pending, %d seconds "
> + "timeout remaining",
> + cc->core_id, spapr_drc_unplug_timeout_remaining_sec(drc));
Reporting this information is a nice touch.
> }
> }
>
> diff --git a/hw/ppc/spapr_drc.c b/hw/ppc/spapr_drc.c
> index c88bb524c5..c143bfb6d3 100644
> --- a/hw/ppc/spapr_drc.c
> +++ b/hw/ppc/spapr_drc.c
> @@ -398,6 +398,12 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
>
> drc->unplug_requested = true;
>
> + if (drck->unplug_timeout_seconds != 0) {
> + timer_mod(drc->unplug_timeout_timer,
> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL) +
> + drck->unplug_timeout_seconds * 1000);
> + }
> +
> if (drc->state != drck->empty_state) {
> trace_spapr_drc_awaiting_quiesce(spapr_drc_index(drc));
> return;
> @@ -406,6 +412,16 @@ void spapr_drc_unplug_request(SpaprDrc *drc)
> spapr_drc_release(drc);
> }
>
> +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc)
> +{
> + if (drc->unplug_requested && timer_pending(drc->unplug_timeout_timer)) {
> + return (qemu_timeout_ns_to_ms(drc->unplug_timeout_timer->expire_time) -
> + qemu_clock_get_ms(QEMU_CLOCK_VIRTUAL)) / 1000;
Hmm. Reaching into the timer's internal fields isn't ideal. I wonder
if we should add a helper in the timer code for reporting this information.
> + }
> +
> + return 0;
> +}
> +
> bool spapr_drc_reset(SpaprDrc *drc)
> {
> SpaprDrcClass *drck = SPAPR_DR_CONNECTOR_GET_CLASS(drc);
> @@ -706,6 +722,7 @@ static void spapr_drc_cpu_class_init(ObjectClass *k, void *data)
> drck->drc_name_prefix = "CPU ";
> drck->release = spapr_core_release;
> drck->dt_populate = spapr_core_dt_populate;
> + drck->unplug_timeout_seconds = 15;
> }
>
> static void spapr_drc_pci_class_init(ObjectClass *k, void *data)
> diff --git a/include/hw/ppc/spapr_drc.h b/include/hw/ppc/spapr_drc.h
> index b2e6222d09..26599c385a 100644
> --- a/include/hw/ppc/spapr_drc.h
> +++ b/include/hw/ppc/spapr_drc.h
> @@ -211,6 +211,8 @@ typedef struct SpaprDrcClass {
>
> int (*dt_populate)(SpaprDrc *drc, struct SpaprMachineState *spapr,
> void *fdt, int *fdt_start_offset, Error **errp);
> +
> + int unplug_timeout_seconds;
> } SpaprDrcClass;
>
> typedef struct SpaprDrcPhysical {
> @@ -246,6 +248,7 @@ int spapr_dt_drc(void *fdt, int offset, Object *owner,
> uint32_t drc_type_mask);
> */
> void spapr_drc_attach(SpaprDrc *drc, DeviceState *d);
> void spapr_drc_unplug_request(SpaprDrc *drc);
> +int spapr_drc_unplug_timeout_remaining_sec(SpaprDrc *drc);
>
> /*
> * Reset all DRCs, causing pending hot-plug/unplug requests to complete.
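
For illustration only, here is a minimal standalone sketch of the timer helper
idea raised above, together with the timeout lifecycle the commit message
describes. It does not use QEMU's timer code: ToyTimer, ToyDrc and every toy_*
function are hypothetical stand-ins invented for this sketch, and the current
time is passed in explicitly in milliseconds. The only point is that callers
ask the timer for the remaining time instead of reading its expiry field
themselves.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a timer; NOT QEMU's QEMUTimer. */
typedef struct ToyTimer {
    int64_t expire_time_ms;      /* absolute deadline, -1 when not armed */
} ToyTimer;

/* Hypothetical stand-in for the DRC fields relevant to unplug. */
typedef struct ToyDrc {
    bool unplug_requested;
    int unplug_timeout_seconds;  /* 0 means "no timeout", as in the patch */
    ToyTimer timer;
} ToyDrc;

static bool toy_timer_pending(const ToyTimer *t)
{
    return t->expire_time_ms >= 0;
}

/*
 * Hypothetical helper of the kind suggested in the review: the timer
 * itself reports how long is left, so callers never touch its fields.
 */
static int64_t toy_timer_remaining_ms(const ToyTimer *t, int64_t now_ms)
{
    assert(t != NULL);
    if (!toy_timer_pending(t) || t->expire_time_ms <= now_ms) {
        return 0;
    }
    return t->expire_time_ms - now_ms;
}

/* Mark the unplug as pending and arm the timeout, if the class has one. */
static void toy_unplug_request(ToyDrc *drc, int64_t now_ms)
{
    drc->unplug_requested = true;
    if (drc->unplug_timeout_seconds != 0) {
        drc->timer.expire_time_ms =
            now_ms + (int64_t)drc->unplug_timeout_seconds * 1000;
    }
}

/* Same contract as the function in the patch, built on the helper. */
static int toy_unplug_timeout_remaining_sec(const ToyDrc *drc, int64_t now_ms)
{
    if (drc->unplug_requested && toy_timer_pending(&drc->timer)) {
        return (int)(toy_timer_remaining_ms(&drc->timer, now_ms) / 1000);
    }
    return 0;
}

/* On expiry, disarm the timer and clear the pending state so the
 * user can retry the unplug. */
static void toy_run_timers(ToyDrc *drc, int64_t now_ms)
{
    if (toy_timer_pending(&drc->timer) &&
        now_ms >= drc->timer.expire_time_ms) {
        drc->timer.expire_time_ms = -1;
        drc->unplug_requested = false;
    }
}
```

In this model, a request made at t=1000ms with a 15 second timeout reports 10
seconds remaining at t=6000ms, and once toy_run_timers() is called with
t >= 16000ms the pending flag is cleared so a new request can be made.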
--
David Gibson | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au | minimalist, thank you. NOT _the_ _other_
| _way_ _around_!
http://www.ozlabs.org/~dgibson