qemu-ppc
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)


From: David Gibson
Subject: Re: [Qemu-ppc] [PATCH 0/5] spapr: DRC cleanups (part VI)
Date: Wed, 5 Jul 2017 21:04:14 +1000
User-agent: Mutt/1.8.0 (2017-02-23)

On Tue, Jul 04, 2017 at 06:13:31PM -0300, Daniel Henrique Barboza wrote:
> I just tested this patch set on top of current ppc-for-2.10 branch (which
> contains
> the patches from part V). It applied cleanly but required a couple of
> trivial
> fixes to build probably because it was made on top of an older code base.

Right, I fixed that up locally already, but haven't gotten around to
reposting yet.  You can look at the 'drcVI' branch on my github tree,
if you're interested.

> The trivial migration test worked fine. The libvirt scenario (attaching a
> device on
> target before migration, try to unplug after migration) isn't working as
> expected
> but we have a different result with this series. Instead of silently failing
> to unplug
> with error messages on dmesg, the hot unplug works on QEMU level:

Thanks for testing.  Just to clarify what you're saying here, you
haven't spotted a regression with this series, but there is a case
which was broken and is still broken with slightly different
symptoms.  Yes?

> 
> (qemu) device_del core1
> (qemu)
> (qemu) info cpus
> * CPU #0: nip=0xc0000000000a3e0c thread_id=86162
> (qemu) info hotpluggable-cpus
> Hotpluggable CPUs:
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "3"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "2"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   CPUInstance Properties:
>     core-id: "1"
>   type: "host-spapr-cpu-core"
>   vcpus_count: "1"
>   qom_path: "/machine/unattached/device[0]"
>   CPUInstance Properties:
>     core-id: "0"
> (qemu)
> 
> 
> However, any operation on the guest afterwards (tried with lscpu and dmesg)
> seems
> to hung the guest. This is what I got when trying to do a dmesg after the
> hot unplug:

Ouch.  That's bad.  I'll have to look into it.

I have rather a lot on my plate at the moment - if you get a chance to
work out which of the patches in the series causes this behaviour,
that could be handy.

> address@hidden:~$
> address@hidden:~$
> address@hidden:~$ dmesg | tail -n 5
> ^C
> 
> [  362.749693] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  362.749819]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.749892] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  362.750224] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> [  362.750320]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  362.750394] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.568842] INFO: task kworker/u8:1:30 blocked for more than 120 seconds.
> [  483.568997]       Not tainted 4.10.0-26-generic #30-Ubuntu
> [  483.569067] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables
> this message.
> [  483.569364] INFO: task kworker/0:3:1887 blocked for more than 120
> seconds.
> (...)
> 
> 
> I am not sure if it was intended for this scenario to work with this patch
> set already. Figured
> it can serve as a FYI for the next series.

This might be due to a problem I've discussed previously with
Laurent.  libvirt uses device_add on the destination's monitor before
sending the migration stream, rather than starting the destination
with already-hotplugged devices as -device.

To my surprise, there doesn't seem to be a system reset between that
device_add phase and processing the incoming stream, which means DRCs
for those devices might be in the wrong state.

I might need to explicitly trigger a reset of DRC state at incoming
migration time to handle this.

> Given that hotplug/unplug without migration still works and on the
> assumption that
> we'll look more into this libvirt scenario in the next spins, +1.
> 
> 
> Tested-by: Daniel Barboza <address@hidden>
> 
> 
> 
> On 06/21/2017 06:18 AM, David Gibson wrote:
> > This sixth set of DRC cleanup patches (to be applied on top of the
> > patches from part V) is a complete rework of DRC state management.  We
> > stop tracking some unnecessary things, and change the basic state
> > representation to a simpler and more robust model.
> > 
> > Most of the patches in this set "break" migration from earlier git
> > snapshots, but not from any released qemu version.  The previous
> > migration stream format had multiple problems, so better to fix them
> > now, before 2.10 is out.
> > 
> > David Gibson (5):
> >    spapr: Remove 'awaiting_allocation' DRC flag
> >    spapr: Refactor spapr_drc_detach()
> >    spapr: Cleanups relating to DRC awaiting_release field
> >    spapr: Consolidate DRC state variables
> >    spapr: Remove sPAPRConfigureConnectorState sub-structure
> > 
> >   hw/ppc/spapr.c             |   9 +-
> >   hw/ppc/spapr_drc.c         | 321 
> > +++++++++++++++++++++------------------------
> >   hw/ppc/spapr_pci.c         |  13 +-
> >   hw/ppc/trace-events        |   3 +-
> >   include/hw/ppc/spapr_drc.h |  53 +++++---
> >   5 files changed, 187 insertions(+), 212 deletions(-)
> > 
> 

-- 
David Gibson                    | I'll have my music baroque, and my code
david AT gibson.dropbear.id.au  | minimalist, thank you.  NOT _the_ _other_
                                | _way_ _around_!
http://www.ozlabs.org/~dgibson

Attachment: signature.asc
Description: PGP signature


reply via email to

[Prev in Thread] Current Thread [Next in Thread]