[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
From: |
Eduardo Habkost |
Subject: |
Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID |
Date: |
Tue, 15 Oct 2019 16:49:07 -0300 |
On Tue, Oct 15, 2019 at 12:02:53PM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (address@hidden) wrote:
> > On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> > > * Peter Xu (address@hidden) wrote:
> > > > Migration is silently broken now with x2apic config like this:
> > > >
> > > > -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> > > > -device intel-iommu,intremap=on,eim=on
> > > >
> > > > After migration, the guest kernel could hang at anything, due to
> > > > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > > > any operations related to x2apic could be broken then (e.g., RDMSR on
> > > > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > > > enabled x2apic at all).
> > > >
> > > > The issue is that the x2apic bit was never applied correctly for vcpus
> > > > whose ID > 255 when migrate completes, and that's because when we
> > > > migrate APIC we use the APICCommonState.id as instance ID of the
> > > > migration stream, while that's too short for x2apic.
> > > >
> > > > Let's use the newly introduced initial_apic_id for that.
> > >
> > > I'd like to understand a few things:
> > > a) Does this change the instance ID of existing APICs on the
> > > migration stream?
> > > a1) Ever for <256 CPUs?
> >
> > No.
> >
> > > a2) For >=256 CPUs?
> >
> > Yes.
> >
> > >
> > > [Because changing the ID breaks migration]
> >
> > But if we don't change it, the stream is broken too. :)
> >
> > Then the destination VM will receive e.g. two apic_id==0 instances (I
> > think the apic_id==256 instance will wrongly overwrite the apic_id==0
> > one), while the vcpu with apic_id==256 will use the initial apic
> > values.
> >
> > So IMHO we should still fix this, even if it changes the migration
> > stream. At least we start to make it right.
>
> Yes, that makes sense.
> It deserves a doc mention somewhere.
>
> > >
> > > b) Is the instance ID constant - I can see it's a property on the
> > > APIC, but I cna't see who sets it
> >
> > For each vcpu, I think yes it should be a constant as long as the
> > topology is the same. This is how I understand it to be set:
> >
> > (1) In pc_cpus_init(), we init these:
> >
> > possible_cpus = mc->possible_cpu_arch_ids(ms);
> > for (i = 0; i < ms->smp.cpus; i++) {
> > pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> > }
> >
> > (2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:
> >
> > qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);
>
> OK, that's fine - as long as it's constaatn and not guest influenced.
The guest may change the CPU APIC ID (although they rarely do),
but I believe X86CPU::apic_id is always going to be the initial
APIC ID. I'll double check (and maybe send a patch to rename it
to initial_apic_id).
--
Eduardo
- Re: [PATCH 1/2] migration: Boost SaveStateEntry.instance_id to 64 bits, (continued)
[PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID, Peter Xu, 2019/10/15