qemu-devel
[Top][All Lists]
Advanced

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID


From: Peter Xu
Subject: Re: [PATCH 2/2] apic: Use 32bit APIC ID for migration instance ID
Date: Tue, 15 Oct 2019 18:16:41 +0800
User-agent: Mutt/1.11.4 (2019-03-13)

On Tue, Oct 15, 2019 at 10:22:18AM +0100, Dr. David Alan Gilbert wrote:
> * Peter Xu (address@hidden) wrote:
> > Migration is silently broken now with x2apic config like this:
> > 
> >      -smp 200,maxcpus=288,sockets=2,cores=72,threads=2 \
> >      -device intel-iommu,intremap=on,eim=on
> > 
> > After migration, the guest kernel could hang at anything, due to
> > x2apic bit not migrated correctly in IA32_APIC_BASE on some vcpus, so
> > any operations related to x2apic could be broken then (e.g., RDMSR on
> > x2apic MSRs could fail because KVM would think that the vcpu hasn't
> > enabled x2apic at all).
> > 
> > The issue is that the x2apic bit was never applied correctly for vcpus
> > whose ID > 255 when migrate completes, and that's because when we
> > migrate APIC we use the APICCommonState.id as instance ID of the
> > migration stream, while that's too short for x2apic.
> > 
> > Let's use the newly introduced initial_apic_id for that.
> 
> I'd like to understand a few things:
>    a) Does this change the instance ID of existing APICs on the
> migration stream? 
>      a1) Ever for <256 CPUs?

No.

>      a2) For >=256 CPUs?

Yes.

> 
>     [Because changing the ID breaks migration]

But if we don't change it, the stream is broken too. :)

Then the destination VM will receive e.g. two apic_id==0 instances (I
think the apic_id==256 instance will wrongly overwrite the apic_id==0
one), while the vcpu with apic_id==256 will use the initial apic
values.

So IMHO we should still fix this, even if it changes the migration
stream.  At least we start to make it right.

> 
>   b) Is the instance ID constant - I can see it's a property on the
>      APIC, but I cna't see who sets it

For each vcpu, I think yes it should be a constant as long as the
topology is the same.  This is how I understand it to be set:

(1) In pc_cpus_init(), we init these:

    possible_cpus = mc->possible_cpu_arch_ids(ms);
    for (i = 0; i < ms->smp.cpus; i++) {
        pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
    }

(2) In x86_cpu_apic_create(), we apply the apic_id to "id" property:

    qdev_prop_set_uint32(cpu->apic_state, "id", cpu->apic_id);

> 
>   c) In the case where it fails, did we end up registering two
>      devices with the same name and instance ID?  If so, is it worth
>      adding a check that would error if we tried?

Sounds doable.

Thanks,

-- 
Peter Xu



reply via email to

[Prev in Thread] Current Thread [Next in Thread]