Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type


From: Sergio Lopez
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
Date: Tue, 23 Jul 2019 10:43:28 +0200
User-agent: mu4e 1.2.0; emacs 26.2

Montes, Julio <address@hidden> writes:

> On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote:
>> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <address@hidden> wrote:
>> > Stefan Hajnoczi <address@hidden> writes:
>> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
>> > > > Stefan Hajnoczi <address@hidden> writes:
>> > > > 
>> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote:
>> > > >  --------------
>> > > >  | Conclusion |
>> > > >  --------------
>> > > > 
>> > > > The average boot time of microvm is a third of Q35's (115ms vs.
>> > > > 363ms),
>> > > > and is smaller on all sections (QEMU initialization, firmware
>> > > > overhead
>> > > > and kernel start-to-user).
>> > > > 
>> > > > Microvm's memory tree is also visibly simpler, significantly
>> > > > reducing
>> > > > the exposed surface to the guest.
>> > > > 
>> > > > While we can certainly work on making Q35 smaller, I definitely
>> > > > think
>> > > > it's better (and way safer!) having a specialized machine type
>> > > > for a
>> > > > specific use case, than a minimal Q35 whose behavior
>> > > > significantly
>> > > > diverges from a conventional Q35.
>> > > 
>> > > Interesting, so not a 10x difference!  This might be amenable to
>> > > optimization.
>> > > 
>> > > My concern with microvm is that it's so limited that few users
>> > > will be
>> > > able to benefit from the reduced attack surface and faster
>> > > startup time.
>> > > I think it's worth investigating slimming down Q35 further first.
>> > > 
>> > > In terms of startup time the first step would be profiling Q35
>> > > kernel
>> > > startup to find out what's taking so long (firmware
>> > > initialization, PCI
>> > > probing, etc)?
>> > 
>> > Some findings:
>> > 
>> >  1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host")
>> > saves a
>> >     whopping 120ms by avoiding the APIC timer calibration at
>> >     arch/x86/kernel/apic/apic.c:calibrate_APIC_clock
>> > 
>> > Average boot time with "-cpu host"
>> >  qemu_init_end: 76.408950
>> >  linux_start_kernel: 116.166142 (+39.757192)
>> >  linux_start_user: 242.954347 (+126.788205)
>> > 
>> > Average boot time with default "cpu"
>> >  qemu_init_end: 77.467852
>> >  linux_start_kernel: 116.688472 (+39.22062)
>> >  linux_start_user: 363.033365 (+246.344893)
>> 
>> \o/
>> 
>> >  2. The other 130ms are a direct result of PCI and ACPI presence
>> > (tested
>> >     with a kernel without support for those elements). I'll publish
>> > some
>> >     detailed numbers next week.
>> 
>> Here are the Kata Containers kernel parameters:
>> 
>> var kernelParams = []Param{
>>         {"tsc", "reliable"},
>>         {"no_timer_check", ""},
>>         {"rcupdate.rcu_expedited", "1"},
>>         {"i8042.direct", "1"},
>>         {"i8042.dumbkbd", "1"},
>>         {"i8042.nopnp", "1"},
>>         {"i8042.noaux", "1"},
>>         {"noreplace-smp", ""},
>>         {"reboot", "k"},
>>         {"console", "hvc0"},
>>         {"console", "hvc1"},
>>         {"iommu", "off"},
>>         {"cryptomgr.notests", ""},
>>         {"net.ifnames", "0"},
>>         {"pci", "lastbus=0"},
>> }
>> 
>> pci lastbus=0 looks interesting and so do some of the others :).
>> 
>
> yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35,
> kernel won't scan the remaining 255 buses :)

I can confirm that adding pci=lastbus=0 makes a significant
improvement. In fact, it's the only option from Kata's kernel parameter
list that has a measurable impact, probably because the kernel is
already quite minimalistic.

Average boot time with "-cpu host" and "pci=lastbus=0"
 qemu_init_end: 73.711569
 linux_start_kernel: 113.414311 (+39.702742)
 linux_start_user: 190.949939 (+77.535628)

That's still ~40% slower than microvm, and the gap quickly widens as
more PCI devices are added (each one costs an extra 10-15ms), but it's
certainly an improvement over the original numbers.

On the other hand, there isn't much we can do here from QEMU's
perspective, as this is basically Guest OS tuning.
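For reference, the two tweaks discussed in this thread combine on the
QEMU command line roughly as follows. This is only a sketch: the kernel
path, console setting and the remaining options are placeholders, not
the exact command line used for the measurements above.

```shell
# Sketch of a Q35 guest launch combining "-cpu host" (exposes
# TSC_DEADLINE, so the guest skips APIC timer calibration) with
# pci=lastbus=0 (stops the guest kernel's PCI bus scan after bus 0).
# KERNEL and the -append contents are placeholders.
KERNEL=/path/to/vmlinux
APPEND="console=ttyS0 pci=lastbus=0"

CMD="qemu-system-x86_64 -machine q35,accel=kvm -cpu host \
-kernel $KERNEL -append \"$APPEND\" -nodefaults"
echo "$CMD"
```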

Sergio.

