Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type


From: Stefano Garzarella
Subject: Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
Date: Wed, 24 Jul 2019 17:23:28 +0200
User-agent: NeoMutt/20180716


On Tue, Jul 23, 2019 at 1:30 PM Stefano Garzarella <address@hidden> wrote:
>
> On Tue, Jul 23, 2019 at 10:47:39AM +0100, Stefan Hajnoczi wrote:
> > On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez <address@hidden> wrote:
> > > Montes, Julio <address@hidden> writes:
> > >
> > > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote:
> > > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <address@hidden> wrote:
> > > >> > Stefan Hajnoczi <address@hidden> writes:
> > > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
> > > >> > > > Stefan Hajnoczi <address@hidden> writes:
> > > >> > > >
> > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote:
> > > >> > > >  --------------
> > > >> > > >  | Conclusion |
> > > >> > > >  --------------
> > > >> > > >
> > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. 363ms),
> > > >> > > > and is smaller on all sections (QEMU initialization, firmware overhead
> > > >> > > > and kernel start-to-user).
> > > >> > > >
> > > >> > > > Microvm's memory tree is also visibly simpler, significantly reducing
> > > >> > > > the exposed surface to the guest.
> > > >> > > >
> > > >> > > > While we can certainly work on making Q35 smaller, I definitely think
> > > >> > > > it's better (and way safer!) having a specialized machine type for a
> > > >> > > > specific use case, than a minimal Q35 whose behavior significantly
> > > >> > > > diverges from a conventional Q35.
> > > >> > >
> > > >> > > Interesting, so not a 10x difference!  This might be amenable to
> > > >> > > optimization.
> > > >> > >
> > > >> > > My concern with microvm is that it's so limited that few users will be
> > > >> > > able to benefit from the reduced attack surface and faster startup time.
> > > >> > > I think it's worth investigating slimming down Q35 further first.
> > > >> > >
> > > >> > > In terms of startup time the first step would be profiling Q35 kernel
> > > >> > > startup to find out what's taking so long (firmware initialization, PCI
> > > >> > > probing, etc)?
> > > >> >
> > > >> > Some findings:
> > > >> >
> > > >> >  1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") saves a
> > > >> >     whopping 120ms by avoiding the APIC timer calibration at
> > > >> >     arch/x86/kernel/apic/apic.c:calibrate_APIC_clock
> > > >> >
> > > >> > Average boot time with "-cpu host"
> > > >> >  qemu_init_end: 76.408950
> > > >> >  linux_start_kernel: 116.166142 (+39.757192)
> > > >> >  linux_start_user: 242.954347 (+126.788205)
> > > >> >
> > > >> > Average boot time with default "cpu"
> > > >> >  qemu_init_end: 77.467852
> > > >> >  linux_start_kernel: 116.688472 (+39.22062)
> > > >> >  linux_start_user: 363.033365 (+246.344893)
> > > >>
> > > >> \o/
> > > >>
> > > >> >  2. The other 130ms are a direct result of PCI and ACPI presence (tested
> > > >> >     with a kernel without support for those elements). I'll publish some
> > > >> >     detailed numbers next week.
> > > >>
> > > >> Here are the Kata Containers kernel parameters:
> > > >>
> > > >> var kernelParams = []Param{
> > > >>         {"tsc", "reliable"},
> > > >>         {"no_timer_check", ""},
> > > >>         {"rcupdate.rcu_expedited", "1"},
> > > >>         {"i8042.direct", "1"},
> > > >>         {"i8042.dumbkbd", "1"},
> > > >>         {"i8042.nopnp", "1"},
> > > >>         {"i8042.noaux", "1"},
> > > >>         {"noreplace-smp", ""},
> > > >>         {"reboot", "k"},
> > > >>         {"console", "hvc0"},
> > > >>         {"console", "hvc1"},
> > > >>         {"iommu", "off"},
> > > >>         {"cryptomgr.notests", ""},
> > > >>         {"net.ifnames", "0"},
> > > >>         {"pci", "lastbus=0"},
> > > >> }
> > > >>
> > > >> pci=lastbus=0 looks interesting and so do some of the others :).
> > > >>
> > > >
> > > > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35,
> > > > kernel won't scan the 255.. buses :)
> > >
> > > I can confirm that adding pci=lastbus=0 makes a significant
> > > improvement. In fact, it is the only option from Kata's kernel parameter
> > > list that has an impact, probably because the kernel is already quite
> > > minimalistic.
> > >
> > > Average boot time with "-cpu host" and "pci=lastbus=0"
> > >  qemu_init_end: 73.711569
> > >  linux_start_kernel: 113.414311 (+39.702742)
> > >  linux_start_user: 190.949939 (+77.535628)
> > >
> > > That's still ~40% slower than microvm, and the gap quickly widens
> > > when adding more PCI devices (each one adds 10-15ms), but it's certainly
> > > an improvement over the original numbers.
> > >
> > > On the other hand, there isn't much we can do here from QEMU's
> > > perspective, as this is basically Guest OS tuning.
> >
> > fw_cfg could expose this information so guest kernels know when to
> > stop enumerating the PCI bus.  This would make all PCI guests with new
> > kernels boot ~50 ms faster, regardless of machine type.
> >
> > The difference between microvm and tuned Q35 is 76 ms now.
> >
> > microvm:
> > qemu_init_end: 64.043264
> > linux_start_kernel: 65.481782 (+1.438518)
> > linux_start_user: 114.938353 (+49.456571)
> >
> > Q35 with -cpu host and pci=lastbus=0:
> > qemu_init_end: 73.711569
> > linux_start_kernel: 113.414311 (+39.702742)
> > linux_start_user: 190.949939 (+77.535628)
> >
> > There is a ~39 ms difference before linux_start_kernel.  SeaBIOS is
> > loading the PVH Option ROM.
> >
> > Stefano: any recommendations for profiling or tuning SeaBIOS?
>
> As I said on IRC, the SeaBIOS image shipped in QEMU is version 1.12.1, which
> doesn't include this patch (available in upstream SeaBIOS) that saves ~10ms:
>
>     commit 75b42835134553c96f113e5014072c0caf99d092
>     Author: Stefano Garzarella <address@hidden>
>     Date:   Sun Dec 2 14:10:13 2018 +0100
>
>         qemu: avoid debug prints if debugcon is not enabled
>
>         In order to speed up the boot phase, we can check the QEMU
>         debugcon device, and disable the writes if it is not recognized.
>
>         This patch allow us to save around 10 msec (time measured
>         between SeaBIOS entry point and "linuxboot" entry point)
>         when CONFIG_DEBUG_LEVEL=1 and debugcon is not enabled.
>
>         Signed-off-by: Stefano Garzarella <address@hidden>
>         Signed-off-by: Kevin O'Connor <address@hidden>
>
> As you said, we should update SeaBIOS for the next QEMU release.
>
> For profiling, I have some patches that I used to put trace points in
> the SeaBIOS code. I'll put them in this repository ASAP:
>     https://github.com/stefano-garzarella/qemu-boot-time

I pushed the QEMU (optionrom) and SeaBIOS tracing patches to:
https://github.com/stefano-garzarella/qemu-boot-time
They should be useful for profiling the time spent before linux_start_kernel.
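
To make it easier to see where the remaining time goes, here is a minimal
sketch that splits the averages quoted above (microvm vs. Q35 with -cpu host
and pci=lastbus=0) into per-phase deltas. The Boot type, its method names and
the phase labels are only illustrative assumptions; they are not part of QEMU,
SeaBIOS or the qemu-boot-time scripts. The only real data are the trace point
values copied from this thread.

```go
package main

import "fmt"

// Boot holds the three trace points reported in this thread, in
// milliseconds since QEMU was started. The type is hypothetical and
// exists only for this illustration.
type Boot struct {
	Name             string
	QemuInitEnd      float64
	LinuxStartKernel float64
	LinuxStartUser   float64
}

// PreKernel is the time between the end of QEMU initialization and the
// kernel entry point (firmware and option ROM work on Q35).
func (b Boot) PreKernel() float64 { return b.LinuxStartKernel - b.QemuInitEnd }

// Kernel is the time between the kernel entry point and the start of
// user space.
func (b Boot) Kernel() float64 { return b.LinuxStartUser - b.LinuxStartKernel }

// Averages copied verbatim from the numbers quoted in this thread.
var (
	microvm  = Boot{"microvm", 64.043264, 65.481782, 114.938353}
	q35Tuned = Boot{"q35 tuned", 73.711569, 113.414311, 190.949939}
)

func main() {
	for _, b := range []Boot{microvm, q35Tuned} {
		fmt.Printf("%-10s qemu_init=%7.3f pre_kernel=%7.3f kernel=%7.3f total=%7.3f\n",
			b.Name, b.QemuInitEnd, b.PreKernel(), b.Kernel(), b.LinuxStartUser)
	}

	// Per-phase difference between tuned Q35 and microvm.
	fmt.Printf("delta      qemu_init=%7.3f pre_kernel=%7.3f kernel=%7.3f total=%7.3f\n",
		q35Tuned.QemuInitEnd-microvm.QemuInitEnd,
		q35Tuned.PreKernel()-microvm.PreKernel(),
		q35Tuned.Kernel()-microvm.Kernel(),
		q35Tuned.LinuxStartUser-microvm.LinuxStartUser)
}
```

With these inputs the 76 ms total splits into roughly 9.7 ms of QEMU
initialization, 38.3 ms before linux_start_kernel and 28.1 ms of kernel
start-to-user, so the pre-kernel phase is the largest single contributor and
the one the SeaBIOS trace points are meant to break down further.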

Cheers,
Stefano


