* [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
@ 2018-11-26 16:40 Stefano Garzarella
  2018-11-27  9:57 ` Rob Bradford
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2018-11-26 16:40 UTC (permalink / raw)
  To: Samuel Ortiz, Rob Bradford; +Cc: qemu-devel, Stefan Hajnoczi

Hi Samuel, Rob,
I'm comparing several x86 firmwares to understand which one best suits
the -kernel option for a fast boot with QEMU/NEMU.

For your use cases, what boot time do you expect?


I compared SeaBIOS, qboot, and OVMF. I started each test using these
qemu parameters: "./qemu-system-x86_64 -bios path/to/bios.bin -m 1G
-cpu host -M accel=kvm -vga none -kernel path/to/bzImage -initrd
path/to/rootfs.cpio ..."

As Samuel suggested, I measured the total time to userspace by adding a
probe in kernel_init(). The times (in msec) that I measured are:
- qemu_init_end: first kvm_entry (i.e. QEMU initialization has finished)
- fw_start: first entry of the firmware
- fw_do_boot: after the firmware initialization (e.g. PCI setup, etc.)
- linux_start_boot: before the jump to the Linux kernel
- linux_start_user: before starting the init process
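
Each "(+...)" value in the results below is just the difference from the
previous timestamp. As a minimal illustration of that post-processing
(not the actual measurement code; the event names and values are copied
from the SeaBIOS results below):

# Print each event with its delta from the previous one (msec).
events = [
    ("qemu_init_end",    41.634812),
    ("fw_start",         41.857374),
    ("fw_do_boot",       52.754109),
    ("linux_start_boot", 54.117220),
    ("linux_start_user", 495.684199),
]
prev = None
for name, ts in events:
    delta = "" if prev is None else " (+%f)" % (ts - prev)
    print(" %s: %f%s" % (name, ts, delta))
    prev = ts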

* SeaBIOS
Default configuration without debug messages (CONFIG_DEBUG_LEVEL=0) +
Stephen's patch (tpm: Check for TPM related ACPI tables before
attempting hw) + my patch (qemu: fast boot when linuxboot optionrom is
used).
 qemu_init_end: 41.634812
 fw_start: 41.857374 (+0.222562)
 fw_do_boot: 52.754109 (+10.896735)
 linux_start_boot: 54.117220 (+1.363111)
 linux_start_user: 495.684199 (+441.566979)

* qboot
Default configuration + my patch (pci: reduce pci_foreach() calls).
 qemu_init_end: 40.233717
 fw_start: 40.384048 (+0.150331)
 fw_do_boot: 45.660497 (+5.276449)
 linux_start_boot: 47.252119 (+1.591622)
 linux_start_user: 509.173886 (+461.921767)

* OVMF (https://github.com/intel/ovmf-virt)
I followed this script
(https://github.com/intel/nemu/blob/topic/virt-x86/tools/CI/run_nats.sh)
to build OVMF.
Note: I put the "fw_start" probe in the BdsEntry()
[MdeModulePkg/Universal/BdsDxe/BdsEntry.c], I'm not sure if it is
"near" to the real entry point.
 qemu_init_end: 42.734555
 fw_start: 163.611506 (+120.876951)
 fw_do_boot: 369.713760 (+206.102254)
 linux_start_boot: 370.960364 (+1.246604)
 linux_start_user: 796.799667 (+425.839303)

For the OVMF case, are the times that I measured reasonable? Do you
use a different configuration?

Thanks,
Stefano

-- 
Stefano Garzarella
Red Hat


* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-11-26 16:40 [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares Stefano Garzarella
@ 2018-11-27  9:57 ` Rob Bradford
  2018-11-27 14:21   ` Stefano Garzarella
  0 siblings, 1 reply; 18+ messages in thread
From: Rob Bradford @ 2018-11-27  9:57 UTC (permalink / raw)
  To: Stefano Garzarella, Samuel Ortiz; +Cc: qemu-devel, Stefan Hajnoczi

Hi Stefano,

On Mon, 2018-11-26 at 17:40 +0100, Stefano Garzarella wrote:
> Hi Samuel, Rob,
> I'm proceeding to compare several x86 firmwares in order to
> understand
> which suits better with -kernel option to have a fast boot with
> QEMU/NEMU.
> 
> For your use cases, what boot time do you expect?
> 

Kata Containers, one of the use cases we're looking at, is very
sensitive to boot time. Although it is configurable, the default
configuration is to use qemu-lite with its direct-to-kernel loader.

That is probably a baseline we should be comparing with.

As an aside, yesterday I posted the SeaBIOS patches for enabling it on
NEMU/virt:

https://github.com/rbradford/seabios/tree/virt-x86

I haven't yet started optimising the configuration file we use for it
beyond getting it under 128KiB, nor applied any optimisation patches.

I also have some qboot patches:

https://github.com/rbradford/qboot/tree/virt-x86

So we're in a situation where we can start comparing other machine
types against virt too.

> 
> I compared SeaBIOS, qboot, and OVMF. I started each test using this
> qemu parameters: "./qemu-system-x86_64 -bios path/to/bios.bin -m 1G
> -cpu host -M accel=kvm -vga none -kernel path/to/bzImage -initrd
> path/to/rootfs.cpio ..."
> 
> As Samuel suggested, I added the total time to userspace adding a
> probe in the kernel_init(), so the times (in msec) that I measured
> are:
> - qemu_init_end: first kvm_entry (i.e. QEMU initialized has finished)
> - fw_start: first entry of the firmware
> - fw_do_boot: after the firmware initialization (e.g. PCI setup,
> etc.)
> - linux_start_boot: before the jump to the Linux kernel
> - linux_start_user: before starting the init process
> 
> * SeaBIOS
> Default configuration without debug messages (CONFIG_DEBUG_LEVEL=0) +
> Stephen's patch (tpm: Check for TPM related ACPI tables before
> attempting hw) + my patch (qemu: fast boot when linuxboot optionrom
> is
> used).
>  qemu_init_end: 41.634812
>  fw_start: 41.857374 (+0.222562)
>  fw_do_boot: 52.754109 (+10.896735)
>  linux_start_boot: 54.117220 (+1.363111)
>  linux_start_user: 495.684199 (+441.566979)
> 
> * qboot
> Default configuration + my patch (pci: reduce pci_foreach() calls).
>  qemu_init_end: 40.233717
>  fw_start: 40.384048 (+0.150331)
>  fw_do_boot: 45.660497 (+5.276449)
>  linux_start_boot: 47.252119 (+1.591622)
>  linux_start_user: 509.173886 (+461.921767)
> 
> * OVMF (https://github.com/intel/ovmf-virt)
> I followed this script
> (
> https://github.com/intel/nemu/blob/topic/virt-x86/tools/CI/run_nats.sh
> )
> to build OVMF.
> Note: I put the "fw_start" probe in the BdsEntry()
> [MdeModulePkg/Universal/BdsDxe/BdsEntry.c], I'm not sure if it is
> "near" to the real entry point.
>  qemu_init_end: 42.734555
>  fw_start: 163.611506 (+120.876951)
>  fw_do_boot: 369.713760 (+206.102254)
>  linux_start_boot: 370.960364 (+1.246604)
>  linux_start_user: 796.799667 (+425.839303)
> 
> For OVMF case, are reasonable the times that I measured? Do you use a
> different configuration?
> 

That is the same OVMF configuration as we normally use; I have the
configuration file checked in. Those numbers look like the ones I
expect.

> Thanks,
> Stefano
> 

Great work,

Rob


* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-11-27  9:57 ` Rob Bradford
@ 2018-11-27 14:21   ` Stefano Garzarella
  2018-12-03 14:27     ` Stefano Garzarella
  0 siblings, 1 reply; 18+ messages in thread
From: Stefano Garzarella @ 2018-11-27 14:21 UTC (permalink / raw)
  To: Rob Bradford; +Cc: Samuel Ortiz, qemu-devel, Stefan Hajnoczi

Hi Rob,

On Tue, Nov 27, 2018 at 10:57 AM Rob Bradford <robert.bradford@intel.com> wrote:
>
> Hi Stefano,
>
> On Mon, 2018-11-26 at 17:40 +0100, Stefano Garzarella wrote:
> > Hi Samuel, Rob,
> > I'm proceeding to compare several x86 firmwares in order to
> > understand
> > which suits better with -kernel option to have a fast boot with
> > QEMU/NEMU.
> >
> > For your use cases, what boot time do you expect?
> >
>
> Kata Containers, one of the use cases we're looking at, is very
> sensitive to boot time, although it is configurable, the default
> configuration is to use qemu-lite with it's direct to kernel loader.
>
> That is probably a baseline we should be comparing with.
>
> As an aside, yesterday I posted the Seabios patches for enabling on
> NEMU/virt:
>
> https://github.com/rbradford/seabios/tree/virt-x86
>
> I didn't yet start optimising the configuration file we use for it
> beyond getting it < 128KiB or apply any optimisation patches.
>
> And I also have some qboot patches too:
>
> https://github.com/rbradford/qboot/tree/virt-x86
>
> So we're in a situation where we can start comparing other machine
> types against virt too.

Thanks for the pointers. I'll try to run the same tests with qemu-lite
and its direct kernel loader, and also with the NEMU "virt" machine,
using your versions of SeaBIOS and qboot.

>
> >
> > I compared SeaBIOS, qboot, and OVMF. I started each test using this
> > qemu parameters: "./qemu-system-x86_64 -bios path/to/bios.bin -m 1G
> > -cpu host -M accel=kvm -vga none -kernel path/to/bzImage -initrd
> > path/to/rootfs.cpio ..."
> >
> > As Samuel suggested, I added the total time to userspace adding a
> > probe in the kernel_init(), so the times (in msec) that I measured
> > are:
> > - qemu_init_end: first kvm_entry (i.e. QEMU initialized has finished)
> > - fw_start: first entry of the firmware
> > - fw_do_boot: after the firmware initialization (e.g. PCI setup,
> > etc.)
> > - linux_start_boot: before the jump to the Linux kernel
> > - linux_start_user: before starting the init process
> >
> > * SeaBIOS
> > Default configuration without debug messages (CONFIG_DEBUG_LEVEL=0) +
> > Stephen's patch (tpm: Check for TPM related ACPI tables before
> > attempting hw) + my patch (qemu: fast boot when linuxboot optionrom
> > is
> > used).
> >  qemu_init_end: 41.634812
> >  fw_start: 41.857374 (+0.222562)
> >  fw_do_boot: 52.754109 (+10.896735)
> >  linux_start_boot: 54.117220 (+1.363111)
> >  linux_start_user: 495.684199 (+441.566979)
> >
> > * qboot
> > Default configuration + my patch (pci: reduce pci_foreach() calls).
> >  qemu_init_end: 40.233717
> >  fw_start: 40.384048 (+0.150331)
> >  fw_do_boot: 45.660497 (+5.276449)
> >  linux_start_boot: 47.252119 (+1.591622)
> >  linux_start_user: 509.173886 (+461.921767)
> >
> > * OVMF (https://github.com/intel/ovmf-virt)
> > I followed this script
> > (
> > https://github.com/intel/nemu/blob/topic/virt-x86/tools/CI/run_nats.sh
> > )
> > to build OVMF.
> > Note: I put the "fw_start" probe in the BdsEntry()
> > [MdeModulePkg/Universal/BdsDxe/BdsEntry.c], I'm not sure if it is
> > "near" to the real entry point.
> >  qemu_init_end: 42.734555
> >  fw_start: 163.611506 (+120.876951)
> >  fw_do_boot: 369.713760 (+206.102254)
> >  linux_start_boot: 370.960364 (+1.246604)
> >  linux_start_user: 796.799667 (+425.839303)
> >
> > For OVMF case, are reasonable the times that I measured? Do you use a
> > different configuration?
> >
>
> That is the same OVMF configuration as we normally use, I have the
> configuration file checked in. Those numbers look like the ones I
> expect.

Perfect!

Please let me know if there is something more important than boot
time that I can help with.

Regards,
Stefano

>
> > Thanks,
> > Stefano
> >
>
> Great work,
>
> Rob
>

--
Stefano Garzarella
Red Hat


* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-11-27 14:21   ` Stefano Garzarella
@ 2018-12-03 14:27     ` Stefano Garzarella
  2018-12-03 15:44       ` Rob Bradford
  2018-12-13 11:19       ` Stefano Garzarella
  0 siblings, 2 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-03 14:27 UTC (permalink / raw)
  To: Rob Bradford; +Cc: Samuel Ortiz, qemu-devel, Stefan Hajnoczi

Hi Rob,
I continued investigating the boot time and, as you suggested, I also
looked at qemu-lite 2.11.2
(https://github.com/kata-containers/qemu) and the NEMU "virt" machine.
I ran the following tests using the Kata kernel configuration
(https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x).

To compare the results with qemu-lite direct kernel load, I added
another tracepoint:
- linux_start_kernel: first entry of the Linux kernel (start_kernel())

As you can see, NEMU is faster to jump to the kernel
(linux_start_kernel) than qemu-lite when it uses qboot or SeaBIOS with
virt support, but the time to user space is strangely high; maybe the
kernel configuration that I used is not the best one.
Do you suggest another kernel configuration?

Anyway, I obtained the best boot time with qemu-lite and direct kernel
load (vmlinux ELF image), I think because the kernel was not
compressed. Indeed, looking at the other tests, the kernel
decompression (bzImage) takes about 80 ms (linux_start_kernel -
linux_start_boot). (I'll investigate further.)
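
As a quick sanity check of that ~80 ms estimate (illustrative
arithmetic only, taken from the QEMU numbers reported below):

# linux_start_kernel - linux_start_boot, in msec
print(128.474013 - 48.80918)    # QEMU + qboot:   ~79.66
print(136.942521 - 56.314063)   # QEMU + seabios: ~80.63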

* QEMU
    + qboot
      qemu_init_end: 41.306968
      fw_start: 41.478601 (+0.171633)
      fw_do_boot: 47.03688 (+5.558279)
      linux_start_boot: 48.80918 (+1.7723)
      linux_start_kernel: 128.474013 (+79.664833)
      linux_start_user: 347.257097 (+218.783084)
    + seabios
      qemu_init_end: 40.703177
      fw_start: 40.902842 (+0.199665)
      fw_do_boot: 54.604878 (+13.702036)
      linux_start_boot: 56.314063 (+1.709185)
      linux_start_kernel: 136.942521 (+80.628458)
      linux_start_user: 356.211221 (+219.2687)
    + OVMF
      qemu_init_end: 44.055712
      fw_start: 168.094131 (+124.038419)
      fw_do_boot: 322.222745 (+154.128614)
      linux_start_boot: 323.809639 (+1.586894)
      linux_start_kernel: 401.756773 (+77.947134)
      linux_start_user: 604.601438 (+202.844665)
* NEMU "virt" machine
    + qboot
      qemu_init_end: 27.732925
      fw_start: 27.887483 (+0.154558)
      fw_do_boot: 29.20226 (+1.314777)
      linux_start_boot: 30.842672 (+1.640412)
      linux_start_kernel: 111.639445 (+80.796773)
      linux_start_user: 862.294876 (+750.655431)
    + seabios
      qemu_init_end: 28.203907
      fw_start: 28.68285 (+0.478943)
      fw_do_boot: 36.337377 (+7.654527)
      linux_start_boot: 37.995622 (+1.658245)
      linux_start_kernel: 118.11591 (+80.120288)
      linux_start_user: 897.110477 (+778.994567)
    + OVMF
      qemu_init_end: 31.219852
      fw_start: 153.304126 (+122.084274)
      fw_do_boot: 183.69459 (+30.390464)
      linux_start_boot: 185.296567 (+1.601977)
      linux_start_kernel: 361.11524 (+175.818673)
      linux_start_user: 1231.590077 (+870.474837)
* Qemu-lite
    + no FW (-machine pc,...,fw=off KERNEL=/path/to/vmlinux)
      qemu_init_end: 32.243801
      linux_start_kernel: 32.330323 (+0.086522)
      linux_start_user: 208.742712 (+176.412389)
    + qboot
      qemu_init_end: 25.393047
      fw_start: 25.545285 (+0.152238)
      fw_do_boot: 31.073253 (+5.527968)
      linux_start_boot: 33.139721 (+2.066468)
      linux_start_kernel: 114.604384 (+81.464663)
      linux_start_user: 287.93128 (+173.326896)
    + seabios
      qemu_init_end: 25.448283
      fw_start: 25.628977 (+0.180694)
      fw_do_boot: 38.551957 (+12.92298)
      linux_start_boot: 40.220319 (+1.668362)
      linux_start_kernel: 126.392991 (+86.172672)
      linux_start_user: 308.807806 (+182.414815)
    + OVMF
      qemu_init_end: 27.616988
      fw_start: 151.272286 (+123.655298)
      fw_do_boot: 309.068926 (+157.79664)
      linux_start_boot: 310.602074 (+1.533148)
      linux_start_kernel: 387.83349 (+77.231416)
      linux_start_user: 576.057453 (+188.223963)

Testbed:
* seabios: https://github.com/rbradford/seabios/tree/virt-x86 + patch
to disable debug writes
(https://mail.coreboot.org/pipermail/seabios/2018-December/012654.html)
* qboot: https://github.com/rbradford/qboot/tree/virt-x86
* OVMF: https://github.com/intel/ovmf-virt/tree/virt-x86
* QEMU command: $QEMU -bios $BIOS -machine
$MACHINE,accel=kvm,kernel_irqchip,nvdimm -cpu host -m
512,maxmem=1G,slots=1 -smp 1 -kernel $KERNEL -append 'root=/dev/pmem0
ro console=hvc0' -vga none -display none -no-user-config -nodefaults
-object memory-backend-file,id=mem0,share,mem-path=$ROOTFS,size=$ROOTFS_SIZE
-device nvdimm,memdev=mem0,id=nv0 -chardev stdio,id=virtiocon0 -device
virtio-serial -device virtconsole,chardev=virtiocon0

Cheers,
Stefano

On Tue, Nov 27, 2018 at 3:21 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> Hi Rob,
>
> On Tue, Nov 27, 2018 at 10:57 AM Rob Bradford <robert.bradford@intel.com> wrote:
> >
> > Hi Stefano,
> >
> > On Mon, 2018-11-26 at 17:40 +0100, Stefano Garzarella wrote:
> > > Hi Samuel, Rob,
> > > I'm proceeding to compare several x86 firmwares in order to
> > > understand
> > > which suits better with -kernel option to have a fast boot with
> > > QEMU/NEMU.
> > >
> > > For your use cases, what boot time do you expect?
> > >
> >
> > Kata Containers, one of the use cases we're looking at, is very
> > sensitive to boot time, although it is configurable, the default
> > configuration is to use qemu-lite with it's direct to kernel loader.
> >
> > That is probably a baseline we should be comparing with.
> >
> > As an aside, yesterday I posted the Seabios patches for enabling on
> > NEMU/virt:
> >
> > https://github.com/rbradford/seabios/tree/virt-x86
> >
> > I didn't yet start optimising the configuration file we use for it
> > beyond getting it < 128KiB or apply any optimisation patches.
> >
> > And I also have some qboot patches too:
> >
> > https://github.com/rbradford/qboot/tree/virt-x86
> >
> > So we're in a situation where we can start comparing other machine
> > types against virt too.
>
> Thanks for the pointers, I'll try to do the same tests with qemu-lite
> and direct kernel loader and also with NEMU "virt" machine, using your
> versions of SeaBIOS and qboot.
>
> >
> > >
> > > I compared SeaBIOS, qboot, and OVMF. I started each test using this
> > > qemu parameters: "./qemu-system-x86_64 -bios path/to/bios.bin -m 1G
> > > -cpu host -M accel=kvm -vga none -kernel path/to/bzImage -initrd
> > > path/to/rootfs.cpio ..."
> > >
> > > As Samuel suggested, I added the total time to userspace adding a
> > > probe in the kernel_init(), so the times (in msec) that I measured
> > > are:
> > > - qemu_init_end: first kvm_entry (i.e. QEMU initialized has finished)
> > > - fw_start: first entry of the firmware
> > > - fw_do_boot: after the firmware initialization (e.g. PCI setup,
> > > etc.)
> > > - linux_start_boot: before the jump to the Linux kernel
> > > - linux_start_user: before starting the init process
> > >
> > > * SeaBIOS
> > > Default configuration without debug messages (CONFIG_DEBUG_LEVEL=0) +
> > > Stephen's patch (tpm: Check for TPM related ACPI tables before
> > > attempting hw) + my patch (qemu: fast boot when linuxboot optionrom
> > > is
> > > used).
> > >  qemu_init_end: 41.634812
> > >  fw_start: 41.857374 (+0.222562)
> > >  fw_do_boot: 52.754109 (+10.896735)
> > >  linux_start_boot: 54.117220 (+1.363111)
> > >  linux_start_user: 495.684199 (+441.566979)
> > >
> > > * qboot
> > > Default configuration + my patch (pci: reduce pci_foreach() calls).
> > >  qemu_init_end: 40.233717
> > >  fw_start: 40.384048 (+0.150331)
> > >  fw_do_boot: 45.660497 (+5.276449)
> > >  linux_start_boot: 47.252119 (+1.591622)
> > >  linux_start_user: 509.173886 (+461.921767)
> > >
> > > * OVMF (https://github.com/intel/ovmf-virt)
> > > I followed this script
> > > (
> > > https://github.com/intel/nemu/blob/topic/virt-x86/tools/CI/run_nats.sh
> > > )
> > > to build OVMF.
> > > Note: I put the "fw_start" probe in the BdsEntry()
> > > [MdeModulePkg/Universal/BdsDxe/BdsEntry.c], I'm not sure if it is
> > > "near" to the real entry point.
> > >  qemu_init_end: 42.734555
> > >  fw_start: 163.611506 (+120.876951)
> > >  fw_do_boot: 369.713760 (+206.102254)
> > >  linux_start_boot: 370.960364 (+1.246604)
> > >  linux_start_user: 796.799667 (+425.839303)
> > >
> > > For OVMF case, are reasonable the times that I measured? Do you use a
> > > different configuration?
> > >
> >
> > That is the same OVMF configuration as we normally use, I have the
> > configuration file checked in. Those numbers look like the ones I
> > expect.
>
> Perfect!
>
> Please, let me know if you have something more important than boot
> time to help you.
>
> Regards,
> Stefano
>
> >
> > > Thanks,
> > > Stefano
> > >
> >
> > Great work,
> >
> > Rob
> >
>
> --
> Stefano Garzarella
> Red Hat



--
Stefano Garzarella
Red Hat


* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-03 14:27     ` Stefano Garzarella
@ 2018-12-03 15:44       ` Rob Bradford
  2018-12-03 16:35         ` Stefano Garzarella
  2018-12-13 11:19       ` Stefano Garzarella
  1 sibling, 1 reply; 18+ messages in thread
From: Rob Bradford @ 2018-12-03 15:44 UTC (permalink / raw)
  To: Stefano Garzarella; +Cc: Samuel Ortiz, qemu-devel, Stefan Hajnoczi

Hi Stefano, thanks for capturing all these numbers,

On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> Hi Rob,
> I continued to investigate the boot time, and as you suggested I
> looked also at qemu-lite 2.11.2
> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> did the following tests using the Kata kernel configuration
> (
> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> )
> 
> To compare the results with qemu-lite direct kernel load, I added
> another tracepoint:
> - linux_start_kernel: first entry of the Linux kernel
> (start_kernel())
> 

Great, do you have a set of patches available that add all these trace
points? It would be great for reproduction.

> As you can see, NEMU is faster to jump to the kernel
> (linux_start_kernel) than qemu-lite when uses qboot or seabios with
> virt support, but the time to the user space is strangely high, maybe
> the kernel configuration that I used is not the best one.
> Do you suggest another kernel configuration?
> 

This looks very bad. This isn't the kernel configuration we normally
test with in our automated test system, but it is definitely one we
support as part of our partnership with the Kata team. It's a high
priority for me to investigate that. Have you saved the kernel
messages, as they might be helpful?

> Anyway, I obtained the best boot time with qemu-lite and direct
> kernel
> load (vmlinux ELF image). I think because the kernel was not
> compressed. Indeed, looking to the others test, the kernel
> decompression (bzImage) takes about 80 ms (linux_start_kernel -
> linux_start_boot). (I'll investigate better)
> 

Yup, being able to load an uncompressed kernel is one of the big
advantages of qemu-lite. I wonder if we could bring that feature into
qemu itself to supplement the existing firmware-based kernel loading.

> * QEMU
>     + qboot
>       qemu_init_end: 41.306968
>       fw_start: 41.478601 (+0.171633)
>       fw_do_boot: 47.03688 (+5.558279)
>       linux_start_boot: 48.80918 (+1.7723)
>       linux_start_kernel: 128.474013 (+79.664833)
>       linux_start_user: 347.257097 (+218.783084)
>     + seabios
>       qemu_init_end: 40.703177
>       fw_start: 40.902842 (+0.199665)
>       fw_do_boot: 54.604878 (+13.702036)
>       linux_start_boot: 56.314063 (+1.709185)
>       linux_start_kernel: 136.942521 (+80.628458)
>       linux_start_user: 356.211221 (+219.2687)
>     + OVMF
>       qemu_init_end: 44.055712
>       fw_start: 168.094131 (+124.038419)
>       fw_do_boot: 322.222745 (+154.128614)
>       linux_start_boot: 323.809639 (+1.586894)
>       linux_start_kernel: 401.756773 (+77.947134)
>       linux_start_user: 604.601438 (+202.844665)
> * NEMU "virt" machine
>     + qboot
>       qemu_init_end: 27.732925
>       fw_start: 27.887483 (+0.154558)
>       fw_do_boot: 29.20226 (+1.314777)
>       linux_start_boot: 30.842672 (+1.640412)
>       linux_start_kernel: 111.639445 (+80.796773)
>       linux_start_user: 862.294876 (+750.655431)
>     + seabios
>       qemu_init_end: 28.203907
>       fw_start: 28.68285 (+0.478943)
>       fw_do_boot: 36.337377 (+7.654527)
>       linux_start_boot: 37.995622 (+1.658245)
>       linux_start_kernel: 118.11591 (+80.120288)
>       linux_start_user: 897.110477 (+778.994567)
>     + OVMF
>       qemu_init_end: 31.219852
>       fw_start: 153.304126 (+122.084274)
>       fw_do_boot: 183.69459 (+30.390464)
>       linux_start_boot: 185.296567 (+1.601977)
>       linux_start_kernel: 361.11524 (+175.818673)
>       linux_start_user: 1231.590077 (+870.474837)
> * Qemu-lite
>     + no FW (-machine pc,...,fw=off KERNEL=/path/to/vmlinux)
>       qemu_init_end: 32.243801
>       linux_start_kernel: 32.330323 (+0.086522)
>       linux_start_user: 208.742712 (+176.412389)
>     + qboot
>       qemu_init_end: 25.393047
>       fw_start: 25.545285 (+0.152238)
>       fw_do_boot: 31.073253 (+5.527968)
>       linux_start_boot: 33.139721 (+2.066468)
>       linux_start_kernel: 114.604384 (+81.464663)
>       linux_start_user: 287.93128 (+173.326896)
>     + seabios
>       qemu_init_end: 25.448283
>       fw_start: 25.628977 (+0.180694)
>       fw_do_boot: 38.551957 (+12.92298)
>       linux_start_boot: 40.220319 (+1.668362)
>       linux_start_kernel: 126.392991 (+86.172672)
>       linux_start_user: 308.807806 (+182.414815)
>     + OVMF
>       qemu_init_end: 27.616988
>       fw_start: 151.272286 (+123.655298)
>       fw_do_boot: 309.068926 (+157.79664)
>       linux_start_boot: 310.602074 (+1.533148)
>       linux_start_kernel: 387.83349 (+77.231416)
>       linux_start_user: 576.057453 (+188.223963)
> 
> Testbed:
> * seabios: https://github.com/rbradford/seabios/tree/virt-x86 + patch
> to disable debug writes
> (
> https://mail.coreboot.org/pipermail/seabios/2018-December/012654.html
> )
> * qboot: https://github.com/rbradford/qboot/tree/virt-x86
> * OVMF: https://github.com/intel/ovmf-virt/tree/virt-x86
> * QEMU command: $QEMU -bios $BIOS -machine
> $MACHINE,accel=kvm,kernel_irqchip,nvdimm -cpu host -m
> 512,maxmem=1G,slots=1 -smp 1 -kernel $KERNEL -append 'root=/dev/pmem0
> ro console=hvc0' -vga none -display none -no-user-config -nodefaults
> -object memory-backend-file,id=mem0,share,mem-
> path=$ROOTFS,size=$ROOTFS_SIZE
> -device nvdimm,memdev=mem0,id=nv0 -chardev stdio,id=virtiocon0
> -device
> virtio-serial -device virtconsole,chardev=virtiocon0
> 

Cheers,
Rob

> Cheers,
> Stefano
> 
> On Tue, Nov 27, 2018 at 3:21 PM Stefano Garzarella <
> sgarzare@redhat.com> wrote:
> > 
> > Hi Rob,
> > 
> > On Tue, Nov 27, 2018 at 10:57 AM Rob Bradford <
> > robert.bradford@intel.com> wrote:
> > > 
> > > Hi Stefano,
> > > 
> > > On Mon, 2018-11-26 at 17:40 +0100, Stefano Garzarella wrote:
> > > > Hi Samuel, Rob,
> > > > I'm proceeding to compare several x86 firmwares in order to
> > > > understand
> > > > which suits better with -kernel option to have a fast boot with
> > > > QEMU/NEMU.
> > > > 
> > > > For your use cases, what boot time do you expect?
> > > > 
> > > 
> > > Kata Containers, one of the use cases we're looking at, is very
> > > sensitive to boot time, although it is configurable, the default
> > > configuration is to use qemu-lite with it's direct to kernel
> > > loader.
> > > 
> > > That is probably a baseline we should be comparing with.
> > > 
> > > As an aside, yesterday I posted the Seabios patches for enabling
> > > on
> > > NEMU/virt:
> > > 
> > > https://github.com/rbradford/seabios/tree/virt-x86
> > > 
> > > I didn't yet start optimising the configuration file we use for
> > > it
> > > beyond getting it < 128KiB or apply any optimisation patches.
> > > 
> > > And I also have some qboot patches too:
> > > 
> > > https://github.com/rbradford/qboot/tree/virt-x86
> > > 
> > > So we're in a situation where we can start comparing other
> > > machine
> > > types against virt too.
> > 
> > Thanks for the pointers, I'll try to do the same tests with qemu-
> > lite
> > and direct kernel loader and also with NEMU "virt" machine, using
> > your
> > versions of SeaBIOS and qboot.
> > 
> > > 
> > > > 
> > > > I compared SeaBIOS, qboot, and OVMF. I started each test using
> > > > this
> > > > qemu parameters: "./qemu-system-x86_64 -bios path/to/bios.bin
> > > > -m 1G
> > > > -cpu host -M accel=kvm -vga none -kernel path/to/bzImage
> > > > -initrd
> > > > path/to/rootfs.cpio ..."
> > > > 
> > > > As Samuel suggested, I added the total time to userspace adding
> > > > a
> > > > probe in the kernel_init(), so the times (in msec) that I
> > > > measured
> > > > are:
> > > > - qemu_init_end: first kvm_entry (i.e. QEMU initialized has
> > > > finished)
> > > > - fw_start: first entry of the firmware
> > > > - fw_do_boot: after the firmware initialization (e.g. PCI
> > > > setup,
> > > > etc.)
> > > > - linux_start_boot: before the jump to the Linux kernel
> > > > - linux_start_user: before starting the init process
> > > > 
> > > > * SeaBIOS
> > > > Default configuration without debug messages
> > > > (CONFIG_DEBUG_LEVEL=0) +
> > > > Stephen's patch (tpm: Check for TPM related ACPI tables before
> > > > attempting hw) + my patch (qemu: fast boot when linuxboot
> > > > optionrom
> > > > is
> > > > used).
> > > >  qemu_init_end: 41.634812
> > > >  fw_start: 41.857374 (+0.222562)
> > > >  fw_do_boot: 52.754109 (+10.896735)
> > > >  linux_start_boot: 54.117220 (+1.363111)
> > > >  linux_start_user: 495.684199 (+441.566979)
> > > > 
> > > > * qboot
> > > > Default configuration + my patch (pci: reduce pci_foreach()
> > > > calls).
> > > >  qemu_init_end: 40.233717
> > > >  fw_start: 40.384048 (+0.150331)
> > > >  fw_do_boot: 45.660497 (+5.276449)
> > > >  linux_start_boot: 47.252119 (+1.591622)
> > > >  linux_start_user: 509.173886 (+461.921767)
> > > > 
> > > > * OVMF (https://github.com/intel/ovmf-virt)
> > > > I followed this script
> > > > (
> > > > 
https://github.com/intel/nemu/blob/topic/virt-x86/tools/CI/run_nats.sh
> > > > )
> > > > to build OVMF.
> > > > Note: I put the "fw_start" probe in the BdsEntry()
> > > > [MdeModulePkg/Universal/BdsDxe/BdsEntry.c], I'm not sure if it
> > > > is
> > > > "near" to the real entry point.
> > > >  qemu_init_end: 42.734555
> > > >  fw_start: 163.611506 (+120.876951)
> > > >  fw_do_boot: 369.713760 (+206.102254)
> > > >  linux_start_boot: 370.960364 (+1.246604)
> > > >  linux_start_user: 796.799667 (+425.839303)
> > > > 
> > > > For OVMF case, are reasonable the times that I measured? Do you
> > > > use a
> > > > different configuration?
> > > > 
> > > 
> > > That is the same OVMF configuration as we normally use, I have
> > > the
> > > configuration file checked in. Those numbers look like the ones I
> > > expect.
> > 
> > Perfect!
> > 
> > Please, let me know if you have something more important than boot
> > time to help you.
> > 
> > Regards,
> > Stefano
> > 
> > > 
> > > > Thanks,
> > > > Stefano
> > > > 
> > > 
> > > Great work,
> > > 
> > > Rob
> > > 
> > 
> > --
> > Stefano Garzarella
> > Red Hat
> 
> 
> 
> --
> Stefano Garzarella
> Red Hat


* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-03 15:44       ` Rob Bradford
@ 2018-12-03 16:35         ` Stefano Garzarella
  2018-12-04 22:44           ` Maran Wilson
  2018-12-05 12:26           ` Philippe Mathieu-Daudé
  0 siblings, 2 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-03 16:35 UTC (permalink / raw)
  To: Rob Bradford; +Cc: Samuel Ortiz, qemu-devel, Stefan Hajnoczi

[-- Attachment #1: Type: text/plain, Size: 2683 bytes --]

On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
>
> Hi Stefano, thanks for capturing all these numbers,
>
> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> > Hi Rob,
> > I continued to investigate the boot time, and as you suggested I
> > looked also at qemu-lite 2.11.2
> > (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> > did the following tests using the Kata kernel configuration
> > (
> > https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> > )
> >
> > To compare the results with qemu-lite direct kernel load, I added
> > another tracepoint:
> > - linux_start_kernel: first entry of the Linux kernel
> > (start_kernel())
> >
>
> Great, do you have a set of patches available that all these trace
> points. It would be great for reproduction.

For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
I'm also sharing a python script that I'm using with perf to extract
the numbers in this way:

$ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
sched:sched_process_exec -o /tmp/qemu_perf.data &
$ # start qemu/nemu multiple times
$ killall perf
$ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
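
(The attached qemu-perf-script.py and tracepoint patches are not
reproduced here. Just to give an idea of the post-processing involved,
below is a purely illustrative sketch, not the attached script: it
parses the plain-text output of "perf script" instead of using the perf
scripting interface, and it only derives the qemu_init_end value,
assuming that this is measured from the exec of the QEMU process (which
is why sched:sched_process_exec is recorded) to the first kvm_entry.
The fw_*/linux_* markers, presumably carried by the kvm_pio events, are
not handled, and the line layout it assumes may vary with the perf
version.)

#!/usr/bin/env python3
# Illustrative sketch only -- not the attached qemu-perf-script.py.
# Usage: perf script -i /tmp/qemu_perf.data | python3 boot-delta.py
import re
import sys

# Typical line: "comm pid [cpu] 1234.567890: kvm:kvm_entry: vcpu 0"
# (assumed layout; it can vary with the perf version/options)
line_re = re.compile(r"\s(\d+\.\d+):\s+(\S+):")

exec_ts = None
for line in sys.stdin:
    m = line_re.search(line)
    if not m:
        continue
    ts, event = float(m.group(1)), m.group(2)
    if event == "sched:sched_process_exec":
        exec_ts = ts            # a VM (re)start: QEMU/NEMU was exec'd
    elif event == "kvm:kvm_entry" and exec_ts is not None:
        # time from QEMU exec to the first VM entry, i.e. qemu_init_end
        print("qemu_init_end: %f msec" % ((ts - exec_ts) * 1000.0))
        exec_ts = None          # only the first kvm_entry per run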

>
> > As you can see, NEMU is faster to jump to the kernel
> > (linux_start_kernel) than qemu-lite when uses qboot or seabios with
> > virt support, but the time to the user space is strangely high, maybe
> > the kernel configuration that I used is not the best one.
> > Do you suggest another kernel configuration?
> >
>
> This looks very bad. This isn't the kernel configuration we normally
> test with in our automated test system but is definitely one we support
> as part of our partnernship with the Kata team. It's a high priority
> for me to try and investigate that. Have you saved the kernel messages
> as they might be helpful?

Yes, I'm attaching the dmesg output with nemu and qemu.

>
> > Anyway, I obtained the best boot time with qemu-lite and direct
> > kernel
> > load (vmlinux ELF image). I think because the kernel was not
> > compressed. Indeed, looking to the others test, the kernel
> > decompression (bzImage) takes about 80 ms (linux_start_kernel -
> > linux_start_boot). (I'll investigate better)
> >
>
> Yup being able to load an uncompressed kernel is one of the big
> advantages of qemu-lite. I wonder if we could bring that feature into
> qemu itself to supplement the existing firmware based kernel loading.

I think so. I'll try to understand if we can merge the qemu-lite
direct kernel loading into qemu.


Thanks,
Stefano


-- 
Stefano Garzarella
Red Hat

[-- Attachment #2: qemu_dmesg.txt --]
[-- Type: text/plain, Size: 21728 bytes --]


[    0.000000] Linux version 4.14.67 (stefano@stelenovo) (gcc version 7.3.0 (Buildroot 2018.11-rc1-00017-g6a74acb6fb)) #3 SMP Mon Dec 3 15:56:35 CET 2018
[    0.000000] Command line: root=/dev/pmem0 ro console=hvc0
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001ffdefff] usable
[    0.000000] BIOS-e820: [mem 0x000000001ffdf000-0x000000001fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] Hypervisor detected: KVM
[    0.000000] tsc: Fast TSC calibration using PIT
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x1ffdf max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: write-back
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 0080000000 mask FF80000000 uncachable
[    0.000000]   1 disabled
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC  
[    0.000000] found SMP MP-table at [mem 0x000f3ee0-0x000f3eef] mapped at [ffffffffff200ee0]
[    0.000000] Base memory trampoline at [ffff880000099000] 99000 size 24576
[    0.000000] Using GB pages for direct mapping
[    0.000000] BRK [0x024f3000, 0x024f3fff] PGTABLE
[    0.000000] BRK [0x024f4000, 0x024f4fff] PGTABLE
[    0.000000] BRK [0x024f5000, 0x024f5fff] PGTABLE
[    0.000000] BRK [0x024f6000, 0x024f6fff] PGTABLE
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000000F3D80 000014 (v00 BOCHS )
[    0.000000] ACPI: RSDT 0x000000001FFE0E17 00003C (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACP 0x000000001FFE089E 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x000000001FFDF040 00185E (v01 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACS 0x000000001FFDF000 000040
[    0.000000] ACPI: APIC 0x000000001FFE0912 000078 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: HPET 0x000000001FFE098A 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
[    0.000000] ACPI: SRAT 0x000000001FFE09C2 000108 (v01 BOCHS  BXPCSRAT 00000001 BXPC 00000001)
[    0.000000] ACPI: SSDT 0x000000001FFE0ACA 00026D (v01 BOCHS  NVDIMM   00000001 BXPC 00000001)
[    0.000000] ACPI: NFIT 0x000000001FFE0D37 0000E0 (v01 BOCHS  BXPCNFIT 00000001 BXPC 00000001)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 0:1ffdb001, primary cpu clock
[    0.000000] kvm-clock: using sched offset of 134980024 cycles
[    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x000000001ffdefff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000001ffdefff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000001ffdefff]
[    0.000000] On node 0 totalpages: 130941
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 21 pages reserved
[    0.000000]   DMA zone: 3998 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 1984 pages used for memmap
[    0.000000]   DMA32 zone: 126943 pages, LIFO batch:31
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[    0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.000000] ACPI: IRQ0 used by override.
[    0.000000] ACPI: IRQ5 used by override.
[    0.000000] ACPI: IRQ9 used by override.
[    0.000000] ACPI: IRQ10 used by override.
[    0.000000] ACPI: IRQ11 used by override.
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] ACPI: HPET id: 0x8086a201 base: 0xfed00000
[    0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] e820: [mem 0x20000000-0xfeffbfff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on KVM
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.000000] setup_percpu: NR_CPUS:240 nr_cpumask_bits:240 nr_cpu_ids:1 nr_node_ids:1
[    0.000000] percpu: Embedded 42 pages/cpu @ffff88001fc00000 s131800 r8192 d32040 u2097152
[    0.000000] pcpu-alloc: s131800 r8192 d32040 u2097152 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 
[    0.000000] KVM setup async PF for cpu 0
[    0.000000] kvm-stealtime: cpu 0, msr 1fc14e80
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 128872
[    0.000000] Kernel command line: root=/dev/pmem0 ro console=hvc0
[    0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.000000] Memory: 492988K/523764K available (10248K kernel code, 601K rwdata, 2812K rodata, 832K init, 524K bss, 30776K reserved, 0K cma-reserved)
[    0.000000] Kernel/User page tables isolation: enabled
[    0.001000] Hierarchical RCU implementation.
[    0.001000] 	RCU restricting CPUs from NR_CPUS=240 to nr_cpu_ids=1.
[    0.001000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.001000] NR_IRQS: 15616, nr_irqs: 256, preallocated irqs: 16
[    0.001000] 	Offload RCU callbacks from CPUs: .
[    0.001000] Console: colour *CGA 80x25
[    0.001000] clocksource: hpet: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 19112604467 ns
[    0.001000] hpet clockevent registered
[    0.001008] tsc: Detected 2494.222 MHz processor
[    0.001012] Calibrating delay loop (skipped) preset value.. 4988.44 BogoMIPS (lpj=2494222)
[    0.001012] pid_max: default: 32768 minimum: 301
[    0.001021] ACPI: Core revision 20170728
[    0.001826] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.001875] Security Framework initialized
[    0.001888] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.001890] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.002135] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.002136] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[    0.002139] Spectre V2 : Mitigation: Full generic retpoline
[    0.002140] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.002140] Spectre V2 : Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier
[    0.002141] Spectre V2 : Enabling Restricted Speculation for firmware calls
[    0.002142] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[    0.007969] Freeing SMP alternatives memory: 32K
[    0.008509] smpboot: Max logical packages: 1
[    0.008860] x2apic enabled
[    0.009006] Switched APIC routing to physical x2apic.
[    0.011000] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[    0.011000] TSC deadline timer enabled
[    0.011000] smpboot: CPU0: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
[    0.011000] Performance Events: Haswell events, Intel PMU driver.
[    0.011000] ... version:                2
[    0.011000] ... bit width:              48
[    0.011000] ... generic registers:      4
[    0.011000] ... value mask:             0000ffffffffffff
[    0.011000] ... max period:             000000007fffffff
[    0.011000] ... fixed-purpose events:   3
[    0.011000] ... event mask:             000000070000000f
[    0.011000] Hierarchical SRCU implementation.
[    0.011000] smp: Bringing up secondary CPUs ...
[    0.011000] smp: Brought up 1 node, 1 CPU
[    0.011000] smpboot: Total of 1 processors activated (4988.44 BogoMIPS)
[    0.011000] devtmpfs: initialized
[    0.011000] x86/mm: Memory block size: 128MB
[    0.011082] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.011087] futex hash table entries: 256 (order: 2, 16384 bytes)
[    0.011241] NET: Registered protocol family 16
[    0.011381] cpuidle: using governor menu
[    0.011399] ACPI: bus type PCI registered
[    0.011401] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.011525] PCI: Using configuration type 1 for base access
[    0.011550] core: PMU erratum BJ122, BV98, HSD29 workaround disabled, HT off
[    0.012658] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.012660] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.012746] ACPI: Added _OSI(Module Device)
[    0.012747] ACPI: Added _OSI(Processor Device)
[    0.012748] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.012749] ACPI: Added _OSI(Processor Aggregator Device)
[    0.013820] ACPI: Interpreter enabled
[    0.013825] ACPI: (supports S0 S5)
[    0.013826] ACPI: Using IOAPIC for interrupt routing
[    0.013833] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.014000] ACPI: Enabled 4 GPEs in block 00 to 0F
[    0.015809] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.015813] acpi PNP0A03:00: _OSC: OS supports [ASPM ClockPM Segments MSI]
[    0.015817] acpi PNP0A03:00: _OSC failed (AE_NOT_FOUND); disabling ASPM
[    0.015820] acpi PNP0A03:00: fail to add MMCONFIG information, can't access extended PCI configuration space under this bridge.
[    0.016059] acpiphp: Slot [2] registered
[    0.016095] acpiphp: Slot [3] registered
[    0.016130] acpiphp: Slot [4] registered
[    0.016166] acpiphp: Slot [5] registered
[    0.016201] acpiphp: Slot [6] registered
[    0.016236] acpiphp: Slot [7] registered
[    0.016272] acpiphp: Slot [8] registered
[    0.016307] acpiphp: Slot [9] registered
[    0.016342] acpiphp: Slot [10] registered
[    0.016377] acpiphp: Slot [11] registered
[    0.016414] acpiphp: Slot [12] registered
[    0.016450] acpiphp: Slot [13] registered
[    0.016485] acpiphp: Slot [14] registered
[    0.016519] acpiphp: Slot [15] registered
[    0.016555] acpiphp: Slot [16] registered
[    0.016589] acpiphp: Slot [17] registered
[    0.016626] acpiphp: Slot [18] registered
[    0.016661] acpiphp: Slot [19] registered
[    0.016696] acpiphp: Slot [20] registered
[    0.016732] acpiphp: Slot [21] registered
[    0.016766] acpiphp: Slot [22] registered
[    0.016801] acpiphp: Slot [23] registered
[    0.016837] acpiphp: Slot [24] registered
[    0.016901] acpiphp: Slot [25] registered
[    0.016936] acpiphp: Slot [26] registered
[    0.016971] acpiphp: Slot [27] registered
[    0.017006] acpiphp: Slot [28] registered
[    0.017042] acpiphp: Slot [29] registered
[    0.017076] acpiphp: Slot [30] registered
[    0.017112] acpiphp: Slot [31] registered
[    0.017125] PCI host bridge to bus 0000:00
[    0.017127] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.017129] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.017130] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.017132] pci_bus 0000:00: root bus resource [mem 0x20000000-0xfebfffff window]
[    0.017133] pci_bus 0000:00: root bus resource [mem 0x180000000-0x1ffffffff window]
[    0.017135] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.017179] pci 0000:00:00.0: [8086:1237] type 00 class 0x060000
[    0.017695] pci 0000:00:01.0: [8086:7000] type 00 class 0x060100
[    0.018434] pci 0000:00:01.1: [8086:7010] type 00 class 0x010180
[    0.020223] pci 0000:00:01.1: reg 0x20: [io  0xc040-0xc04f]
[    0.021037] pci 0000:00:01.1: legacy IDE quirk: reg 0x10: [io  0x01f0-0x01f7]
[    0.021039] pci 0000:00:01.1: legacy IDE quirk: reg 0x14: [io  0x03f6]
[    0.021040] pci 0000:00:01.1: legacy IDE quirk: reg 0x18: [io  0x0170-0x0177]
[    0.021042] pci 0000:00:01.1: legacy IDE quirk: reg 0x1c: [io  0x0376]
[    0.021230] pci 0000:00:01.3: [8086:7113] type 00 class 0x068000
[    0.021834] pci 0000:00:01.3: quirk: [io  0x0600-0x063f] claimed by PIIX4 ACPI
[    0.021856] pci 0000:00:01.3: quirk: [io  0x0700-0x070f] claimed by PIIX4 SMB
[    0.022179] pci 0000:00:02.0: [1af4:1003] type 00 class 0x078000
[    0.023936] pci 0000:00:02.0: reg 0x10: [io  0xc000-0xc03f]
[    0.024490] pci 0000:00:02.0: reg 0x14: [mem 0xfebff000-0xfebfffff]
[    0.027009] pci 0000:00:02.0: reg 0x20: [mem 0xfebf8000-0xfebfbfff 64bit pref]
[    0.028925] pci_bus 0000:00: on NUMA node 0
[    0.029186] ACPI: PCI Interrupt Link [LNKA] (IRQs 5 *10 11)
[    0.029286] ACPI: PCI Interrupt Link [LNKB] (IRQs 5 *10 11)
[    0.029374] ACPI: PCI Interrupt Link [LNKC] (IRQs 5 10 *11)
[    0.029459] ACPI: PCI Interrupt Link [LNKD] (IRQs 5 10 *11)
[    0.029508] ACPI: PCI Interrupt Link [LNKS] (IRQs *9)
[    0.029713] SCSI subsystem initialized
[    0.029722] pps_core: LinuxPPS API ver. 1 registered
[    0.029723] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.029725] PTP clock support registered
[    0.029727] PCI: Using ACPI for IRQ routing
[    0.029728] PCI: pci_cache_line_size set to 64 bytes
[    0.029855] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
[    0.029856] e820: reserve RAM buffer [mem 0x1ffdf000-0x1fffffff]
[    0.029968] clocksource: Switched to clocksource kvm-clock
[    0.029976] pnp: PnP ACPI init
[    0.029976] pnp 00:00: Plug and Play ACPI device, IDs PNP0b00 (active)
[    0.029976] pnp 00:01: Plug and Play ACPI device, IDs PNP0303 (active)
[    0.029976] pnp 00:02: Plug and Play ACPI device, IDs PNP0f13 (active)
[    0.029976] pnp 00:03: [dma 2]
[    0.029976] pnp 00:03: Plug and Play ACPI device, IDs PNP0700 (active)
[    0.029976] pnp: PnP ACPI: found 4 devices
[    0.029976] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.029976] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.029976] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.029976] pci_bus 0000:00: resource 7 [mem 0x20000000-0xfebfffff window]
[    0.029976] pci_bus 0000:00: resource 8 [mem 0x180000000-0x1ffffffff window]
[    0.029993] NET: Registered protocol family 2
[    0.030091] TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[    0.030097] TCP bind hash table entries: 4096 (order: 4, 65536 bytes)
[    0.030104] TCP: Hash tables configured (established 4096 bind 4096)
[    0.030113] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.030116] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.030142] NET: Registered protocol family 1
[    0.030148] pci 0000:00:00.0: Limiting direct PCI/PCI transfers
[    0.030167] pci 0000:00:01.0: PIIX3: Enabling Passive Release
[    0.030189] pci 0000:00:01.0: Activating ISA DMA hang workarounds
[    0.030231] PCI: CLS 0 bytes, default 64
[    0.030275] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23f3e76dd04, max_idle_ns: 440795285707 ns
[    0.030481] workingset: timestamp_bits=46 max_order=17 bucket_order=0
[    0.030559] SGI XFS with security attributes, no debug enabled
[    0.030647] 9p: Installing v9fs 9p2000 file system support
[    0.031884] NET: Registered protocol family 38
[    0.031898] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
[    0.031900] io scheduler noop registered
[    0.031901] io scheduler deadline registered (default)
[    0.031912] io scheduler cfq registered
[    0.031913] io scheduler mq-deadline registered
[    0.031914] io scheduler kyber registered
[    0.031964] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    0.032005] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input0
[    0.032021] ACPI: Power Button [PWRF]
[    0.055432] ACPI: PCI Interrupt Link [LNKB] enabled at IRQ 10
[    0.057373] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.073838] console [hvc0] enabled
[    0.074847] brd: module loaded
[    0.081127] loop: module loaded
[    0.081466] pmem0: detected capacity change from 0 to 62914560
[    0.081628] tun: Universal TUN/TAP device driver, 1.6
[    0.081715] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 5.1.0-k
[    0.081789] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[    0.081917] ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 4.1.0-k
[    0.082019] ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
[    0.082112] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.14-k
[    0.082184] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[    0.082275] i40evf: Intel(R) 40-10 Gigabit Virtual Function Network Driver - version 3.0.0-k
[    0.082358] Copyright (c) 2013 - 2015 Intel Corporation.
[    0.082462] VFIO - User Level meta-driver version: 0.3
[    0.082567] IR NEC protocol handler initialized
[    0.082618] IR RC5(x/sz) protocol handler initialized
[    0.082670] IR RC6 protocol handler initialized
[    0.082712] IR JVC protocol handler initialized
[    0.082764] IR Sony protocol handler initialized
[    0.082807] IR SANYO protocol handler initialized
[    0.082850] IR Sharp protocol handler initialized
[    0.082910] IR MCE Keyboard/mouse protocol handler initialized
[    0.082972] IR XMP protocol handler initialized
[    0.083094] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
[    0.083236] Netfilter messages via NETLINK v0.30.
[    0.083290] nfnl_acct: registering with nfnetlink.
[    0.083372] nf_conntrack version 0.5.0 (4096 buckets, 16384 max)
[    0.083467] ctnetlink v0.93: registering with nfnetlink.
[    0.083571] xt_time: kernel timezone is -0000
[    0.083622] ip_set: protocol 6
[    0.083669] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[    0.083739] IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
[    0.083821] random: get_random_bytes called from ip_vs_conn_init+0xec/0xf1 with crng_init=0
[    0.083972] IPVS: ipvs loaded.
[    0.084036] IPVS: [rr] scheduler registered.
[    0.084089] IPVS: [wrr] scheduler registered.
[    0.084133] IPVS: [lc] scheduler registered.
[    0.084175] IPVS: [wlc] scheduler registered.
[    0.084218] IPVS: [fo] scheduler registered.
[    0.084263] IPVS: [ovf] scheduler registered.
[    0.084308] IPVS: [lblc] scheduler registered.
[    0.084358] IPVS: [lblcr] scheduler registered.
[    0.084408] IPVS: [dh] scheduler registered.
[    0.084451] IPVS: [sh] scheduler registered.
[    0.084493] IPVS: [sed] scheduler registered.
[    0.084544] IPVS: [nq] scheduler registered.
[    0.084597] IPVS: ftp: loaded support on port[0] = 21
[    0.084646] IPVS: [sip] pe registered.
[    0.084750] ip_tables: (C) 2000-2006 Netfilter Core Team
[    0.084823] ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
[    0.084916] arp_tables: arp_tables: (C) 2002 David S. Miller
[    0.084981] Initializing XFRM netlink socket
[    0.085069] NET: Registered protocol family 10
[    0.085254] Segment Routing with IPv6
[    0.085305] NET: Registered protocol family 17
[    0.085365] 9pnet: Installing 9P2000 support
[    0.085446] sched_clock: Marking stable (85033745, 0)->(175671879, -90638134)
[    0.085569] registered taskstats version 1
[    0.085994] EXT4-fs (pmem0): mounting ext2 file system using the ext4 subsystem
[    0.086259] EXT4-fs (pmem0): mounted filesystem without journal. Opts: (null)
[    0.086333] VFS: Mounted root (ext2 filesystem) readonly on device 259:0.
[    0.086449] devtmpfs: mounted
[    0.086835] Freeing unused kernel memory: 832K
[    0.089055] Write protecting the kernel read-only data: 16384k
[    0.089521] Freeing unused kernel memory: 2020K
[    0.091785] Freeing unused kernel memory: 1284K
[    0.094677] EXT4-fs (pmem0): warning: mounting unchecked fs, running e2fsck is recommended
[    0.094952] EXT4-fs (pmem0): re-mounted. Opts: block_validity,barrier,user_xattr,acl
[    0.105154] random: dd: uninitialized urandom read (512 bytes read)

[-- Attachment #3: nemu_dmesg.txt --]
[-- Type: text/plain, Size: 20298 bytes --]


[    0.000000] Linux version 4.14.67 (stefano@stelenovo) (gcc version 7.3.0 (Buildroot 2018.11-rc1-00017-g6a74acb6fb)) #3 SMP Mon Dec 3 15:56:35 CET 2018
[    0.000000] Command line: root=/dev/pmem0 ro console=hvc0
[    0.000000] KERNEL supported cpus:
[    0.000000]   Intel GenuineIntel
[    0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers'
[    0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers'
[    0.000000] x86/fpu: xstate_offset[2]:  576, xstate_sizes[2]:  256
[    0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format.
[    0.000000] e820: BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009fbff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009fc00-0x000000000009ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000000f0000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000001fffcfff] usable
[    0.000000] BIOS-e820: [mem 0x000000001fffd000-0x000000001fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000080000000-0x000000008fffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000feffc000-0x00000000feffffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fffc0000-0x00000000ffffffff] reserved
[    0.000000] NX (Execute Disable) protection: active
[    0.000000] Hypervisor detected: KVM
[    0.000000] tsc: Fast TSC calibration failed
[    0.000000] tsc: Unable to calibrate against PIT
[    0.000000] tsc: No reference (HPET/PMTIMER) available
[    0.000000] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved
[    0.000000] e820: remove [mem 0x000a0000-0x000fffff] usable
[    0.000000] e820: last_pfn = 0x1fffd max_arch_pfn = 0x400000000
[    0.000000] MTRR default type: write-back
[    0.000000] MTRR fixed ranges enabled:
[    0.000000]   00000-9FFFF write-back
[    0.000000]   A0000-BFFFF uncachable
[    0.000000]   C0000-FFFFF write-protect
[    0.000000] MTRR variable ranges enabled:
[    0.000000]   0 base 0020000000 mask FF20000000 uncachable
[    0.000000]   1 disabled
[    0.000000]   2 disabled
[    0.000000]   3 disabled
[    0.000000]   4 disabled
[    0.000000]   5 disabled
[    0.000000]   6 disabled
[    0.000000]   7 disabled
[    0.000000] x86/PAT: Configuration [0-7]: WB  WT  UC- UC  WB  WT  UC- UC  
[    0.000000] found SMP MP-table at [mem 0x000f3ee0-0x000f3eef] mapped at [ffffffffff200ee0]
[    0.000000] Base memory trampoline at [ffff880000099000] 99000 size 24576
[    0.000000] Using GB pages for direct mapping
[    0.000000] BRK [0x024f3000, 0x024f3fff] PGTABLE
[    0.000000] BRK [0x024f4000, 0x024f4fff] PGTABLE
[    0.000000] BRK [0x024f5000, 0x024f5fff] PGTABLE
[    0.000000] BRK [0x024f6000, 0x024f6fff] PGTABLE
[    0.000000] ACPI: Early table checksum verification disabled
[    0.000000] ACPI: RSDP 0x00000000000F3D80 000024 (v02 BOCHS )
[    0.000000] ACPI: XSDT 0x000000001FFFEF7C 000054 (v01 BOCHS  BXPCXSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: FACP 0x000000001FFFE967 00010C (v05 BOCHS  BXPCFACP 00000001 BXPC 00000001)
[    0.000000] ACPI: DSDT 0x000000001FFFD7C0 0011A7 (v02 BOCHS  BXPCDSDT 00000001 BXPC 00000001)
[    0.000000] ACPI: APIC 0x000000001FFFEA73 000078 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
[    0.000000] ACPI: SRAT 0x000000001FFFEAEB 000108 (v01 BOCHS  BXPCSRAT 00000001 BXPC 00000001)
[    0.000000] ACPI: MCFG 0x000000001FFFEBF3 00003C (v01 BOCHS  BXPCMCFG 00000001 BXPC 00000001)
[    0.000000] ACPI: SSDT 0x000000001FFFEC2F 00026D (v01 BOCHS  NVDIMM   00000001 BXPC 00000001)
[    0.000000] ACPI: NFIT 0x000000001FFFEE9C 0000E0 (v01 BOCHS  BXPCNFIT 00000001 BXPC 00000001)
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00
[    0.000000] kvm-clock: cpu 0, msr 0:1fff9001, primary cpu clock
[    0.000000] kvm-clock: using sched offset of 577000846 cycles
[    0.000000] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns
[    0.000000] Zone ranges:
[    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
[    0.000000]   DMA32    [mem 0x0000000001000000-0x000000001fffcfff]
[    0.000000]   Normal   empty
[    0.000000] Movable zone start for each node
[    0.000000] Early memory node ranges
[    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
[    0.000000]   node   0: [mem 0x0000000000100000-0x000000001fffcfff]
[    0.000000] Initmem setup node 0 [mem 0x0000000000001000-0x000000001fffcfff]
[    0.000000] On node 0 totalpages: 130971
[    0.000000]   DMA zone: 64 pages used for memmap
[    0.000000]   DMA zone: 21 pages reserved
[    0.000000]   DMA zone: 3998 pages, LIFO batch:0
[    0.000000]   DMA32 zone: 1984 pages used for memmap
[    0.000000]   DMA32 zone: 126973 pages, LIFO batch:31
[    0.000000] ACPI: Local APIC address 0xfee00000
[    0.000000] ACPI: LAPIC_NMI (acpi_id[0xff] dfl dfl lint[0x1])
[    0.000000] IOAPIC[0]: apic_id 0, version 17, address 0xfec00000, GSI 0-23
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 5 global_irq 5 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 10 global_irq 10 high level)
[    0.000000] ACPI: INT_SRC_OVR (bus 0 bus_irq 11 global_irq 11 high level)
[    0.000000] Using ACPI (MADT) for SMP configuration information
[    0.000000] smpboot: Allowing 1 CPUs, 0 hotplug CPUs
[    0.000000] e820: [mem 0x90000000-0xfeffbfff] available for PCI devices
[    0.000000] Booting paravirtualized kernel on KVM
[    0.000000] clocksource: refined-jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1910969940391419 ns
[    0.000000] setup_percpu: NR_CPUS:240 nr_cpumask_bits:240 nr_cpu_ids:1 nr_node_ids:1
[    0.000000] percpu: Embedded 42 pages/cpu @ffff88001fc00000 s131800 r8192 d32040 u2097152
[    0.000000] pcpu-alloc: s131800 r8192 d32040 u2097152 alloc=1*2097152
[    0.000000] pcpu-alloc: [0] 0 
[    0.000000] KVM setup async PF for cpu 0
[    0.000000] kvm-stealtime: cpu 0, msr 1fc14e80
[    0.000000] Built 1 zonelists, mobility grouping on.  Total pages: 128902
[    0.000000] Kernel command line: root=/dev/pmem0 ro console=hvc0
[    0.000000] PID hash table entries: 2048 (order: 2, 16384 bytes)
[    0.000000] Dentry cache hash table entries: 65536 (order: 7, 524288 bytes)
[    0.000000] Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
[    0.000000] Memory: 493108K/523884K available (10248K kernel code, 601K rwdata, 2812K rodata, 832K init, 524K bss, 30776K reserved, 0K cma-reserved)
[    0.000000] Kernel/User page tables isolation: enabled
[    0.001000] Hierarchical RCU implementation.
[    0.001000] 	RCU restricting CPUs from NR_CPUS=240 to nr_cpu_ids=1.
[    0.001000] RCU: Adjusting geometry for rcu_fanout_leaf=16, nr_cpu_ids=1
[    0.001000] NR_IRQS: 15616, nr_irqs: 256, preallocated irqs: 0
[    0.001000] 	Offload RCU callbacks from CPUs: .
[    0.001000] Console: colour *CGA 80x25
[    0.001000] tsc: Detected 2494.222 MHz processor
[    0.001000] Calibrating delay loop (skipped) preset value.. 4988.44 BogoMIPS (lpj=2494222)
[    0.001000] pid_max: default: 32768 minimum: 301
[    0.001000] ACPI: Core revision 20170728
[    0.001000] ACPI: 2 ACPI AML tables successfully acquired and loaded
[    0.001000] Security Framework initialized
[    0.001000] Mount-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.001000] Mountpoint-cache hash table entries: 1024 (order: 1, 8192 bytes)
[    0.001000] Last level iTLB entries: 4KB 0, 2MB 0, 4MB 0
[    0.001000] Last level dTLB entries: 4KB 0, 2MB 0, 4MB 0, 1GB 0
[    0.001000] Spectre V2 : Mitigation: Full generic retpoline
[    0.001000] Spectre V2 : Spectre v2 / SpectreRSB mitigation: Filling RSB on context switch
[    0.001000] Spectre V2 : Spectre v2 mitigation: Enabling Indirect Branch Prediction Barrier
[    0.001000] Spectre V2 : Enabling Restricted Speculation for firmware calls
[    0.001000] Speculative Store Bypass: Mitigation: Speculative Store Bypass disabled via prctl and seccomp
[    0.001000] Freeing SMP alternatives memory: 32K
[    0.001000] smpboot: Max logical packages: 1
[    0.001000] x2apic enabled
[    0.001000] Switched APIC routing to physical x2apic.
[    0.001000] TSC deadline timer enabled
[    0.001000] smpboot: CPU0: Intel(R) Core(TM) i5-4200M CPU @ 2.50GHz (family: 0x6, model: 0x3c, stepping: 0x3)
[    0.001000] Performance Events: Haswell events, Intel PMU driver.
[    0.001000] ... version:                2
[    0.001000] ... bit width:              48
[    0.001000] ... generic registers:      4
[    0.001000] ... value mask:             0000ffffffffffff
[    0.001000] ... max period:             000000007fffffff
[    0.001000] ... fixed-purpose events:   3
[    0.001000] ... event mask:             000000070000000f
[    0.001000] Hierarchical SRCU implementation.
[    0.001000] smp: Bringing up secondary CPUs ...
[    0.001000] smp: Brought up 1 node, 1 CPU
[    0.001000] smpboot: Total of 1 processors activated (4988.44 BogoMIPS)
[    0.001000] devtmpfs: initialized
[    0.001000] x86/mm: Memory block size: 128MB
[    0.001000] clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 1911260446275000 ns
[    0.001000] futex hash table entries: 256 (order: 2, 16384 bytes)
[    0.001000] NET: Registered protocol family 16
[    0.001000] cpuidle: using governor menu
[    0.001000] ACPI: bus type PCI registered
[    0.001000] acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
[    0.001000] PCI: MMCONFIG for domain 0000 [bus 00-ff] at [mem 0x80000000-0x8fffffff] (base 0x80000000)
[    0.001000] PCI: MMCONFIG at [mem 0x80000000-0x8fffffff] reserved in E820
[    0.001000] PCI: Using configuration type 1 for base access
[    0.001000] core: PMU erratum BJ122, BV98, HSD29 workaround disabled, HT off
[    0.001638] HugeTLB registered 1.00 GiB page size, pre-allocated 0 pages
[    0.001640] HugeTLB registered 2.00 MiB page size, pre-allocated 0 pages
[    0.001725] ACPI: Added _OSI(Module Device)
[    0.001726] ACPI: Added _OSI(Processor Device)
[    0.001727] ACPI: Added _OSI(3.0 _SCP Extensions)
[    0.001728] ACPI: Added _OSI(Processor Aggregator Device)
[    0.002039] ACPI: Interpreter enabled
[    0.002043] ACPI: (supports S0 S5)
[    0.002044] ACPI: Using IOAPIC for interrupt routing
[    0.002051] PCI: Using host bridge windows from ACPI; if necessary, use "pci=nocrs" and report a bug
[    0.003159] ACPI: PCI Root Bridge [PCI0] (domain 0000 [bus 00-ff])
[    0.003163] acpi PNP0A08:00: _OSC: OS supports [ExtendedConfig ASPM ClockPM Segments MSI]
[    0.003244] acpi PNP0A08:00: _OSC: OS now controls [PCIeHotplug PME AER PCIeCapability]
[    0.003428] acpiphp: Slot [1] registered
[    0.003464] acpiphp: Slot [2] registered
[    0.003498] acpiphp: Slot [3] registered
[    0.003536] acpiphp: Slot [4] registered
[    0.003592] acpiphp: Slot [5] registered
[    0.003650] acpiphp: Slot [6] registered
[    0.003705] acpiphp: Slot [7] registered
[    0.003747] acpiphp: Slot [8] registered
[    0.003781] acpiphp: Slot [9] registered
[    0.003816] acpiphp: Slot [10] registered
[    0.003851] acpiphp: Slot [11] registered
[    0.003886] acpiphp: Slot [12] registered
[    0.003920] acpiphp: Slot [13] registered
[    0.003955] acpiphp: Slot [14] registered
[    0.003989] acpiphp: Slot [15] registered
[    0.004027] acpiphp: Slot [16] registered
[    0.004062] acpiphp: Slot [17] registered
[    0.004097] acpiphp: Slot [18] registered
[    0.004131] acpiphp: Slot [19] registered
[    0.004187] acpiphp: Slot [20] registered
[    0.004221] acpiphp: Slot [21] registered
[    0.004256] acpiphp: Slot [22] registered
[    0.004290] acpiphp: Slot [23] registered
[    0.004325] acpiphp: Slot [24] registered
[    0.004360] acpiphp: Slot [25] registered
[    0.004404] acpiphp: Slot [26] registered
[    0.004443] acpiphp: Slot [27] registered
[    0.004490] acpiphp: Slot [28] registered
[    0.004538] acpiphp: Slot [29] registered
[    0.004576] acpiphp: Slot [30] registered
[    0.004636] acpiphp: Slot [31] registered
[    0.004649] PCI host bridge to bus 0000:00
[    0.004651] pci_bus 0000:00: root bus resource [io  0x0000-0x0cf7 window]
[    0.004652] pci_bus 0000:00: root bus resource [io  0x0d00-0xffff window]
[    0.004654] pci_bus 0000:00: root bus resource [mem 0x000a0000-0x000bffff window]
[    0.004655] pci_bus 0000:00: root bus resource [mem 0x90000000-0xfec00000 window]
[    0.004657] pci_bus 0000:00: root bus resource [mem 0x180000000-0x97fffffff window]
[    0.004658] pci_bus 0000:00: root bus resource [bus 00-ff]
[    0.004701] pci 0000:00:00.0: [8086:0d57] type 00 class 0x060000
[    0.005378] pci 0000:00:01.0: [1af4:1003] type 00 class 0x078000
[    0.006010] pci 0000:00:01.0: reg 0x10: [io  0xc000-0xc03f]
[    0.006732] pci 0000:00:01.0: reg 0x14: [mem 0xfebff000-0xfebfffff]
[    0.008394] pci 0000:00:01.0: reg 0x20: [mem 0xfebf8000-0xfebfbfff 64bit pref]
[    0.009833] pci_bus 0000:00: on NUMA node 0
[    0.010103] SCSI subsystem initialized
[    0.010114] pps_core: LinuxPPS API ver. 1 registered
[    0.010115] pps_core: Software ver. 5.3.6 - Copyright 2005-2007 Rodolfo Giometti <giometti@linux.it>
[    0.010117] PTP clock support registered
[    0.010118] PCI: Using ACPI for IRQ routing
[    0.089044] PCI: pci_cache_line_size set to 64 bytes
[    0.089092] e820: reserve RAM buffer [mem 0x0009fc00-0x0009ffff]
[    0.089094] e820: reserve RAM buffer [mem 0x1fffd000-0x1fffffff]
[    0.089209] clocksource: Switched to clocksource kvm-clock
[    0.089278] pnp: PnP ACPI init
[    0.089404] pnp: PnP ACPI: found 0 devices
[    0.089888] pci_bus 0000:00: resource 4 [io  0x0000-0x0cf7 window]
[    0.089889] pci_bus 0000:00: resource 5 [io  0x0d00-0xffff window]
[    0.089890] pci_bus 0000:00: resource 6 [mem 0x000a0000-0x000bffff window]
[    0.089891] pci_bus 0000:00: resource 7 [mem 0x90000000-0xfec00000 window]
[    0.089892] pci_bus 0000:00: resource 8 [mem 0x180000000-0x97fffffff window]
[    0.089932] NET: Registered protocol family 2
[    0.089959] TCP established hash table entries: 4096 (order: 3, 32768 bytes)
[    0.089959] TCP bind hash table entries: 4096 (order: 4, 65536 bytes)
[    0.089959] TCP: Hash tables configured (established 4096 bind 4096)
[    0.089959] UDP hash table entries: 256 (order: 1, 8192 bytes)
[    0.089959] UDP-Lite hash table entries: 256 (order: 1, 8192 bytes)
[    0.089959] NET: Registered protocol family 1
[    0.089959] PCI: CLS 0 bytes, default 64
[    0.089965] clocksource: tsc: mask: 0xffffffffffffffff max_cycles: 0x23f3e76dd04, max_idle_ns: 440795285707 ns
[    0.089984] platform rtc_cmos: registered platform RTC device (no PNP device found)
[    0.090198] workingset: timestamp_bits=46 max_order=17 bucket_order=0
[    0.090273] SGI XFS with security attributes, no debug enabled
[    0.090354] 9p: Installing v9fs 9p2000 file system support
[    0.091348] NET: Registered protocol family 38
[    0.091356] Block layer SCSI generic (bsg) driver version 0.4 loaded (major 249)
[    0.091357] io scheduler noop registered
[    0.091358] io scheduler deadline registered (default)
[    0.091367] io scheduler cfq registered
[    0.091368] io scheduler mq-deadline registered
[    0.091369] io scheduler kyber registered
[    0.091396] shpchp: Standard Hot Plug PCI Controller Driver version: 0.4
[    0.093467] nfit ACPI0012:00: found a zero length table '0' parsing nfit
[    0.093817] virtio-pci 0000:00:01.0: can't derive routing for PCI INT A
[    0.093819] virtio-pci 0000:00:01.0: PCI INT A: not connected
[    0.095027] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[    0.107672] console [hvc0] enabled
[    0.110137] brd: module loaded
[    0.112387] loop: module loaded
[    0.112695] pmem0: detected capacity change from 0 to 62914560
[    0.112844] tun: Universal TUN/TAP device driver, 1.6
[    0.112943] ixgbe: Intel(R) 10 Gigabit PCI Express Network Driver - version 5.1.0-k
[    0.113085] ixgbe: Copyright (c) 1999-2016 Intel Corporation.
[    0.113231] ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 4.1.0-k
[    0.113335] ixgbevf: Copyright (c) 2009 - 2015 Intel Corporation.
[    0.113424] i40e: Intel(R) Ethernet Connection XL710 Network Driver - version 2.1.14-k
[    0.113511] i40e: Copyright (c) 2013 - 2014 Intel Corporation.
[    0.113602] i40evf: Intel(R) 40-10 Gigabit Virtual Function Network Driver - version 3.0.0-k
[    0.113703] Copyright (c) 2013 - 2015 Intel Corporation.
[    0.113804] VFIO - User Level meta-driver version: 0.3
[    0.113899] IR NEC protocol handler initialized
[    0.114026] IR RC5(x/sz) protocol handler initialized
[    0.114089] IR RC6 protocol handler initialized
[    0.114143] IR JVC protocol handler initialized
[    0.114203] IR Sony protocol handler initialized
[    0.114257] IR SANYO protocol handler initialized
[    0.114311] IR Sharp protocol handler initialized
[    0.114365] IR MCE Keyboard/mouse protocol handler initialized
[    0.114439] IR XMP protocol handler initialized
[    0.114562] device-mapper: ioctl: 4.37.0-ioctl (2017-09-20) initialised: dm-devel@redhat.com
[    0.114730] Netfilter messages via NETLINK v0.30.
[    0.114794] nfnl_acct: registering with nfnetlink.
[    0.114874] nf_conntrack version 0.5.0 (4096 buckets, 16384 max)
[    0.114996] ctnetlink v0.93: registering with nfnetlink.
[    0.115120] xt_time: kernel timezone is -0000
[    0.115185] ip_set: protocol 6
[    0.115237] IPVS: Registered protocols (TCP, UDP, SCTP, AH, ESP)
[    0.115308] IPVS: Connection hash table configured (size=4096, memory=64Kbytes)
[    0.115396] random: get_random_bytes called from ip_vs_conn_init+0xec/0xf1 with crng_init=0
[    0.115509] IPVS: ipvs loaded.
[    0.115558] IPVS: [rr] scheduler registered.
[    0.115617] IPVS: [wrr] scheduler registered.
[    0.115675] IPVS: [lc] scheduler registered.
[    0.115735] IPVS: [wlc] scheduler registered.
[    0.115794] IPVS: [fo] scheduler registered.
[    0.115852] IPVS: [ovf] scheduler registered.
[    0.115913] IPVS: [lblc] scheduler registered.
[    0.115982] IPVS: [lblcr] scheduler registered.
[    0.116070] IPVS: [dh] scheduler registered.
[    0.116132] IPVS: [sh] scheduler registered.
[    0.116190] IPVS: [sed] scheduler registered.
[    0.116249] IPVS: [nq] scheduler registered.
[    0.116309] IPVS: ftp: loaded support on port[0] = 21
[    0.116370] IPVS: [sip] pe registered.
[    0.116481] ip_tables: (C) 2000-2006 Netfilter Core Team
[    0.116564] ipt_CLUSTERIP: ClusterIP Version 0.8 loaded successfully
[    0.116639] arp_tables: arp_tables: (C) 2002 David S. Miller
[    0.116712] Initializing XFRM netlink socket
[    0.116793] NET: Registered protocol family 10
[    0.116986] Segment Routing with IPv6
[    0.117071] NET: Registered protocol family 17
[    0.117147] 9pnet: Installing 9P2000 support
[    0.117236] sched_clock: Marking stable (117048932, 0)->(211518473, -94469541)
[    0.117371] registered taskstats version 1
[    0.117675] EXT4-fs (pmem0): mounting ext2 file system using the ext4 subsystem
[    0.117901] EXT4-fs (pmem0): mounted filesystem without journal. Opts: (null)
[    0.118037] VFS: Mounted root (ext2 filesystem) readonly on device 259:0.
[    0.118198] devtmpfs: mounted
[    0.118610] Freeing unused kernel memory: 832K
[    0.120031] Write protecting the kernel read-only data: 16384k
[    0.120451] Freeing unused kernel memory: 2020K
[    0.122936] Freeing unused kernel memory: 1284K
[    0.125789] EXT4-fs (pmem0): warning: mounting unchecked fs, running e2fsck is recommended
[    0.126217] EXT4-fs (pmem0): re-mounted. Opts: block_validity,barrier,user_xattr,acl
[    0.134829] random: dd: uninitialized urandom read (512 bytes read)

[-- Attachment #4: qemu-perf-script.py --]
[-- Type: text/x-python, Size: 5160 bytes --]

# perf script event handlers, generated by perf script -g python
# Licensed under the terms of the GNU GPL License version 2

# The common_* event handler fields are the most useful fields common to
# all events.  They don't necessarily correspond to the 'common_*' fields
# in the format files.  Those fields not available as handler params can
# be retrieved using Python functions of the form common_*(context).
# See the perf-script-python Documentation for the list of available functions.
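#
# Usage (as described in the mail this script is attached to):
#   $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e sched:sched_process_exec \
#         -o /tmp/qemu_perf.data &
#   $ # start qemu/nemu one or more times
#   $ killall perf
#   $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
#
# For every qemu-system-x86 process it records the exec time, takes the first
# kvm_entry as "qemu_init_end", maps kvm_pio writes on the benchmark IO ports
# (0xf4/0xf5) to named exit points, and prints per-run timings plus min/avg/max
# statistics in milliseconds.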

from __future__ import print_function

import os
import sys
import collections

sys.path.append(os.environ['PERF_EXEC_PATH'] + \
	'/scripts/python/Perf-Trace-Util/lib/Perf/Trace')

from perf_trace_context import *
from Core import *
from Util import *


class QemuTrace:
	APP_NAME = "qemu-system-x86"
	# IO ports for different exit points
	LINUX_EXIT_PORT = 0xf4
	FW_EXIT_PORT = 0xf5
	# Exit point values
	EXIT_POINTS = { 1 : 'fw_start',
			2 : 'linux_start_fwcfg',
			3 : 'linux_start_boot',
			4 : 'fw_do_boot',
			5 : 'linux_start_kernel',
			6 : 'linux_start_user'}

	def __init__(self):
		self.start = 0
		self.qemu_init_end = 0
		self.probes = []

	def print(self, div = 1):
		print(" qemu_init_end: %f" % \
				((self.qemu_init_end - float(self.start))/div))

		pre_ep = 0
		pre_ts = self.qemu_init_end
		for ep, ts in self.probes:
			if ep == pre_ep:
				continue
			if ep in QemuTrace.EXIT_POINTS:
				ep_name = QemuTrace.EXIT_POINTS[ep]
			else:
				ep_name = "Exit point " + str(ep)

			print(" {}: {} (+{})".format(ep_name, (ts - float(self.start))/div, \
				 (ts - float(pre_ts))/div))
			pre_ts = ts
			pre_ep = ep

	@staticmethod
	def stats(pids, traces, div):
		avgQT = QemuTrace()
		minQT = QemuTrace()
		maxQT = QemuTrace()

		count = 0

		for pid in pids:
			count += 1

			print("%d) pid %d" % (count, pid))
			traces[pid].print(div)

			qit = traces[pid].qemu_init_end - traces[pid].start
			avgQT.qemu_init_end += qit
			maxQT.qemu_init_end = max(maxQT.qemu_init_end, qit)

			if (count == 1):
				minQT.qemu_init_end = qit
			else:
				minQT.qemu_init_end = min(minQT.qemu_init_end, qit)

			i = 0
			for ep, ts in traces[pid].probes:

				rel_ts = ts - traces[pid].start

				if (count == 1):
					avgQT.probes.append((ep, rel_ts))
					minQT.probes.append((ep, rel_ts))
					# initialize max with the first trace's value as well,
					# so run #1 is included in the maximum
					maxQT.probes.append((ep, rel_ts))
				else:
					avgQT.probes[i] = (ep, avgQT.probes[i][1] + rel_ts)
					minQT.probes[i] = (ep, min(minQT.probes[i][1], rel_ts))
					maxQT.probes[i] = (ep, max(maxQT.probes[i][1], rel_ts))

				i+=1


		if count > 1:
			avgQT.qemu_init_end /= count

			i = 0
			for ep, ts in avgQT.probes:
				avgQT.probes[i] = (ep, avgQT.probes[i][1] / count)
				i+=1


			print("\nAvg")
			avgQT.print(div)

			print("\nMin")
			minQT.print(div)

			print("\nMax")
			maxQT.print(div)

class Events:

	def __init__(self, ClassTrace):
		self.ClassTrace = ClassTrace
		self.app_name = ClassTrace.APP_NAME
		self.pids = []
		self.traces = autodict()

	def stats(self, div):
		self.ClassTrace.stats(self.pids, self.traces, div)



events = Events(QemuTrace)

def trace_begin():
	print("in trace_begin")

def trace_end():
	print("in trace_end")
	print("Trace %s" % (events.app_name))
	events.stats(1000000)

def sched__sched_process_exec(event_name, context, common_cpu,
	common_secs, common_nsecs, common_pid, common_comm,
	common_callchain, filename, pid, old_pid, perf_sample_dict):

	if (events.app_name != common_comm):
		return

	events.traces[pid] = QemuTrace()
	events.traces[pid].start = perf_sample_dict["sample"]["time"]
	events.pids.append(pid)


	print_header(event_name, common_cpu, common_secs, common_nsecs,
			common_pid, common_comm)
	print()



def kvm__kvm_pio(event_name, context, common_cpu,
	common_secs, common_nsecs, common_pid, common_comm,
	common_callchain, rw, port, size, count,
	val, perf_sample_dict):

	if (events.app_name != common_comm):
		return

	ts = perf_sample_dict["sample"]["time"]
	pid = perf_sample_dict["sample"]["pid"]

	if (port == QemuTrace.FW_EXIT_PORT or port == QemuTrace.LINUX_EXIT_PORT):
		events.traces[pid].probes.append((val, ts))
	else:
		return

	print_header(event_name, common_cpu, common_secs, common_nsecs,
			common_pid, common_comm)


	print("rw=%u, port=0x%x, size=%u, count=%u, val=%u" % \
			(rw, port, size, count, val))
	print()


def kvm__kvm_entry(event_name, context, common_cpu,
	common_secs, common_nsecs, common_pid, common_comm,
	common_callchain, vcpu_id, perf_sample_dict):

	if (events.app_name != common_comm):
		return

	ts = perf_sample_dict["sample"]["time"]
	pid = perf_sample_dict["sample"]["pid"]

	if (events.traces[pid].qemu_init_end != 0):
		return

	events.traces[pid].qemu_init_end = ts


	print_header(event_name, common_cpu, common_secs, common_nsecs,
			common_pid, common_comm)
	print()



def trace_unhandled(event_name, context, event_fields_dict, perf_sample_dict):
	pass

def print_header(event_name, cpu, secs, nsecs, pid, comm):
	print("%-20s %5u %05u.%09u %8u %-20s " % \
	(event_name, cpu, secs, nsecs, pid, comm), end="")

def get_dict_as_string(a_dict, delimiter=' '):
	return delimiter.join(['%s=%s'%(k,str(v))for k,v in sorted(a_dict.items())])

[-- Attachment #5: nemu-bench.patch --]
[-- Type: text/x-patch, Size: 1563 bytes --]

From 95d5bdb27fd6ca63356a618dc89820e5611b8497 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Mon, 3 Dec 2018 17:04:04 +0100
Subject: [PATCH] benchmark: add exit points

---
 pc-bios/optionrom/benchmark.h     | 14 ++++++++++++++
 pc-bios/optionrom/linuxboot_dma.c |  3 +++
 2 files changed, 17 insertions(+)
 create mode 100644 pc-bios/optionrom/benchmark.h

diff --git a/pc-bios/optionrom/benchmark.h b/pc-bios/optionrom/benchmark.h
new file mode 100644
index 0000000000..38d0d88798
--- /dev/null
+++ b/pc-bios/optionrom/benchmark.h
@@ -0,0 +1,14 @@
+#ifndef BENCHMARK_H
+#define BENCHMARK_H
+
+/* IO ports for different exit points */
+#define LINUX_EXIT_PORT 0xf4
+#define FW_EXIT_PORT    0xf5
+
+/* Exit point values */
+#define FW_START    1
+#define LINUX_START_FWCFG 2
+#define LINUX_START_BOOT  3
+#define FW_DO_BOOT 4
+
+#endif
diff --git a/pc-bios/optionrom/linuxboot_dma.c b/pc-bios/optionrom/linuxboot_dma.c
index 4754282ad7..577199388f 100644
--- a/pc-bios/optionrom/linuxboot_dma.c
+++ b/pc-bios/optionrom/linuxboot_dma.c
@@ -59,6 +59,7 @@ asm(
 );
 
 #include "../../include/hw/nvram/fw_cfg_keys.h"
+#include "benchmark.h"
 
 /* QEMU_CFG_DMA_CONTROL bits */
 #define BIOS_CFG_DMA_CTL_ERROR   0x01
@@ -265,6 +266,8 @@ void load_kernel(void)
     segment_addr = ((uint32_t)setup_addr >> 4);
     stack_addr = (uint32_t)(cmdline_addr - setup_addr - 16);
 
+    outl(LINUX_START_BOOT, LINUX_EXIT_PORT);
+
     /* As we are changing critical registers, we cannot leave freedom to the
      * compiler.
      */
-- 
2.19.2


[-- Attachment #6: seabios-bench.patch --]
[-- Type: text/x-patch, Size: 2489 bytes --]

From 9064e438f6d11cad9d50d110009fa8871b3d91a9 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Mon, 3 Dec 2018 16:52:16 +0100
Subject: [PATCH] benchmark: add exit points

---
 src/Kconfig     |  7 +++++++
 src/benchmark.h | 14 ++++++++++++++
 src/boot.c      |  5 +++++
 src/post.c      |  5 +++++
 4 files changed, 31 insertions(+)
 create mode 100644 src/benchmark.h

diff --git a/src/Kconfig b/src/Kconfig
index 55a87cb..7db0f9c 100644
--- a/src/Kconfig
+++ b/src/Kconfig
@@ -566,6 +566,13 @@ menu "Debugging"
             information by outputing strings in a special port present in the
             IO space.
 
+    config DEBUG_BENCHMARK
+        depends on QEMU_HARDWARE
+        bool "benchmark IO port debugging"
+        default y
+        help
+            Benchmark debug
+
     config DEBUG_COREBOOT
         depends on COREBOOT && DEBUG_LEVEL != 0
         bool "coreboot cbmem debug logging"
diff --git a/src/benchmark.h b/src/benchmark.h
new file mode 100644
index 0000000..38d0d88
--- /dev/null
+++ b/src/benchmark.h
@@ -0,0 +1,14 @@
+#ifndef BENCHMARK_H
+#define BENCHMARK_H
+
+/* IO ports for different exit points */
+#define LINUX_EXIT_PORT 0xf4
+#define FW_EXIT_PORT    0xf5
+
+/* Exit point values */
+#define FW_START    1
+#define LINUX_START_FWCFG 2
+#define LINUX_START_BOOT  3
+#define FW_DO_BOOT 4
+
+#endif
diff --git a/src/boot.c b/src/boot.c
index 9f82f3c..c6d8cc1 100644
--- a/src/boot.c
+++ b/src/boot.c
@@ -21,6 +21,7 @@
 #include "string.h" // memset
 #include "util.h" // irqtimer_calc
 #include "tcgbios.h" // tpm_*
+#include "benchmark.h"
 
 
 /****************************************************************
@@ -739,6 +740,10 @@ do_boot(int seq_nr)
     if (seq_nr >= BEVCount)
         boot_fail();
 
+#if CONFIG_DEBUG_BENCHMARK
+    outb(FW_DO_BOOT, FW_EXIT_PORT);
+#endif
+
     // Boot the given BEV type.
     struct bev_s *ie = &BEV[seq_nr];
     switch (ie->type) {
diff --git a/src/post.c b/src/post.c
index f93106a..7636d41 100644
--- a/src/post.c
+++ b/src/post.c
@@ -23,6 +23,7 @@
 #include "string.h" // memset
 #include "util.h" // kbd_init
 #include "tcgbios.h" // tpm_*
+#include "benchmark.h"
 
 
 /****************************************************************
@@ -323,6 +324,10 @@ handle_post(void)
     if (!CONFIG_QEMU && !CONFIG_COREBOOT)
         return;
 
+#if CONFIG_DEBUG_BENCHMARK
+    outb(FW_START, FW_EXIT_PORT);
+#endif
+
     serial_debug_preinit();
     debug_banner();
 
-- 
2.19.2


[-- Attachment #7: qboot-bench.patch --]
[-- Type: text/x-patch, Size: 1494 bytes --]

From f00bbd8da79e4c99bbc96efdf80805e16cf90ecc Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Mon, 3 Dec 2018 16:50:47 +0100
Subject: [PATCH] benchmark: add more trace point

---
 Makefile    | 2 +-
 benchmark.h | 3 +++
 main.c      | 5 ++++-
 3 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/Makefile b/Makefile
index 937b277..47e93ea 100644
--- a/Makefile
+++ b/Makefile
@@ -5,7 +5,7 @@ obj-y += pci.o
 all-y = bios.bin
 all: $(all-y)
 
-CFLAGS := -O2 -g
+CFLAGS := -O2 -g -DBENCHMARK_HACK
 
 BIOS_CFLAGS += $(autodepend-flags) -Wall
 BIOS_CFLAGS += -m32
diff --git a/benchmark.h b/benchmark.h
index 089c549..372a26b 100644
--- a/benchmark.h
+++ b/benchmark.h
@@ -9,5 +9,8 @@
 #define FW_START    1
 #define LINUX_START_FWCFG 2
 #define LINUX_START_BOOT  3
+#define FW_DO_BOOT 4
+#define LINUX_START_KERNEL 5
+#define LINUX_START_USER 6
 
 #endif
diff --git a/main.c b/main.c
index 699cc1a..0bc3707 100644
--- a/main.c
+++ b/main.c
@@ -91,7 +91,7 @@ static bool detect_cbfs_and_boot(void)
 int __attribute__ ((section (".text.startup"))) main(void)
 {
 #ifdef BENCHMARK_HACK
-       outb(FW_EXIT_PORT, FW_START);
+	outb(FW_EXIT_PORT, FW_START);
 #endif
 	setup_hw();
 
@@ -106,6 +106,9 @@ int __attribute__ ((section (".text.startup"))) main(void)
 	extract_acpi();
 	extract_e820();
 	// extract_smbios();
+#ifdef BENCHMARK_HACK
+	outb(FW_EXIT_PORT, FW_DO_BOOT);
+#endif
 	if (!detect_cbfs_and_boot())
 		boot_from_fwcfg();
 	panic();
-- 
2.19.2


[-- Attachment #8: linux-v4.14.67-bench.patch --]
[-- Type: text/x-patch, Size: 1646 bytes --]

From 5ffc99a9a0293cb68c34894201d54d86b6561906 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Fri, 30 Nov 2018 16:56:28 +0100
Subject: [PATCH] boot: add debug port write to trace boot times

---
 init/benchmark.h | 16 ++++++++++++++++
 init/main.c      |  5 +++++
 2 files changed, 21 insertions(+)
 create mode 100644 init/benchmark.h

diff --git a/init/benchmark.h b/init/benchmark.h
new file mode 100644
index 000000000000..372a26b05ab1
--- /dev/null
+++ b/init/benchmark.h
@@ -0,0 +1,16 @@
+#ifndef BENCHMARK_H
+#define BENCHMARK_H
+
+/* IO ports for different exit points */
+#define LINUX_EXIT_PORT 0xf4
+#define FW_EXIT_PORT    0xf5
+
+/* Exit point values */
+#define FW_START    1
+#define LINUX_START_FWCFG 2
+#define LINUX_START_BOOT  3
+#define FW_DO_BOOT 4
+#define LINUX_START_KERNEL 5
+#define LINUX_START_USER 6
+
+#endif
diff --git a/init/main.c b/init/main.c
index c4a45145e102..05f9cc85d07d 100644
--- a/init/main.c
+++ b/init/main.c
@@ -10,6 +10,7 @@
  */
 
 #define DEBUG		/* Enable initcall_debug */
+#include "benchmark.h"
 
 #include <linux/types.h>
 #include <linux/extable.h>
@@ -515,6 +516,8 @@ asmlinkage __visible void __init start_kernel(void)
 	char *command_line;
 	char *after_dashes;
 
+	outb(LINUX_START_KERNEL, LINUX_EXIT_PORT);
+
 	set_task_stack_end_magic(&init_task);
 	smp_setup_processor_id();
 	debug_objects_early_init();
@@ -1008,6 +1011,8 @@ static int __ref kernel_init(void *unused)
 
 	rcu_end_inkernel_boot();
 
+	outb(LINUX_START_USER, LINUX_EXIT_PORT);
+
 	if (ramdisk_execute_command) {
 		ret = run_init_process(ramdisk_execute_command);
 		if (!ret)
-- 
2.19.2


^ permalink raw reply related	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-03 16:35         ` Stefano Garzarella
@ 2018-12-04 22:44           ` Maran Wilson
  2018-12-05 12:06             ` Stefano Garzarella
  2018-12-05 13:20             ` Stefan Hajnoczi
  2018-12-05 12:26           ` Philippe Mathieu-Daudé
  1 sibling, 2 replies; 18+ messages in thread
From: Maran Wilson @ 2018-12-04 22:44 UTC (permalink / raw)
  To: Stefano Garzarella, Rob Bradford
  Cc: Samuel Ortiz, Stefan Hajnoczi, qemu-devel, Liam Merwick

On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
>> Hi Stefano, thanks for capturing all these numbers,
>>
>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
>>> Hi Rob,
>>> I continued to investigate the boot time, and as you suggested I
>>> looked also at qemu-lite 2.11.2
>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
>>> did the following tests using the Kata kernel configuration
>>> (
>>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
>>> )
>>>
>>> To compare the results with qemu-lite direct kernel load, I added
>>> another tracepoint:
>>> - linux_start_kernel: first entry of the Linux kernel
>>> (start_kernel())
>>>
>> Great, do you have a set of patches available that all these trace
>> points. It would be great for reproduction.
> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> nemu/qemu/qemu-lite and linux 4.14 whit the tracepoints.
> I'm also sharing a python script that I'm using with perf to extract
> the numbers in this way:
>
> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> sched:sched_process_exec -o /tmp/qemu_perf.data &
> $ # start qemu/nemu multiple times
> $ killall perf
> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
>
>>> As you can see, NEMU is faster to jump to the kernel
>>> (linux_start_kernel) than qemu-lite when uses qboot or seabios with
>>> virt support, but the time to the user space is strangely high, maybe
>>> the kernel configuration that I used is not the best one.
>>> Do you suggest another kernel configuration?
>>>
>> This looks very bad. This isn't the kernel configuration we normally
>> test with in our automated test system but is definitely one we support
>> as part of our partnernship with the Kata team. It's a high priority
>> for me to try and investigate that. Have you saved the kernel messages
>> as they might be helpful?
> Yes, I'm attaching the dmesg output with nemu and qemu.
>
>>> Anyway, I obtained the best boot time with qemu-lite and direct
>>> kernel
>>> load (vmlinux ELF image). I think because the kernel was not
>>> compressed. Indeed, looking to the others test, the kernel
>>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
>>> linux_start_boot). (I'll investigate better)
>>>
>> Yup being able to load an uncompressed kernel is one of the big
>> advantages of qemu-lite. I wonder if we could bring that feature into
>> qemu itself to supplement the existing firmware based kernel loading.
> I think so, I'll try to understand if we can merge the qemu-lite
> direct kernel loading in qemu.

An attempt was made a long time ago to push the qemu-lite stuff (from 
the Intel Clear Containers project) upstream. As I understand it, the 
main stumbling block that seemed to derail the effort was that it 
involved adding Linux-OS-specific code to Qemu so that Qemu could do 
things like create and populate the zero page that Linux expects when 
entering startup_64().

That ends up baking a lot of very low-level, operating-system-specific 
knowledge about Linux into Qemu code, and understandably a number of 
folks saw problems with going down a path like that.

Since then, we have put together an alternative solution that would 
allow Qemu to boot an uncompressed Linux binary via the x86/HVM direct 
boot ABI (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The 
solution involves first making changes to both the ABI and Linux, and 
then updating Qemu to take advantage of the updated ABI, which is 
already supported by both Linux and FreeBSD for booting VMs. As such, 
Qemu can remain OS-agnostic and just be programmed to the published ABI.
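
For reference, a minimal sketch of the boot info structure that the ABI 
defines (field names follow Xen's public/arch-x86/hvm/start_info.h; the 
header in the Xen tree is the authoritative definition, so treat this as 
an illustration only):

    /* The guest is entered in 32-bit flat protected mode with paging off
     * and %ebx holding the physical address of this structure. */
    #include <stdint.h>

    #define XEN_HVM_START_MAGIC_VALUE 0x336ec578   /* "xEn3" */

    struct hvm_start_info {
        uint32_t magic;          /* must be XEN_HVM_START_MAGIC_VALUE */
        uint32_t version;        /* version of this structure */
        uint32_t flags;          /* SIF_xxx flags */
        uint32_t nr_modules;     /* number of modules passed to the kernel */
        uint64_t modlist_paddr;  /* physical address of the module list */
        uint64_t cmdline_paddr;  /* physical address of the command line */
        uint64_t rsdp_paddr;     /* physical address of the ACPI RSDP */
    };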

The canonical definition for the HVM direct boot ABI is in the Xen tree 
and we needed to make some minor changes to the ABI definition to allow 
KVM guests to also use the same structure and entry point. Those changes 
were accepted to the Xen tree already:
https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html

The corresponding Linux changes that would allow KVM guests to be booted 
via this PVH entry point have already been posted and reviewed:
https://lkml.org/lkml/2018/4/16/1002

The final part is the set of Qemu changes to take advantage of the above 
and boot a KVM guest via an uncompressed kernel binary using the entry 
point defined by the ABI. Liam Merwick will be posting some RFC patches 
very soon to allow this.

Thanks,
-Maran

>
> Thanks,
> Stefano
>
>

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-04 22:44           ` Maran Wilson
@ 2018-12-05 12:06             ` Stefano Garzarella
  2018-12-05 13:20             ` Stefan Hajnoczi
  1 sibling, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-05 12:06 UTC (permalink / raw)
  To: maran.wilson
  Cc: Rob Bradford, Samuel Ortiz, Stefan Hajnoczi, qemu-devel, liam.merwick

On Tue, Dec 4, 2018 at 11:45 PM Maran Wilson <maran.wilson@oracle.com> wrote:
>
> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> >> Hi Stefano, thanks for capturing all these numbers,
> >>
> >> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> >>> Hi Rob,
> >>> I continued to investigate the boot time, and as you suggested I
> >>> looked also at qemu-lite 2.11.2
> >>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> >>> did the following tests using the Kata kernel configuration
> >>> (
> >>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> >>> )
> >>>
> >>> To compare the results with qemu-lite direct kernel load, I added
> >>> another tracepoint:
> >>> - linux_start_kernel: first entry of the Linux kernel
> >>> (start_kernel())
> >>>
> >> Great, do you have a set of patches available that all these trace
> >> points. It would be great for reproduction.
> > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > nemu/qemu/qemu-lite and linux 4.14 whit the tracepoints.
> > I'm also sharing a python script that I'm using with perf to extract
> > the numbers in this way:
> >
> > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> > sched:sched_process_exec -o /tmp/qemu_perf.data &
> > $ # start qemu/nemu multiple times
> > $ killall perf
> > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> >
> >>> As you can see, NEMU is faster to jump to the kernel
> >>> (linux_start_kernel) than qemu-lite when uses qboot or seabios with
> >>> virt support, but the time to the user space is strangely high, maybe
> >>> the kernel configuration that I used is not the best one.
> >>> Do you suggest another kernel configuration?
> >>>
> >> This looks very bad. This isn't the kernel configuration we normally
> >> test with in our automated test system but is definitely one we support
> >> as part of our partnernship with the Kata team. It's a high priority
> >> for me to try and investigate that. Have you saved the kernel messages
> >> as they might be helpful?
> > Yes, I'm attaching the dmesg output with nemu and qemu.
> >
> >>> Anyway, I obtained the best boot time with qemu-lite and direct
> >>> kernel
> >>> load (vmlinux ELF image). I think because the kernel was not
> >>> compressed. Indeed, looking to the others test, the kernel
> >>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
> >>> linux_start_boot). (I'll investigate better)
> >>>
> >> Yup being able to load an uncompressed kernel is one of the big
> >> advantages of qemu-lite. I wonder if we could bring that feature into
> >> qemu itself to supplement the existing firmware based kernel loading.
> > I think so, I'll try to understand if we can merge the qemu-lite
> > direct kernel loading in qemu.
>
> An attempt was made a long time ago to push the qemu-lite stuff (from
> the Intel Clear Containers project) upstream. As I understand it, the
> main stumbling block that seemed to derail the effort was that it
> involved adding Linux OS specific code to Qemu so that Qemu could do
> things like create and populate the zero page that Linux expects when
> entering startup_64().
>
> That ends up being a lot of very low-level, operating specific knowledge
> about Linux that ends up getting baked into Qemu code. And
> understandably, a number of folks saw problems with going down a path
> like that.
>
> Since then, we have put together an alternative solution that would
> allow Qemu to boot an uncompressed Linux binary via the x86/HVM direct
> boot ABI (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The
> solution involves first making changes to both the ABI as well as Linux,
> and then updating Qemu to take advantage of the updated ABI which is
> already supported by both Linux and Free BSD for booting VMs. As such,
> Qemu can remain OS agnostic, and just be programmed to the published ABI.
>
> The canonical definition for the HVM direct boot ABI is in the Xen tree
> and we needed to make some minor changes to the ABI definition to allow
> KVM guests to also use the same structure and entry point. Those changes
> were accepted to the Xen tree already:
> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
>
> The corresponding Linux changes that would allow KVM guests to be booted
> via this PVH entry point have already been posted and reviewed:
> https://lkml.org/lkml/2018/4/16/1002
>
> The final part is the set of Qemu changes to take advantage of the above
> and boot a KVM guest via an uncompressed kernel binary using the entry
> point defined by the ABI. Liam Merwick will be posting some RFC patches
> very soon to allow this.

Hi Maran,
thank you very much for the details!

I'll take a look at the HVM direct boot ABI.

Thanks,
Stefano

>
> Thanks,
> -Maran
>
> >
> > Thanks,
> > Stefano
> >
> >
>


-- 
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-03 16:35         ` Stefano Garzarella
  2018-12-04 22:44           ` Maran Wilson
@ 2018-12-05 12:26           ` Philippe Mathieu-Daudé
  2018-12-05 16:23             ` Stefano Garzarella
  1 sibling, 1 reply; 18+ messages in thread
From: Philippe Mathieu-Daudé @ 2018-12-05 12:26 UTC (permalink / raw)
  To: Stefano Garzarella, Rob Bradford
  Cc: Samuel Ortiz, Stefan Hajnoczi, qemu-devel

Hi Stefano,

On 3/12/18 17:35, Stefano Garzarella wrote:
> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
>>
>> Hi Stefano, thanks for capturing all these numbers,
>>
>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
>>> Hi Rob,
>>> I continued to investigate the boot time, and as you suggested I
>>> looked also at qemu-lite 2.11.2
>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
>>> did the following tests using the Kata kernel configuration
>>> (
>>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
>>> )
>>>
>>> To compare the results with qemu-lite direct kernel load, I added
>>> another tracepoint:
>>> - linux_start_kernel: first entry of the Linux kernel
>>> (start_kernel())
>>>
>>
>> Great, do you have a set of patches available that all these trace
>> points. It would be great for reproduction.
> 
> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> nemu/qemu/qemu-lite and linux 4.14 whit the tracepoints.
> I'm also sharing a python script that I'm using with perf to extract
> the numbers in this way:
> 
> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> sched:sched_process_exec -o /tmp/qemu_perf.data &
> $ # start qemu/nemu multiple times
> $ killall perf
> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
[...]
Good stuff. At some point I'd like to have such scripts in the QEMU
repository to run perf tests on a regular basis (and eventually keep those
metrics somewhere), so we can catch it when we add code that adds
timeouts and increases boot time.

Regards,

Phil.

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-04 22:44           ` Maran Wilson
  2018-12-05 12:06             ` Stefano Garzarella
@ 2018-12-05 13:20             ` Stefan Hajnoczi
  2018-12-05 14:19               ` Boris Ostrovsky
  2018-12-05 18:04               ` Maran Wilson
  1 sibling, 2 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2018-12-05 13:20 UTC (permalink / raw)
  To: Maran Wilson
  Cc: Stefano Garzarella, Rob Bradford, Samuel Ortiz, qemu-devel, Liam Merwick

[-- Attachment #1: Type: text/plain, Size: 5412 bytes --]

On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> > > Hi Stefano, thanks for capturing all these numbers,
> > > 
> > > On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> > > > Hi Rob,
> > > > I continued to investigate the boot time, and as you suggested I
> > > > looked also at qemu-lite 2.11.2
> > > > (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> > > > did the following tests using the Kata kernel configuration
> > > > (
> > > > https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> > > > )
> > > > 
> > > > To compare the results with qemu-lite direct kernel load, I added
> > > > another tracepoint:
> > > > - linux_start_kernel: first entry of the Linux kernel
> > > > (start_kernel())
> > > > 
> > > Great, do you have a set of patches available that all these trace
> > > points. It would be great for reproduction.
> > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > nemu/qemu/qemu-lite and linux 4.14 whit the tracepoints.
> > I'm also sharing a python script that I'm using with perf to extract
> > the numbers in this way:
> > 
> > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> > sched:sched_process_exec -o /tmp/qemu_perf.data &
> > $ # start qemu/nemu multiple times
> > $ killall perf
> > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> > 
> > > > As you can see, NEMU is faster to jump to the kernel
> > > > (linux_start_kernel) than qemu-lite when uses qboot or seabios with
> > > > virt support, but the time to the user space is strangely high, maybe
> > > > the kernel configuration that I used is not the best one.
> > > > Do you suggest another kernel configuration?
> > > > 
> > > This looks very bad. This isn't the kernel configuration we normally
> > > test with in our automated test system but is definitely one we support
> > > as part of our partnernship with the Kata team. It's a high priority
> > > for me to try and investigate that. Have you saved the kernel messages
> > > as they might be helpful?
> > Yes, I'm attaching the dmesg output with nemu and qemu.
> > 
> > > > Anyway, I obtained the best boot time with qemu-lite and direct
> > > > kernel
> > > > load (vmlinux ELF image). I think because the kernel was not
> > > > compressed. Indeed, looking to the others test, the kernel
> > > > decompression (bzImage) takes about 80 ms (linux_start_kernel -
> > > > linux_start_boot). (I'll investigate better)
> > > > 
> > > Yup being able to load an uncompressed kernel is one of the big
> > > advantages of qemu-lite. I wonder if we could bring that feature into
> > > qemu itself to supplement the existing firmware based kernel loading.
> > I think so, I'll try to understand if we can merge the qemu-lite
> > direct kernel loading in qemu.
> 
> An attempt was made a long time ago to push the qemu-lite stuff (from the
> Intel Clear Containers project) upstream. As I understand it, the main
> stumbling block that seemed to derail the effort was that it involved adding
> Linux OS specific code to Qemu so that Qemu could do things like create and
> populate the zero page that Linux expects when entering startup_64().
> 
> That ends up being a lot of very low-level, operating specific knowledge
> about Linux that ends up getting baked into Qemu code. And understandably, a
> number of folks saw problems with going down a path like that.
> 
> Since then, we have put together an alternative solution that would allow
> Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
> (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
> first making changes to both the ABI as well as Linux, and then updating
> Qemu to take advantage of the updated ABI which is already supported by both
> Linux and Free BSD for booting VMs. As such, Qemu can remain OS agnostic,
> and just be programmed to the published ABI.
> 
> The canonical definition for the HVM direct boot ABI is in the Xen tree and
> we needed to make some minor changes to the ABI definition to allow KVM
> guests to also use the same structure and entry point. Those changes were
> accepted to the Xen tree already:
> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> 
> The corresponding Linux changes that would allow KVM guests to be booted via
> this PVH entry point have already been posted and reviewed:
> https://lkml.org/lkml/2018/4/16/1002
> 
> The final part is the set of Qemu changes to take advantage of the above and
> boot a KVM guest via an uncompressed kernel binary using the entry point
> defined by the ABI. Liam Merwick will be posting some RFC patches very soon
> to allow this.

Cool, thanks for doing this work!

How do the boot times compare to qemu-lite and Firecracker's
(https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
boot?

I'm asking because there are several custom approaches to fast kernel
boot and we should make sure that whatever Linux and QEMU end up
natively supporting is likely to work for all projects (NEMU, qemu-lite,
Firecracker) and operating systems (Linux distros, other OSes).

Stefan

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-05 13:20             ` Stefan Hajnoczi
@ 2018-12-05 14:19               ` Boris Ostrovsky
  2018-12-05 18:04               ` Maran Wilson
  1 sibling, 0 replies; 18+ messages in thread
From: Boris Ostrovsky @ 2018-12-05 14:19 UTC (permalink / raw)
  To: Stefan Hajnoczi, Maran Wilson
  Cc: qemu-devel, Samuel Ortiz, Rob Bradford, Stefano Garzarella

[-- Attachment #1: Type: text/plain, Size: 2289 bytes --]

On 12/5/18 8:20 AM, Stefan Hajnoczi wrote:
> On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
>>
>> Since then, we have put together an alternative solution that would allow
>> Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
>> (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
>> first making changes to both the ABI as well as Linux, and then updating
>> Qemu to take advantage of the updated ABI which is already supported by both
>> Linux and Free BSD for booting VMs. As such, Qemu can remain OS agnostic,
>> and just be programmed to the published ABI.
>>
>> The canonical definition for the HVM direct boot ABI is in the Xen tree and
>> we needed to make some minor changes to the ABI definition to allow KVM
>> guests to also use the same structure and entry point. Those changes were
>> accepted to the Xen tree already:
>> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
>>
>> The corresponding Linux changes that would allow KVM guests to be booted via
>> this PVH entry point have already been posted and reviewed:
>> https://lkml.org/lkml/2018/4/16/1002
>>
>> The final part is the set of Qemu changes to take advantage of the above and
>> boot a KVM guest via an uncompressed kernel binary using the entry point
>> defined by the ABI. Liam Merwick will be posting some RFC patches very soon
>> to allow this.
> Cool, thanks for doing this work!
>
> How do the boot times compare to qemu-lite and Firecracker's
> (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> boot?
>
> I'm asking because there are several custom approaches to fast kernel
> boot and we should make sure that whatever Linux and QEMU end up
> natively supporting is likely to work for all projects (NEMU, qemu-lite,
> Firecracker) and operating systems (Linux distros, other OSes).


I should also add that an effort is under way to add support for booting
PVH guests from GRUB. This is currently Xen guests only (obviously), but
since it's based on the ABI that Maran mentioned above, I don't see why a
non-Xen guest couldn't be supported as well.

v6 is here:
https://lists.xenproject.org/archives/html/xen-devel/2018-11/msg03174.html


-boris



[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-05 12:26           ` Philippe Mathieu-Daudé
@ 2018-12-05 16:23             ` Stefano Garzarella
  0 siblings, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-05 16:23 UTC (permalink / raw)
  To: philmd; +Cc: Rob Bradford, Samuel Ortiz, Stefan Hajnoczi, qemu-devel

Hi Philippe,

On Wed, Dec 5, 2018 at 1:26 PM Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> Hi Stefano,
>
> On 3/12/18 17:35, Stefano Garzarella wrote:
> > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> >>
> >> Hi Stefano, thanks for capturing all these numbers,
> >>
> >> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> >>> Hi Rob,
> >>> I continued to investigate the boot time, and as you suggested I
> >>> looked also at qemu-lite 2.11.2
> >>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> >>> did the following tests using the Kata kernel configuration
> >>> (
> >>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> >>> )
> >>>
> >>> To compare the results with qemu-lite direct kernel load, I added
> >>> another tracepoint:
> >>> - linux_start_kernel: first entry of the Linux kernel
> >>> (start_kernel())
> >>>
> >>
> >> Great, do you have a set of patches available that all these trace
> >> points. It would be great for reproduction.
> >
> > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > nemu/qemu/qemu-lite and linux 4.14 whit the tracepoints.
> > I'm also sharing a python script that I'm using with perf to extract
> > the numbers in this way:
> >
> > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> > sched:sched_process_exec -o /tmp/qemu_perf.data &
> > $ # start qemu/nemu multiple times
> > $ killall perf
> > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> [...]
> Good stuff. At some point I'd like to have such scripts in the QEMU
> repository to run perf tests on a regular basis (eventually keep those
> metrics somewhere) to be able to catch when we add code that add
> timeoutes and increase boot time.

Thanks for your feedback. Maybe we need a version of SeaBIOS and the Linux
kernel compiled with the probes.
I'll keep you updated.

Cheers,
Stefano

>
> Regards,
>
> Phil.



--
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-05 13:20             ` Stefan Hajnoczi
  2018-12-05 14:19               ` Boris Ostrovsky
@ 2018-12-05 18:04               ` Maran Wilson
  2018-12-06 10:38                 ` Stefan Hajnoczi
  2018-12-10 13:46                 ` Stefano Garzarella
  1 sibling, 2 replies; 18+ messages in thread
From: Maran Wilson @ 2018-12-05 18:04 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, Samuel Ortiz, Rob Bradford, Stefano Garzarella,
	Maran Wilson, Liam Merwick

On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
>> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
>>> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
>>>> Hi Stefano, thanks for capturing all these numbers,
>>>>
>>>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
>>>>> Hi Rob,
>>>>> I continued to investigate the boot time, and as you suggested I
>>>>> looked also at qemu-lite 2.11.2
>>>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
>>>>> did the following tests using the Kata kernel configuration
>>>>> (
>>>>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
>>>>> )
>>>>>
>>>>> To compare the results with qemu-lite direct kernel load, I added
>>>>> another tracepoint:
>>>>> - linux_start_kernel: first entry of the Linux kernel
>>>>> (start_kernel())
>>>>>
> >>>>> Great, do you have a set of patches available that add all these trace
> >>>>> points? It would be great for reproduction.
> >>>> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> >>>> nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
>>> I'm also sharing a python script that I'm using with perf to extract
>>> the numbers in this way:
>>>
>>> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
>>> sched:sched_process_exec -o /tmp/qemu_perf.data &
>>> $ # start qemu/nemu multiple times
>>> $ killall perf
>>> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
>>>
>>>>> As you can see, NEMU is faster to jump to the kernel
> >>>>> (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> >>>>> virt support, but the time to user space is strangely high; maybe
>>>>> the kernel configuration that I used is not the best one.
>>>>> Do you suggest another kernel configuration?
>>>>>
>>>> This looks very bad. This isn't the kernel configuration we normally
>>>> test with in our automated test system but is definitely one we support
> >>>> as part of our partnership with the Kata team. It's a high priority
>>>> for me to try and investigate that. Have you saved the kernel messages
>>>> as they might be helpful?
>>> Yes, I'm attaching the dmesg output with nemu and qemu.
>>>
>>>>> Anyway, I obtained the best boot time with qemu-lite and direct
>>>>> kernel
>>>>> load (vmlinux ELF image). I think because the kernel was not
>>>>> compressed. Indeed, looking to the others test, the kernel
>>>>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
>>>>> linux_start_boot). (I'll investigate better)
>>>>>
>>>> Yup being able to load an uncompressed kernel is one of the big
>>>> advantages of qemu-lite. I wonder if we could bring that feature into
>>>> qemu itself to supplement the existing firmware based kernel loading.
>>> I think so, I'll try to understand if we can merge the qemu-lite
>>> direct kernel loading in qemu.
>> An attempt was made a long time ago to push the qemu-lite stuff (from the
>> Intel Clear Containers project) upstream. As I understand it, the main
>> stumbling block that seemed to derail the effort was that it involved adding
>> Linux OS specific code to Qemu so that Qemu could do things like create and
>> populate the zero page that Linux expects when entering startup_64().
>>
> >> That ends up being a lot of very low-level, operating-system-specific knowledge
>> about Linux that ends up getting baked into Qemu code. And understandably, a
>> number of folks saw problems with going down a path like that.
>>
>> Since then, we have put together an alternative solution that would allow
>> Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
>> (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
>> first making changes to both the ABI as well as Linux, and then updating
>> Qemu to take advantage of the updated ABI which is already supported by both
> >> Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
>> and just be programmed to the published ABI.
>>
>> The canonical definition for the HVM direct boot ABI is in the Xen tree and
>> we needed to make some minor changes to the ABI definition to allow KVM
>> guests to also use the same structure and entry point. Those changes were
>> accepted to the Xen tree already:
>> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
>>
>> The corresponding Linux changes that would allow KVM guests to be booted via
>> this PVH entry point have already been posted and reviewed:
>> https://lkml.org/lkml/2018/4/16/1002
>>
>> The final part is the set of Qemu changes to take advantage of the above and
>> boot a KVM guest via an uncompressed kernel binary using the entry point
>> defined by the ABI. Liam Merwick will be posting some RFC patches very soon
>> to allow this.
> Cool, thanks for doing this work!
>
> How do the boot times compare to qemu-lite and Firecracker's
> (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> boot?

Boot times compare very favorably to qemu-lite, since the end result is 
basically doing a very similar thing. For now, we are going with a QEMU 
+ qboot solution to introduce the PVH entry support in Qemu (meaning we 
will be posting Qemu and qboot patches and you will need both to boot an 
uncompressed kernel binary). As such we have numbers that Liam will 
include in the cover letter showing significant boot time improvement 
over existing QEMU + qboot approaches involving a compressed kernel 
binary. And as we all know, the existing qboot approach already gets 
boot times down pretty low.
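
For reference, the ABI surface involved is small: the guest is entered in
32-bit mode with %ebx holding the physical address of a small start_info
structure, roughly the following (paraphrased from the canonical header in
the Xen tree that the link above points to, which remains the authoritative
definition):

  #include <stdint.h>

  #define XEN_HVM_START_MAGIC_VALUE 0x336ec578u

  /* Paraphrase of struct hvm_start_info (version 0 fields); see
   * xen/include/public/arch-x86/hvm/start_info.h for the real thing
   * and the version-1 additions (memory map). */
  struct hvm_start_info {
      uint32_t magic;          /* must be XEN_HVM_START_MAGIC_VALUE     */
      uint32_t version;        /* structure version                     */
      uint32_t flags;          /* SIF_xxx flags                         */
      uint32_t nr_modules;     /* modules (e.g. initrd) passed in       */
      uint64_t modlist_paddr;  /* physical address of the module list   */
      uint64_t cmdline_paddr;  /* physical address of the command line  */
      uint64_t rsdp_paddr;     /* physical address of the ACPI RSDP     */
  };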

Once the patches have been posted (soon) it would be great if some other 
folks could pick them up and run your own numbers on various test setups 
and comparisons you already have.

I haven't tried Firecracker, specifically. It would be good to see a 
comparison just so we know where we stand, but it's not terribly 
relevant to folks who want to continue using Qemu right? Meaning Qemu 
(and all solutions built on it like kata) still needs a solution for 
improving boot time regardless of what NEMU and Firecracker are doing.

And from what I've read so far, Firecracker only supports Linux guests.
So one could arguably just bake in all sorts of Linux-specific knowledge
into it and have it lay things out, like the zero page, right in the VMM
code, right?

I don't know off-hand, but is that how Firecracker boots an uncompressed 
Linux kernel? Anyone know?

Thanks,
-Maran

> I'm asking because there are several custom approaches to fast kernel
> boot and we should make sure that whatever Linux and QEMU end up
> natively supporting is likely to work for all projects (NEMU, qemu-lite,
> Firecracker) and operating systems (Linux distros, other OSes).
>
> Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-05 18:04               ` Maran Wilson
@ 2018-12-06 10:38                 ` Stefan Hajnoczi
  2018-12-06 14:47                   ` Maran Wilson
  2018-12-10 13:46                 ` Stefano Garzarella
  1 sibling, 1 reply; 18+ messages in thread
From: Stefan Hajnoczi @ 2018-12-06 10:38 UTC (permalink / raw)
  To: Maran Wilson
  Cc: qemu-devel, Samuel Ortiz, Rob Bradford, Stefano Garzarella, Liam Merwick

On Wed, Dec 05, 2018 at 10:04:36AM -0800, Maran Wilson wrote:
> On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> > On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> > > On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> > > > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> > > > > Hi Stefano, thanks for capturing all these numbers,
> > > > > 
> > > > > On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> > > > > > Hi Rob,
> > > > > > I continued to investigate the boot time, and as you suggested I
> > > > > > looked also at qemu-lite 2.11.2
> > > > > > (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> > > > > > did the following tests using the Kata kernel configuration
> > > > > > (
> > > > > > https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> > > > > > )
> > > > > > 
> > > > > > To compare the results with qemu-lite direct kernel load, I added
> > > > > > another tracepoint:
> > > > > > - linux_start_kernel: first entry of the Linux kernel
> > > > > > (start_kernel())
> > > > > > 
> > > > > Great, do you have a set of patches available that add all these trace
> > > > > points? It would be great for reproduction.
> > > > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > > > nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
> > > > I'm also sharing a python script that I'm using with perf to extract
> > > > the numbers in this way:
> > > > 
> > > > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> > > > sched:sched_process_exec -o /tmp/qemu_perf.data &
> > > > $ # start qemu/nemu multiple times
> > > > $ killall perf
> > > > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> > > > 
> > > > > > As you can see, NEMU is faster to jump to the kernel
> > > > > > (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> > > > > > virt support, but the time to user space is strangely high; maybe
> > > > > > the kernel configuration that I used is not the best one.
> > > > > > Do you suggest another kernel configuration?
> > > > > > 
> > > > > This looks very bad. This isn't the kernel configuration we normally
> > > > > test with in our automated test system but is definitely one we support
> > > > > as part of our partnership with the Kata team. It's a high priority
> > > > > for me to try and investigate that. Have you saved the kernel messages
> > > > > as they might be helpful?
> > > > Yes, I'm attaching the dmesg output with nemu and qemu.
> > > > 
> > > > > > Anyway, I obtained the best boot time with qemu-lite and direct
> > > > > > kernel
> > > > > > load (vmlinux ELF image). I think because the kernel was not
> > > > > > compressed. Indeed, looking at the other tests, the kernel
> > > > > > decompression (bzImage) takes about 80 ms (linux_start_kernel -
> > > > > > linux_start_boot). (I'll investigate better)
> > > > > > 
> > > > > Yup being able to load an uncompressed kernel is one of the big
> > > > > advantages of qemu-lite. I wonder if we could bring that feature into
> > > > > qemu itself to supplement the existing firmware based kernel loading.
> > > > I think so, I'll try to understand if we can merge the qemu-lite
> > > > direct kernel loading in qemu.
> > > An attempt was made a long time ago to push the qemu-lite stuff (from the
> > > Intel Clear Containers project) upstream. As I understand it, the main
> > > stumbling block that seemed to derail the effort was that it involved adding
> > > Linux OS specific code to Qemu so that Qemu could do things like create and
> > > populate the zero page that Linux expects when entering startup_64().
> > > 
> > > That ends up being a lot of very low-level, operating-system-specific knowledge
> > > about Linux that ends up getting baked into Qemu code. And understandably, a
> > > number of folks saw problems with going down a path like that.
> > > 
> > > Since then, we have put together an alternative solution that would allow
> > > Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
> > > (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
> > > first making changes to both the ABI as well as Linux, and then updating
> > > Qemu to take advantage of the updated ABI which is already supported by both
> > > Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
> > > and just be programmed to the published ABI.
> > > 
> > > The canonical definition for the HVM direct boot ABI is in the Xen tree and
> > > we needed to make some minor changes to the ABI definition to allow KVM
> > > guests to also use the same structure and entry point. Those changes were
> > > accepted to the Xen tree already:
> > > https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> > > 
> > > The corresponding Linux changes that would allow KVM guests to be booted via
> > > this PVH entry point have already been posted and reviewed:
> > > https://lkml.org/lkml/2018/4/16/1002
> > > 
> > > The final part is the set of Qemu changes to take advantage of the above and
> > > boot a KVM guest via an uncompressed kernel binary using the entry point
> > > defined by the ABI. Liam Merwick will be posting some RFC patches very soon
> > > to allow this.
> > Cool, thanks for doing this work!
> > 
> > How do the boot times compare to qemu-lite and Firecracker's
> > (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> > boot?
> 
> Boot times compare very favorably to qemu-lite, since the end result is
> basically doing a very similar thing. For now, we are going with a QEMU +
> qboot solution to introduce the PVH entry support in Qemu (meaning we will
> be posting Qemu and qboot patches and you will need both to boot an
> uncompressed kernel binary). As such we have numbers that Liam will include
> in the cover letter showing significant boot time improvement over existing
> QEMU + qboot approaches involving a compressed kernel binary. And as we all
> know, the existing qboot approach already gets boot times down pretty low.

The first email in this thread contains benchmark results showing that
optimized SeaBIOS is comparable to qboot, so qboot does not offer anything
unique with respect to boot time.

We're trying to focus on SeaBIOS because it's actively maintained and
already shipped by distros.  Relying on qboot will make it harder to get
PVH into the hands of users because distros have to package and ship
qboot first.  This might also require users to change their QEMU
command-line syntax to benefit from fast kernel booting.

I would strongly recommend looking at the SeaBIOS -kernel approach and
avoiding QEMU command-line changes.  That way -kernel becomes fast for
users as soon as they upgrade their QEMU without requiring configuration
changes.

If you have questions about how the -kernel boot works with SeaBIOS,
Stefano can help explain it and share details of his
development/benchmarking environment (see also earlier mails in this
email thread).

> Once the patches have been posted (soon) it would be great if some other
> folks could pick them up and run your own numbers on various test setups and
> comparisons you already have.
> 
> I haven't tried Firecracker, specifically. It would be good to see a
> comparison just so we know where we stand, but it's not terribly relevant to
> folks who want to continue using Qemu right? Meaning Qemu (and all solutions
> built on it like kata) still needs a solution for improving boot time
> regardless of what NEMU and Firecracker are doing.

Right.  Collaboration with Firecracker is more in the interest of
avoiding duplication and making it easy for users to fast boot a single
kernel (thanks to a common ABI like PVH) on any hypervisor.

Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-06 10:38                 ` Stefan Hajnoczi
@ 2018-12-06 14:47                   ` Maran Wilson
  2018-12-07 10:02                     ` Stefan Hajnoczi
  0 siblings, 1 reply; 18+ messages in thread
From: Maran Wilson @ 2018-12-06 14:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Stefano Garzarella, qemu-devel, Rob Bradford, Samuel Ortiz

On 12/6/2018 2:38 AM, Stefan Hajnoczi wrote:
> On Wed, Dec 05, 2018 at 10:04:36AM -0800, Maran Wilson wrote:
>> On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
>>> On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
>>>> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
>>>>> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
>>>>>> Hi Stefano, thanks for capturing all these numbers,
>>>>>>
>>>>>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
>>>>>>> Hi Rob,
>>>>>>> I continued to investigate the boot time, and as you suggested I
>>>>>>> looked also at qemu-lite 2.11.2
>>>>>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
>>>>>>> did the following tests using the Kata kernel configuration
>>>>>>> (
>>>>>>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
>>>>>>> )
>>>>>>>
>>>>>>> To compare the results with qemu-lite direct kernel load, I added
>>>>>>> another tracepoint:
>>>>>>> - linux_start_kernel: first entry of the Linux kernel
>>>>>>> (start_kernel())
>>>>>>>
> >>>>>> Great, do you have a set of patches available that add all these trace
> >>>>>> points? It would be great for reproduction.
> >>>>> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> >>>>> nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
>>>>> I'm also sharing a python script that I'm using with perf to extract
>>>>> the numbers in this way:
>>>>>
>>>>> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
>>>>> sched:sched_process_exec -o /tmp/qemu_perf.data &
>>>>> $ # start qemu/nemu multiple times
>>>>> $ killall perf
>>>>> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
>>>>>
>>>>>>> As you can see, NEMU is faster to jump to the kernel
> >>>>>>> (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> >>>>>>> virt support, but the time to user space is strangely high; maybe
>>>>>>> the kernel configuration that I used is not the best one.
>>>>>>> Do you suggest another kernel configuration?
>>>>>>>
>>>>>> This looks very bad. This isn't the kernel configuration we normally
>>>>>> test with in our automated test system but is definitely one we support
> >>>>>> as part of our partnership with the Kata team. It's a high priority
>>>>>> for me to try and investigate that. Have you saved the kernel messages
>>>>>> as they might be helpful?
>>>>> Yes, I'm attaching the dmesg output with nemu and qemu.
>>>>>
>>>>>>> Anyway, I obtained the best boot time with qemu-lite and direct
>>>>>>> kernel
>>>>>>> load (vmlinux ELF image). I think because the kernel was not
> >>>>>>> compressed. Indeed, looking at the other tests, the kernel
>>>>>>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
>>>>>>> linux_start_boot). (I'll investigate better)
>>>>>>>
>>>>>> Yup being able to load an uncompressed kernel is one of the big
>>>>>> advantages of qemu-lite. I wonder if we could bring that feature into
>>>>>> qemu itself to supplement the existing firmware based kernel loading.
>>>>> I think so, I'll try to understand if we can merge the qemu-lite
>>>>> direct kernel loading in qemu.
>>>> An attempt was made a long time ago to push the qemu-lite stuff (from the
>>>> Intel Clear Containers project) upstream. As I understand it, the main
>>>> stumbling block that seemed to derail the effort was that it involved adding
>>>> Linux OS specific code to Qemu so that Qemu could do things like create and
>>>> populate the zero page that Linux expects when entering startup_64().
>>>>
> >>>> That ends up being a lot of very low-level, operating-system-specific knowledge
>>>> about Linux that ends up getting baked into Qemu code. And understandably, a
>>>> number of folks saw problems with going down a path like that.
>>>>
>>>> Since then, we have put together an alternative solution that would allow
>>>> Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
>>>> (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
>>>> first making changes to both the ABI as well as Linux, and then updating
>>>> Qemu to take advantage of the updated ABI which is already supported by both
> >>>> Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
>>>> and just be programmed to the published ABI.
>>>>
>>>> The canonical definition for the HVM direct boot ABI is in the Xen tree and
>>>> we needed to make some minor changes to the ABI definition to allow KVM
>>>> guests to also use the same structure and entry point. Those changes were
>>>> accepted to the Xen tree already:
>>>> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
>>>>
>>>> The corresponding Linux changes that would allow KVM guests to be booted via
>>>> this PVH entry point have already been posted and reviewed:
>>>> https://lkml.org/lkml/2018/4/16/1002
>>>>
>>>> The final part is the set of Qemu changes to take advantage of the above and
>>>> boot a KVM guest via an uncompressed kernel binary using the entry point
>>>> defined by the ABI. Liam Merwick will be posting some RFC patches very soon
>>>> to allow this.
>>> Cool, thanks for doing this work!
>>>
>>> How do the boot times compare to qemu-lite and Firecracker's
>>> (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
>>> boot?
>> Boot times compare very favorably to qemu-lite, since the end result is
>> basically doing a very similar thing. For now, we are going with a QEMU +
>> qboot solution to introduce the PVH entry support in Qemu (meaning we will
>> be posting Qemu and qboot patches and you will need both to boot an
>> uncompressed kernel binary). As such we have numbers that Liam will include
>> in the cover letter showing significant boot time improvement over existing
>> QEMU + qboot approaches involving a compressed kernel binary. And as we all
>> know, the existing qboot approach already gets boot times down pretty low.
> The first email in this thread contains benchmark results showing that
> optimized SeaBIOS is comparable to qboot, so qboot does not offer anything
> unique with respect to boot time.

To be fair, what I'm saying is that the qboot + PVH approach saves a 
significant percentage of boot time as compared to qboot only. So it 
does provide an important improvement over both existing qboot as well 
as optimized SeaBIOS from what I can tell. Please see:

http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00957.html
and
http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00953.html

> We're trying to focus on SeaBIOS because it's actively maintained and
> already shipped by distros.  Relying on qboot will make it harder to get
> PVH into the hands of users because distros have to package and ship
> qboot first.  This might also require users to change their QEMU
> command-line syntax to benefit from fast kernel booting.

But you do make a good point here about distribution and usability. 
Using qboot is just one way to take advantage of the PVH entry -- and 
the quickest way for us to get something usable out there for the 
community to look at and play with.

There are other ways to take advantage of the PVH entry for KVM guests, 
once the Linux changes are in place. So qboot is definitely not a hard 
requirement in the long run.

Thanks,
-Maran

> I would strongly recommend looking at the SeaBIOS -kernel approach and
> avoiding QEMU command-line changes.  That way -kernel becomes fast for
> users as soon as they upgrade their QEMU without requiring configuration
> changes.
>
> If you have questions about how the -kernel boot works with SeaBIOS,
> Stefano can help explain it and share details of his
> development/benchmarking environment (see also earlier mails in this
> email thread).
>
>> Once the patches have been posted (soon) it would be great if some other
>> folks could pick them up and run your own numbers on various test setups and
>> comparisons you already have.
>>
>> I haven't tried Firecracker, specifically. It would be good to see a
>> comparison just so we know where we stand, but it's not terribly relevant to
>> folks who want to continue using Qemu right? Meaning Qemu (and all solutions
>> built on it like kata) still needs a solution for improving boot time
>> regardless of what NEMU and Firecracker are doing.
> Right.  Collaboration with Firecracker is more in the interest of
> avoiding duplication and making it easy for users to fast boot a single
> kernel (thanks to a common ABI like PVH) on any hypervisor.
>
> Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-06 14:47                   ` Maran Wilson
@ 2018-12-07 10:02                     ` Stefan Hajnoczi
  0 siblings, 0 replies; 18+ messages in thread
From: Stefan Hajnoczi @ 2018-12-07 10:02 UTC (permalink / raw)
  To: Maran Wilson; +Cc: Stefano Garzarella, qemu-devel, Rob Bradford, Samuel Ortiz

On Thu, Dec 06, 2018 at 06:47:54AM -0800, Maran Wilson wrote:
> On 12/6/2018 2:38 AM, Stefan Hajnoczi wrote:
> > On Wed, Dec 05, 2018 at 10:04:36AM -0800, Maran Wilson wrote:
> > > On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> > > > On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> > > > > On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> > > > > > On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> > > > > > > Hi Stefano, thanks for capturing all these numbers,
> > > > > > > 
> > > > > > > On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> > > > > > > > Hi Rob,
> > > > > > > > I continued to investigate the boot time, and as you suggested I
> > > > > > > > looked also at qemu-lite 2.11.2
> > > > > > > > (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> > > > > > > > did the following tests using the Kata kernel configuration
> > > > > > > > (
> > > > > > > > https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> > > > > > > > )
> > > > > > > > 
> > > > > > > > To compare the results with qemu-lite direct kernel load, I added
> > > > > > > > another tracepoint:
> > > > > > > > - linux_start_kernel: first entry of the Linux kernel
> > > > > > > > (start_kernel())
> > > > > > > > 
> > > > > > > Great, do you have a set of patches available that add all these trace
> > > > > > > points? It would be great for reproduction.
> > > > > > For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> > > > > > nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
> > > > > > I'm also sharing a python script that I'm using with perf to extract
> > > > > > the numbers in this way:
> > > > > > 
> > > > > > $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> > > > > > sched:sched_process_exec -o /tmp/qemu_perf.data &
> > > > > > $ # start qemu/nemu multiple times
> > > > > > $ killall perf
> > > > > > $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> > > > > > 
> > > > > > > > As you can see, NEMU is faster to jump to the kernel
> > > > > > > > (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> > > > > > > > virt support, but the time to user space is strangely high; maybe
> > > > > > > > the kernel configuration that I used is not the best one.
> > > > > > > > Do you suggest another kernel configuration?
> > > > > > > > 
> > > > > > > This looks very bad. This isn't the kernel configuration we normally
> > > > > > > test with in our automated test system but is definitely one we support
> > > > > > > as part of our partnership with the Kata team. It's a high priority
> > > > > > > for me to try and investigate that. Have you saved the kernel messages
> > > > > > > as they might be helpful?
> > > > > > Yes, I'm attaching the dmesg output with nemu and qemu.
> > > > > > 
> > > > > > > > Anyway, I obtained the best boot time with qemu-lite and direct
> > > > > > > > kernel
> > > > > > > > load (vmlinux ELF image). I think because the kernel was not
> > > > > > > > compressed. Indeed, looking at the other tests, the kernel
> > > > > > > > decompression (bzImage) takes about 80 ms (linux_start_kernel -
> > > > > > > > linux_start_boot). (I'll investigate better)
> > > > > > > > 
> > > > > > > Yup being able to load an uncompressed kernel is one of the big
> > > > > > > advantages of qemu-lite. I wonder if we could bring that feature into
> > > > > > > qemu itself to supplement the existing firmware based kernel loading.
> > > > > > I think so, I'll try to understand if we can merge the qemu-lite
> > > > > > direct kernel loading in qemu.
> > > > > An attempt was made a long time ago to push the qemu-lite stuff (from the
> > > > > Intel Clear Containers project) upstream. As I understand it, the main
> > > > > stumbling block that seemed to derail the effort was that it involved adding
> > > > > Linux OS specific code to Qemu so that Qemu could do things like create and
> > > > > populate the zero page that Linux expects when entering startup_64().
> > > > > 
> > > > > That ends up being a lot of very low-level, operating-system-specific knowledge
> > > > > about Linux that ends up getting baked into Qemu code. And understandably, a
> > > > > number of folks saw problems with going down a path like that.
> > > > > 
> > > > > Since then, we have put together an alternative solution that would allow
> > > > > Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
> > > > > (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
> > > > > first making changes to both the ABI as well as Linux, and then updating
> > > > > Qemu to take advantage of the updated ABI which is already supported by both
> > > > > Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
> > > > > and just be programmed to the published ABI.
> > > > > 
> > > > > The canonical definition for the HVM direct boot ABI is in the Xen tree and
> > > > > we needed to make some minor changes to the ABI definition to allow KVM
> > > > > guests to also use the same structure and entry point. Those changes were
> > > > > accepted to the Xen tree already:
> > > > > https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> > > > > 
> > > > > The corresponding Linux changes that would allow KVM guests to be booted via
> > > > > this PVH entry point have already been posted and reviewed:
> > > > > https://lkml.org/lkml/2018/4/16/1002
> > > > > 
> > > > > The final part is the set of Qemu changes to take advantage of the above and
> > > > > boot a KVM guest via an uncompressed kernel binary using the entry point
> > > > > defined by the ABI. Liam Merwick will be posting some RFC patches very soon
> > > > > to allow this.
> > > > Cool, thanks for doing this work!
> > > > 
> > > > How do the boot times compare to qemu-lite and Firecracker's
> > > > (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> > > > boot?
> > > Boot times compare very favorably to qemu-lite, since the end result is
> > > basically doing a very similar thing. For now, we are going with a QEMU +
> > > qboot solution to introduce the PVH entry support in Qemu (meaning we will
> > > be posting Qemu and qboot patches and you will need both to boot an
> > > uncompressed kernel binary). As such we have numbers that Liam will include
> > > in the cover letter showing significant boot time improvement over existing
> > > QEMU + qboot approaches involving a compressed kernel binary. And as we all
> > > know, the existing qboot approach already gets boot times down pretty low.
> > The first email in this thread contains benchmark results showing that
> > optimized SeaBIOS is comparable to qboot, so qboot does not offer anything
> > unique with respect to boot time.
> 
> To be fair, what I'm saying is that the qboot + PVH approach saves a
> significant percentage of boot time as compared to qboot only. So it does
> provide an important improvement over both existing qboot as well as
> optimized SeaBIOS from what I can tell. Please see:
> 
> http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00957.html
> and
> http://lists.nongnu.org/archive/html/qemu-devel/2018-12/msg00953.html
> 
> > We're trying to focus on SeaBIOS because it's actively maintained and
> > already shipped by distros.  Relying on qboot will make it harder to get
> > PVH into the hands of users because distros have to package and ship
> > qboot first.  This might also require users to change their QEMU
> > command-line syntax to benefit from fast kernel booting.
> 
> But you do make a good point here about distribution and usability. Using
> qboot is just one way to take advantage of the PVH entry -- and the quickest
> way for us to get something usable out there for the community to look at
> and play with.
> 
> There are other ways to take advantage of the PVH entry for KVM guests, once
> the Linux changes are in place. So qboot is definitely not a hard
> requirement in the long run.

Great, good to hear!

Stefan

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-05 18:04               ` Maran Wilson
  2018-12-06 10:38                 ` Stefan Hajnoczi
@ 2018-12-10 13:46                 ` Stefano Garzarella
  1 sibling, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-10 13:46 UTC (permalink / raw)
  To: maran.wilson
  Cc: Stefan Hajnoczi, qemu-devel, Samuel Ortiz, Rob Bradford, liam.merwick

Hi Maran,

On Wed, Dec 5, 2018 at 7:04 PM Maran Wilson <maran.wilson@oracle.com> wrote:
>
> On 12/5/2018 5:20 AM, Stefan Hajnoczi wrote:
> > On Tue, Dec 04, 2018 at 02:44:33PM -0800, Maran Wilson wrote:
> >> On 12/3/2018 8:35 AM, Stefano Garzarella wrote:
> >>> On Mon, Dec 3, 2018 at 4:44 PM Rob Bradford <robert.bradford@intel.com> wrote:
> >>>> Hi Stefano, thanks for capturing all these numbers,
> >>>>
> >>>> On Mon, 2018-12-03 at 15:27 +0100, Stefano Garzarella wrote:
> >>>>> Hi Rob,
> >>>>> I continued to investigate the boot time, and as you suggested I
> >>>>> looked also at qemu-lite 2.11.2
> >>>>> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> >>>>> did the following tests using the Kata kernel configuration
> >>>>> (
> >>>>> https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x
> >>>>> )
> >>>>>
> >>>>> To compare the results with qemu-lite direct kernel load, I added
> >>>>> another tracepoint:
> >>>>> - linux_start_kernel: first entry of the Linux kernel
> >>>>> (start_kernel())
> >>>>>
> >>>> Great, do you have a set of patches available that add all these trace
> >>>> points? It would be great for reproduction.
> >>> For sure! I'm attaching a set of patches for qboot, seabios, ovmf,
> >>> nemu/qemu/qemu-lite and linux 4.14 with the tracepoints.
> >>> I'm also sharing a python script that I'm using with perf to extract
> >>> the numbers in this way:
> >>>
> >>> $ perf record -a -e kvm:kvm_entry -e kvm:kvm_pio -e
> >>> sched:sched_process_exec -o /tmp/qemu_perf.data &
> >>> $ # start qemu/nemu multiple times
> >>> $ killall perf
> >>> $ perf script -s qemu-perf-script.py -i /tmp/qemu_perf.data
> >>>
> >>>>> As you can see, NEMU is faster to jump to the kernel
> >>>>> (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> >>>>> virt support, but the time to user space is strangely high; maybe
> >>>>> the kernel configuration that I used is not the best one.
> >>>>> Do you suggest another kernel configuration?
> >>>>>
> >>>> This looks very bad. This isn't the kernel configuration we normally
> >>>> test with in our automated test system but is definitely one we support
> >>>> as part of our partnership with the Kata team. It's a high priority
> >>>> for me to try and investigate that. Have you saved the kernel messages
> >>>> as they might be helpful?
> >>> Yes, I'm attaching the dmesg output with nemu and qemu.
> >>>
> >>>>> Anyway, I obtained the best boot time with qemu-lite and direct
> >>>>> kernel
> >>>>> load (vmlinux ELF image). I think because the kernel was not
> >>>>> compressed. Indeed, looking at the other tests, the kernel
> >>>>> decompression (bzImage) takes about 80 ms (linux_start_kernel -
> >>>>> linux_start_boot). (I'll investigate better)
> >>>>>
> >>>> Yup being able to load an uncompressed kernel is one of the big
> >>>> advantages of qemu-lite. I wonder if we could bring that feature into
> >>>> qemu itself to supplement the existing firmware based kernel loading.
> >>> I think so, I'll try to understand if we can merge the qemu-lite
> >>> direct kernel loading in qemu.
> >> An attempt was made a long time ago to push the qemu-lite stuff (from the
> >> Intel Clear Containers project) upstream. As I understand it, the main
> >> stumbling block that seemed to derail the effort was that it involved adding
> >> Linux OS specific code to Qemu so that Qemu could do things like create and
> >> populate the zero page that Linux expects when entering startup_64().
> >>
> >> That ends up being a lot of very low-level, operating-system-specific knowledge
> >> about Linux that ends up getting baked into Qemu code. And understandably, a
> >> number of folks saw problems with going down a path like that.
> >>
> >> Since then, we have put together an alternative solution that would allow
> >> Qemu to boot an uncompressed Linux binary via the x86/HVM direct boot ABI
> >> (https://xenbits.xen.org/docs/unstable/misc/pvh.html). The solution involves
> >> first making changes to both the ABI as well as Linux, and then updating
> >> Qemu to take advantage of the updated ABI which is already supported by both
> >> Linux and FreeBSD for booting VMs. As such, Qemu can remain OS agnostic,
> >> and just be programmed to the published ABI.
> >>
> >> The canonical definition for the HVM direct boot ABI is in the Xen tree and
> >> we needed to make some minor changes to the ABI definition to allow KVM
> >> guests to also use the same structure and entry point. Those changes were
> >> accepted to the Xen tree already:
> >> https://lists.xenproject.org/archives/html/xen-devel/2018-04/msg00057.html
> >>
> >> The corresponding Linux changes that would allow KVM guests to be booted via
> >> this PVH entry point have already been posted and reviewed:
> >> https://lkml.org/lkml/2018/4/16/1002
> >>
> >> The final part is the set of Qemu changes to take advantage of the above and
> >> boot a KVM guest via an uncompressed kernel binary using the entry point
> >> defined by the ABI. Liam Merwick will be posting some RFC patches very soon
> >> to allow this.
> > Cool, thanks for doing this work!
> >
> > How do the boot times compare to qemu-lite and Firecracker's
> > (https://github.com/firecracker-microvm/firecracker/) direct vmlinux ELF
> > boot?
>
> Boot times compare very favorably to qemu-lite, since the end result is
> basically doing a very similar thing. For now, we are going with a QEMU
> + qboot solution to introduce the PVH entry support in Qemu (meaning we
> will be posting Qemu and qboot patches and you will need both to boot an
> uncompressed kernel binary). As such we have numbers that Liam will
> include in the cover letter showing significant boot time improvement
> over existing QEMU + qboot approaches involving a compressed kernel
> binary. And as we all know, the existing qboot approach already gets
> boot times down pretty low.
>
> Once the patches have been posted (soon) it would be great if some other
> folks could pick them up and run your own numbers on various test setups
> and comparisons you already have.
>
> I haven't tried Firecracker, specifically. It would be good to see a
> comparison just so we know where we stand, but it's not terribly
> relevant to folks who want to continue using Qemu right? Meaning Qemu
> (and all solutions built on it like kata) still needs a solution for
> improving boot time regardless of what NEMU and Firecracker are doing.
>
> And from what I've read so far, Firecracker only supports Linux guests.
> So one could arguably just bake in all sorts of Linux-specific knowledge
> into it and have it lay things out, like the zero page, right in the VMM
> code, right?

Yes, you are right!

>
> I don't know off-hand, but is that how Firecracker boots an uncompressed
> Linux kernel? Anyone know?

I'm looking into Firecracker and they use the same approach as qemu-lite
to load the Linux kernel (a rough sketch of step 3 follows the list):
1. load ELF image (vmlinux)
2. setup zero page in VMM code (eg. command line)
3. setup VM registers  (eg. ESI = zero page address, EIP = ELF entry_point, etc)
4. start VM (ELF entry_point = phys_startup_64)
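
A minimal, illustrative sketch of step 3 with the KVM API (it assumes the
vCPU fd already exists, the ELF segments and the zero page are already in
guest memory, and it omits the KVM_SET_SREGS long-mode/segment setup; the
addresses are made up for the example):

  #include <linux/kvm.h>
  #include <sys/ioctl.h>
  #include <stdint.h>

  /* Illustrative guest-physical addresses chosen for this sketch. */
  #define ZERO_PAGE_GPA    0x7000ULL      /* where the VMM built boot_params */
  #define KERNEL_ENTRY_GPA 0x1000000ULL   /* phys_startup_64 from the ELF    */

  /* Point the vCPU at the uncompressed kernel's 64-bit entry point, with
   * %rsi holding the zero page address as the Linux boot protocol expects. */
  static int set_boot_regs(int vcpu_fd)
  {
      struct kvm_regs regs;

      if (ioctl(vcpu_fd, KVM_GET_REGS, &regs) < 0)
          return -1;

      regs.rflags = 0x2;               /* reserved bit 1 must stay set */
      regs.rip    = KERNEL_ENTRY_GPA;
      regs.rsi    = ZERO_PAGE_GPA;

      return ioctl(vcpu_fd, KVM_SET_REGS, &regs);
  }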

Cheers,
Stefano

>
> Thanks,
> -Maran
>
> > I'm asking because there are several custom approaches to fast kernel
> > boot and we should make sure that whatever Linux and QEMU end up
> > natively supporting is likely to work for all projects (NEMU, qemu-lite,
> > Firecracker) and operating systems (Linux distros, other OSes).
> >
> > Stefan
>


-- 
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 18+ messages in thread

* Re: [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares
  2018-12-03 14:27     ` Stefano Garzarella
  2018-12-03 15:44       ` Rob Bradford
@ 2018-12-13 11:19       ` Stefano Garzarella
  1 sibling, 0 replies; 18+ messages in thread
From: Stefano Garzarella @ 2018-12-13 11:19 UTC (permalink / raw)
  To: qemu-devel
  Cc: Samuel Ortiz, Rob Bradford, Stefan Hajnoczi, liam.merwick,
	maran.wilson, Philippe Mathieu Daude

Hi all,
this is an update on the QEMU/NEMU boot times.

After the preliminary results, Rob found the reason for the NEMU delays:
they were related to the PIT and the PCI initialization. (He sent me a
temporary hack to fix the issues.)
Afterward, he discovered that there is a huge difference between the
4.14 and 4.19 Linux kernels. (The early TSC-related code that uses the
PIT appears to have disappeared.)

For this reason, I did the same tests (without OVMF) using the Linux
kernel 4.19:
* QEMU "pc" machine
  + qboot
    qemu_init_end: 40.357177
    fw_start: 40.533092 (+0.175915)
    fw_do_boot: 45.713896 (+5.180804)
    linux_start_boot: 47.451011 (+1.737115)
    linux_start_kernel: 131.372861 (+83.92185)
    linux_start_user: 279.818064 (+148.445203)
  + seabios
    qemu_init_end: 40.551679
    fw_start: 40.757828 (+0.206149)
    fw_do_boot: 52.688499 (+11.930671)
    linux_start_boot: 53.984453 (+1.295954)
    linux_start_kernel: 136.481648 (+82.497195)
    linux_start_user: 289.62459 (+153.142942)

* NEMU "virt" machine
  + qboot
    qemu_init_end: 29.870696
    fw_start: 30.092108 (+0.221412)
    fw_do_boot: 31.514079 (+1.421971)
    linux_start_boot: 32.9857 (+1.471621)
    linux_start_kernel: 116.546745 (+83.561045)
    linux_start_user: 241.254154 (+124.707409)
  + seabios
    qemu_init_end: 29.940350
    fw_start: 30.128961 (+0.188611)
    fw_do_boot: 51.469139 (+21.340178)
    linux_start_boot: 52.743141 (+1.274002)
    linux_start_kernel: 135.128856 (+82.385715)
    linux_start_user: 259.859471 (+124.730615)

* Qemu-lite
  + no FW (-machine pc,...,fw=off KERNEL=/path/to/vmlinux)
    qemu_init_end: 32.415627
    linux_start_kernel: 32.692544 (+0.276917)
    linux_start_user: 163.238889 (+130.546345)
  + qboot
    qemu_init_end: 25.840606
    fw_start: 25.990774 (+0.150168)
    fw_do_boot: 31.400927 (+5.410153)
    linux_start_boot: 33.084469 (+1.683542)
    linux_start_kernel: 115.441407 (+82.356938)
    linux_start_user: 249.881718 (+134.440311)
  + seabios
    qemu_init_end: 26.677585
    fw_start: 26.900763 (+0.223178)
    fw_do_boot: 39.478128 (+12.577365)
    linux_start_boot: 40.870536 (+1.392408)
    linux_start_kernel: 123.600073 (+82.729537)
    linux_start_user: 264.844896 (+141.244823)

To have another reference point, I also tried the QEMU q35 machine,
disabling some buses and devices (-machine
q35,...,sata=off,smbus=off,vmport=off -no-hpet), obtaining very good
results in terms of boot time:
* QEMU "q35" machine
  + qboot
    qemu_init_end: 36.667717
    fw_start: 36.847075 (+0.179358)
    fw_do_boot: 41.794749 (+4.947674)
    linux_start_boot: 43.445767 (+1.651018)
    linux_start_kernel: 126.341309 (+82.895542)
    linux_start_user: 255.269846 (+128.928537)
  + seabios
    qemu_init_end: 37.781986
    fw_start: 37.974142 (+0.192156)
    fw_do_boot: 49.079861 (+11.105719)
    linux_start_boot: 50.467159 (+1.387298)
    linux_start_kernel: 133.292707 (+82.825548)
    linux_start_user: 261.492556 (+128.199849)

NEMU is the fastest, but QEMU with the q35 machine is not bad either.
In the next few days, I'll do other tests with Xen PVH and also
Firecracker (they use the same approach as qemu-lite's "-machine
pc,...,fw=off").

Note: we added "pci=lastbus=0" to the cmdline to speed up the PCI
initialization.

Thanks,
Stefano

On Mon, Dec 3, 2018 at 3:27 PM Stefano Garzarella <sgarzare@redhat.com> wrote:
>
> Hi Rob,
> I continued to investigate the boot time, and as you suggested I
> looked also at qemu-lite 2.11.2
> (https://github.com/kata-containers/qemu) and NEMU "virt" machine. I
> did the following tests using the Kata kernel configuration
> (https://github.com/kata-containers/packaging/blob/master/kernel/configs/x86_64_kata_kvm_4.14.x)
>
> To compare the results with qemu-lite direct kernel load, I added
> another tracepoint:
> - linux_start_kernel: first entry of the Linux kernel (start_kernel())
>
> As you can see, NEMU is faster to jump to the kernel
> (linux_start_kernel) than qemu-lite when it uses qboot or seabios with
> virt support, but the time to user space is strangely high; maybe
> the kernel configuration that I used is not the best one.
> Do you suggest another kernel configuration?
>
> Anyway, I obtained the best boot time with qemu-lite and direct kernel
> load (vmlinux ELF image). I think because the kernel was not
> compressed. Indeed, looking at the other tests, the kernel
> decompression (bzImage) takes about 80 ms (linux_start_kernel -
> linux_start_boot). (I'll investigate better)
>
> * QEMU
>     + qboot
>       qemu_init_end: 41.306968
>       fw_start: 41.478601 (+0.171633)
>       fw_do_boot: 47.03688 (+5.558279)
>       linux_start_boot: 48.80918 (+1.7723)
>       linux_start_kernel: 128.474013 (+79.664833)
>       linux_start_user: 347.257097 (+218.783084)
>     + seabios
>       qemu_init_end: 40.703177
>       fw_start: 40.902842 (+0.199665)
>       fw_do_boot: 54.604878 (+13.702036)
>       linux_start_boot: 56.314063 (+1.709185)
>       linux_start_kernel: 136.942521 (+80.628458)
>       linux_start_user: 356.211221 (+219.2687)
>     + OVMF
>       qemu_init_end: 44.055712
>       fw_start: 168.094131 (+124.038419)
>       fw_do_boot: 322.222745 (+154.128614)
>       linux_start_boot: 323.809639 (+1.586894)
>       linux_start_kernel: 401.756773 (+77.947134)
>       linux_start_user: 604.601438 (+202.844665)
> * NEMU "virt" machine
>     + qboot
>       qemu_init_end: 27.732925
>       fw_start: 27.887483 (+0.154558)
>       fw_do_boot: 29.20226 (+1.314777)
>       linux_start_boot: 30.842672 (+1.640412)
>       linux_start_kernel: 111.639445 (+80.796773)
>       linux_start_user: 862.294876 (+750.655431)
>     + seabios
>       qemu_init_end: 28.203907
>       fw_start: 28.68285 (+0.478943)
>       fw_do_boot: 36.337377 (+7.654527)
>       linux_start_boot: 37.995622 (+1.658245)
>       linux_start_kernel: 118.11591 (+80.120288)
>       linux_start_user: 897.110477 (+778.994567)
>     + OVMF
>       qemu_init_end: 31.219852
>       fw_start: 153.304126 (+122.084274)
>       fw_do_boot: 183.69459 (+30.390464)
>       linux_start_boot: 185.296567 (+1.601977)
>       linux_start_kernel: 361.11524 (+175.818673)
>       linux_start_user: 1231.590077 (+870.474837)
> * Qemu-lite
>     + no FW (-machine pc,...,fw=off KERNEL=/path/to/vmlinux)
>       qemu_init_end: 32.243801
>       linux_start_kernel: 32.330323 (+0.086522)
>       linux_start_user: 208.742712 (+176.412389)
>     + qboot
>       qemu_init_end: 25.393047
>       fw_start: 25.545285 (+0.152238)
>       fw_do_boot: 31.073253 (+5.527968)
>       linux_start_boot: 33.139721 (+2.066468)
>       linux_start_kernel: 114.604384 (+81.464663)
>       linux_start_user: 287.93128 (+173.326896)
>     + seabios
>       qemu_init_end: 25.448283
>       fw_start: 25.628977 (+0.180694)
>       fw_do_boot: 38.551957 (+12.92298)
>       linux_start_boot: 40.220319 (+1.668362)
>       linux_start_kernel: 126.392991 (+86.172672)
>       linux_start_user: 308.807806 (+182.414815)
>     + OVMF
>       qemu_init_end: 27.616988
>       fw_start: 151.272286 (+123.655298)
>       fw_do_boot: 309.068926 (+157.79664)
>       linux_start_boot: 310.602074 (+1.533148)
>       linux_start_kernel: 387.83349 (+77.231416)
>       linux_start_user: 576.057453 (+188.223963)
>
> Testbed:
> * seabios: https://github.com/rbradford/seabios/tree/virt-x86 + patch
> to disable debug writes
> (https://mail.coreboot.org/pipermail/seabios/2018-December/012654.html)
> * qboot: https://github.com/rbradford/qboot/tree/virt-x86
> * OVMF: https://github.com/intel/ovmf-virt/tree/virt-x86
> * QEMU command: $QEMU -bios $BIOS -machine
> $MACHINE,accel=kvm,kernel_irqchip,nvdimm -cpu host -m
> 512,maxmem=1G,slots=1 -smp 1 -kernel $KERNEL -append 'root=/dev/pmem0
> ro console=hvc0' -vga none -display none -no-user-config -nodefaults
> -object memory-backend-file,id=mem0,share,mem-path=$ROOTFS,size=$ROOTFS_SIZE
> -device nvdimm,memdev=mem0,id=nv0 -chardev stdio,id=virtiocon0 -device
> virtio-serial -device virtconsole,chardev=virtiocon0
>
> Cheers,
> Stefano
>

-- 
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 18+ messages in thread

end of thread

Thread overview: 18+ messages
2018-11-26 16:40 [Qemu-devel] QEMU/NEMU boot time with several x86 firmwares Stefano Garzarella
2018-11-27  9:57 ` Rob Bradford
2018-11-27 14:21   ` Stefano Garzarella
2018-12-03 14:27     ` Stefano Garzarella
2018-12-03 15:44       ` Rob Bradford
2018-12-03 16:35         ` Stefano Garzarella
2018-12-04 22:44           ` Maran Wilson
2018-12-05 12:06             ` Stefano Garzarella
2018-12-05 13:20             ` Stefan Hajnoczi
2018-12-05 14:19               ` Boris Ostrovsky
2018-12-05 18:04               ` Maran Wilson
2018-12-06 10:38                 ` Stefan Hajnoczi
2018-12-06 14:47                   ` Maran Wilson
2018-12-07 10:02                     ` Stefan Hajnoczi
2018-12-10 13:46                 ` Stefano Garzarella
2018-12-05 12:26           ` Philippe Mathieu-Daudé
2018-12-05 16:23             ` Stefano Garzarella
2018-12-13 11:19       ` Stefano Garzarella
