* [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
@ 2019-07-02 12:11 Sergio Lopez
  2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
                   ` (8 more replies)
  0 siblings, 9 replies; 68+ messages in thread
From: Sergio Lopez @ 2019-07-02 12:11 UTC (permalink / raw)
  To: mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel
  Cc: qemu-devel, Sergio Lopez

Microvm is a machine type inspired by both NEMU and Firecracker, and
constructed after the machine model implemented by the latter.

Its main purpose is providing users a KVM-only machine type with fast
boot times, minimal attack surface (measured as the number of IO ports
and MMIO regions exposed to the Guest) and small footprint (especially
when combined with the ongoing QEMU modularization effort).

Normally, other than the device support provided by KVM itself,
microvm only supports virtio-mmio devices. Microvm also includes a
legacy mode, which adds an ISA bus with a 16550A serial port, useful
for being able to see the early boot kernel messages.

Microvm only supports booting PVH-enabled Linux ELF images. Booting
other PVH-enabled kernels may be possible, but due to the lack of ACPI
and firmware, we're relying on the command line for specifying the
location of the virtio-mmio transports. If there's an interest in
using this machine type with other kernels, we'll try to find some
kind of middle ground solution.

This is the list of the exposed IO ports and MMIO regions when running
in non-legacy mode:

address-space: memory
    00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio
    00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio
    00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio
    00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio
    00000000d0000800-00000000d00009ff (prio 0, i/o): virtio-mmio
    00000000d0000a00-00000000d0000bff (prio 0, i/o): virtio-mmio
    00000000d0000c00-00000000d0000dff (prio 0, i/o): virtio-mmio
    00000000d0000e00-00000000d0000fff (prio 0, i/o): virtio-mmio
    00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi

address-space: I/O
  0000000000000000-000000000000ffff (prio 0, i/o): io
    0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic
    0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit
    000000000000007e-000000000000007f (prio 0, i/o): kvmvapic
    00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic
    00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr
    00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr

A QEMU instance with the microvm machine type can be invoked this way:

 - Normal mode:

qemu-system-x86_64 -M microvm -m 512m -smp 2 \
 -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
 -nodefaults -no-user-config \
 -chardev pty,id=virtiocon0,server \
 -device virtio-serial-device \
 -device virtconsole,chardev=virtiocon0 \
 -drive id=test,file=test.img,format=raw,if=none \
 -device virtio-blk-device,drive=test \
 -netdev tap,id=tap0,script=no,downscript=no \
 -device virtio-net-device,netdev=tap0

 - Legacy mode:

qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \
 -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \
 -nodefaults -no-user-config \
 -drive id=test,file=test.img,format=raw,if=none \
 -device virtio-blk-device,drive=test \
 -netdev tap,id=tap0,script=no,downscript=no \
 -device virtio-net-device,netdev=tap0 \
 -serial stdio

Changelog:
v3:
  - Add initrd support (thanks Stefano).
v2:
  - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
  - Simplify machine definition (thanks Eduardo).
  - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
  - Add a patch to factorize PVH-related functions.
  - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).

Sergio Lopez (4):
  hw/virtio: Factorize virtio-mmio headers
  hw/i386: Add an Intel MPTable generator
  hw/i386: Factorize PVH related functions
  hw/i386: Introduce the microvm machine type

 default-configs/i386-softmmu.mak            |   1 +
 hw/i386/Kconfig                             |   4 +
 hw/i386/Makefile.objs                       |   2 +
 hw/i386/microvm.c                           | 550 ++++++++++++++++++++
 hw/i386/mptable.c                           | 156 ++++++
 hw/i386/pc.c                                | 120 +----
 hw/i386/pvh.c                               | 113 ++++
 hw/i386/pvh.h                               |  10 +
 hw/virtio/virtio-mmio.c                     |  35 +-
 hw/virtio/virtio-mmio.h                     |  60 +++
 include/hw/i386/microvm.h                   |  82 +++
 include/hw/i386/mptable.h                   |  36 ++
 include/standard-headers/linux/mpspec_def.h | 182 +++++++
 13 files changed, 1209 insertions(+), 142 deletions(-)
 create mode 100644 hw/i386/microvm.c
 create mode 100644 hw/i386/mptable.c
 create mode 100644 hw/i386/pvh.c
 create mode 100644 hw/i386/pvh.h
 create mode 100644 hw/virtio/virtio-mmio.h
 create mode 100644 include/hw/i386/microvm.h
 create mode 100644 include/hw/i386/mptable.h
 create mode 100644 include/standard-headers/linux/mpspec_def.h

-- 
2.21.0

^ permalink raw reply [flat|nested] 68+ messages in thread
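Editorial sketch (not part of the series): the cover letter says the guest learns the location of the virtio-mmio transports from the kernel command line. The helper below illustrates how the eight 512-byte regions in the non-legacy memory map could be turned into Linux "virtio_mmio.device=<size>@<baseaddr>:<irq>" parameters. The base address, region size and count are read off the memory map above; the function names and the irq_base argument are hypothetical, since the IRQ assignment is not stated in this message.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MMIO_BASE  UINT32_C(0xd0000000) /* first virtio-mmio region in the map */
#define MMIO_SIZE  UINT32_C(0x200)      /* each region spans 512 bytes */
#define MMIO_COUNT 8u                   /* eight regions are listed */

/* Guest-physical base address of transport number idx (0-based). */
static uint32_t virtio_mmio_base(unsigned idx)
{
    assert(idx < MMIO_COUNT);
    return MMIO_BASE + idx * MMIO_SIZE;
}

/*
 * Format one "virtio_mmio.device=<size>@<baseaddr>:<irq>" parameter.
 * irq_base is an assumption made for this sketch: transport idx is
 * given IRQ irq_base + idx. Returns what snprintf() returns.
 */
static int virtio_mmio_cmdline_arg(char *buf, size_t len,
                                   unsigned idx, unsigned irq_base)
{
    return snprintf(buf, len,
                    "virtio_mmio.device=%" PRIu32 "@0x%" PRIx32 ":%u",
                    MMIO_SIZE, virtio_mmio_base(idx), irq_base + idx);
}
```

Calling virtio_mmio_cmdline_arg() once per transport and appending the results to the -append string would give the guest kernel one parameter per region; whether the series builds the string this way is not shown in the cover letter.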
* [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez @ 2019-07-02 12:11 ` Sergio Lopez 2019-07-25 9:46 ` Liam Merwick 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 2/4] hw/i386: Add an Intel MPTable generator Sergio Lopez ` (7 subsequent siblings) 8 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-02 12:11 UTC (permalink / raw) To: mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel, Sergio Lopez Put QOM and main struct definition in a separate header file, so it can be accesed from other components. This is needed for the microvm machine type implementation. Signed-off-by: Sergio Lopez <slp@redhat.com> --- hw/virtio/virtio-mmio.c | 35 +----------------------- hw/virtio/virtio-mmio.h | 60 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 61 insertions(+), 34 deletions(-) create mode 100644 hw/virtio/virtio-mmio.h diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c index 97b7f35496..87c7fe4d8d 100644 --- a/hw/virtio/virtio-mmio.c +++ b/hw/virtio/virtio-mmio.c @@ -26,44 +26,11 @@ #include "qemu/host-utils.h" #include "qemu/module.h" #include "sysemu/kvm.h" -#include "hw/virtio/virtio-bus.h" +#include "virtio-mmio.h" #include "qemu/error-report.h" #include "qemu/log.h" #include "trace.h" -/* QOM macros */ -/* virtio-mmio-bus */ -#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" -#define VIRTIO_MMIO_BUS(obj) \ - OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) -#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ - OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) -#define VIRTIO_MMIO_BUS_CLASS(klass) \ - OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) - -/* virtio-mmio */ -#define TYPE_VIRTIO_MMIO "virtio-mmio" -#define VIRTIO_MMIO(obj) \ - OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) - -#define VIRT_MAGIC 0x74726976 /* 'virt' */ -#define 
VIRT_VERSION 1 -#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ - -typedef struct { - /* Generic */ - SysBusDevice parent_obj; - MemoryRegion iomem; - qemu_irq irq; - /* Guest accessible state needing migration and reset */ - uint32_t host_features_sel; - uint32_t guest_features_sel; - uint32_t guest_page_shift; - /* virtio-bus */ - VirtioBusState bus; - bool format_transport_address; -} VirtIOMMIOProxy; - static bool virtio_mmio_ioeventfd_enabled(DeviceState *d) { return kvm_eventfds_enabled(); diff --git a/hw/virtio/virtio-mmio.h b/hw/virtio/virtio-mmio.h new file mode 100644 index 0000000000..2f3973f8c7 --- /dev/null +++ b/hw/virtio/virtio-mmio.h @@ -0,0 +1,60 @@ +/* + * Virtio MMIO bindings + * + * Copyright (c) 2011 Linaro Limited + * + * Author: + * Peter Maydell <peter.maydell@linaro.org> + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License; either version 2 + * of the License, or (at your option) any later version. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License along + * with this program; if not, see <http://www.gnu.org/licenses/>. 
+ */ + +#ifndef QEMU_VIRTIO_MMIO_H +#define QEMU_VIRTIO_MMIO_H + +#include "hw/virtio/virtio-bus.h" + +/* QOM macros */ +/* virtio-mmio-bus */ +#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" +#define VIRTIO_MMIO_BUS(obj) \ + OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) +#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ + OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) +#define VIRTIO_MMIO_BUS_CLASS(klass) \ + OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) + +/* virtio-mmio */ +#define TYPE_VIRTIO_MMIO "virtio-mmio" +#define VIRTIO_MMIO(obj) \ + OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) + +#define VIRT_MAGIC 0x74726976 /* 'virt' */ +#define VIRT_VERSION 1 +#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ + +typedef struct { + /* Generic */ + SysBusDevice parent_obj; + MemoryRegion iomem; + qemu_irq irq; + /* Guest accessible state needing migration and reset */ + uint32_t host_features_sel; + uint32_t guest_features_sel; + uint32_t guest_page_shift; + /* virtio-bus */ + VirtioBusState bus; + bool format_transport_address; +} VirtIOMMIOProxy; + +#endif -- 2.21.0 ^ permalink raw reply related [flat|nested] 68+ messages in thread
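Editorial sketch (not part of the patch): the comments next to VIRT_MAGIC and VIRT_VENDOR in the new header say the constants spell 'virt' and 'QEMU'. That reading holds when the 32-bit value is laid out least significant byte first, which the small helper below demonstrates. The two constants are copied from the patch; le32_to_ascii is a hypothetical name introduced only for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VIRT_MAGIC  UINT32_C(0x74726976) /* 'virt' */
#define VIRT_VENDOR UINT32_C(0x554D4551) /* 'QEMU' */

/*
 * Copy the four bytes of value into out, least significant byte first,
 * and NUL-terminate. out must have room for 5 characters.
 */
static void le32_to_ascii(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((value >> (8 * i)) & 0xff);
    }
    out[4] = '\0';
}
```

Passing VIRT_MAGIC yields the string "virt" and VIRT_VENDOR yields "QEMU", which is what a little-endian guest sees when it reads the corresponding registers byte by byte.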
* Re: [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers Sergio Lopez @ 2019-07-25 9:46 ` Liam Merwick 2019-07-25 9:58 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Liam Merwick @ 2019-07-25 9:46 UTC (permalink / raw) To: Sergio Lopez, mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel On 02/07/2019 13:11, Sergio Lopez wrote: > Put QOM and main struct definition in a separate header file, so it > can be accesed from other components. typo: accesed -> accessed > > This is needed for the microvm machine type implementation. > > Signed-off-by: Sergio Lopez <slp@redhat.com> One nit below, either way Reviewed-by: Liam Merwick <liam.merwick@oracle.com> > --- > hw/virtio/virtio-mmio.c | 35 +----------------------- > hw/virtio/virtio-mmio.h | 60 +++++++++++++++++++++++++++++++++++++++++ > 2 files changed, 61 insertions(+), 34 deletions(-) > create mode 100644 hw/virtio/virtio-mmio.h > > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c > index 97b7f35496..87c7fe4d8d 100644 > --- a/hw/virtio/virtio-mmio.c > +++ b/hw/virtio/virtio-mmio.c > @@ -26,44 +26,11 @@ > #include "qemu/host-utils.h" > #include "qemu/module.h" > #include "sysemu/kvm.h" > -#include "hw/virtio/virtio-bus.h" > +#include "virtio-mmio.h" Virtually all the other includes of virtio-xxx.h files in hw/virtio use the full path - e.g. "hw/virtio/virtio-mmio.h" - maybe do the same to be consistent. 
> #include "qemu/error-report.h" > #include "qemu/log.h" > #include "trace.h" > > -/* QOM macros */ > -/* virtio-mmio-bus */ > -#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" > -#define VIRTIO_MMIO_BUS(obj) \ > - OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) > -#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ > - OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) > -#define VIRTIO_MMIO_BUS_CLASS(klass) \ > - OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) > - > -/* virtio-mmio */ > -#define TYPE_VIRTIO_MMIO "virtio-mmio" > -#define VIRTIO_MMIO(obj) \ > - OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) > - > -#define VIRT_MAGIC 0x74726976 /* 'virt' */ > -#define VIRT_VERSION 1 > -#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ > - > -typedef struct { > - /* Generic */ > - SysBusDevice parent_obj; > - MemoryRegion iomem; > - qemu_irq irq; > - /* Guest accessible state needing migration and reset */ > - uint32_t host_features_sel; > - uint32_t guest_features_sel; > - uint32_t guest_page_shift; > - /* virtio-bus */ > - VirtioBusState bus; > - bool format_transport_address; > -} VirtIOMMIOProxy; > - > static bool virtio_mmio_ioeventfd_enabled(DeviceState *d) > { > return kvm_eventfds_enabled(); > diff --git a/hw/virtio/virtio-mmio.h b/hw/virtio/virtio-mmio.h > new file mode 100644 > index 0000000000..2f3973f8c7 > --- /dev/null > +++ b/hw/virtio/virtio-mmio.h > @@ -0,0 +1,60 @@ > +/* > + * Virtio MMIO bindings > + * > + * Copyright (c) 2011 Linaro Limited > + * > + * Author: > + * Peter Maydell <peter.maydell@linaro.org> > + * > + * This program is free software; you can redistribute it and/or modify > + * it under the terms of the GNU General Public License; either version 2 > + * of the License, or (at your option) any later version. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the > + * GNU General Public License for more details. > + * > + * You should have received a copy of the GNU General Public License along > + * with this program; if not, see <http://www.gnu.org/licenses/>. > + */ > + > +#ifndef QEMU_VIRTIO_MMIO_H > +#define QEMU_VIRTIO_MMIO_H > + > +#include "hw/virtio/virtio-bus.h" > + > +/* QOM macros */ > +/* virtio-mmio-bus */ > +#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" > +#define VIRTIO_MMIO_BUS(obj) \ > + OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) > +#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ > + OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) > +#define VIRTIO_MMIO_BUS_CLASS(klass) \ > + OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) > + > +/* virtio-mmio */ > +#define TYPE_VIRTIO_MMIO "virtio-mmio" > +#define VIRTIO_MMIO(obj) \ > + OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) > + > +#define VIRT_MAGIC 0x74726976 /* 'virt' */ > +#define VIRT_VERSION 1 > +#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ > + > +typedef struct { > + /* Generic */ > + SysBusDevice parent_obj; > + MemoryRegion iomem; > + qemu_irq irq; > + /* Guest accessible state needing migration and reset */ > + uint32_t host_features_sel; > + uint32_t guest_features_sel; > + uint32_t guest_page_shift; > + /* virtio-bus */ > + VirtioBusState bus; > + bool format_transport_address; > +} VirtIOMMIOProxy; > + > +#endif > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers 2019-07-25 9:46 ` Liam Merwick @ 2019-07-25 9:58 ` Michael S. Tsirkin 2019-07-25 10:03 ` Peter Maydell 2019-07-25 10:36 ` Paolo Bonzini 0 siblings, 2 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 9:58 UTC (permalink / raw) To: Liam Merwick Cc: ehabkost, Sergio Lopez, maran.wilson, qemu-devel, kraxel, pbonzini, sgarzare, rth On Thu, Jul 25, 2019 at 10:46:00AM +0100, Liam Merwick wrote: > On 02/07/2019 13:11, Sergio Lopez wrote: > > Put QOM and main struct definition in a separate header file, so it > > can be accesed from other components. > > typo: accesed -> accessed > > > > > This is needed for the microvm machine type implementation. > > > > Signed-off-by: Sergio Lopez <slp@redhat.com> > > One nit below, either way > > Reviewed-by: Liam Merwick <liam.merwick@oracle.com> > > > --- > > hw/virtio/virtio-mmio.c | 35 +----------------------- > > hw/virtio/virtio-mmio.h | 60 +++++++++++++++++++++++++++++++++++++++++ > > 2 files changed, 61 insertions(+), 34 deletions(-) > > create mode 100644 hw/virtio/virtio-mmio.h > > > > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c > > index 97b7f35496..87c7fe4d8d 100644 > > --- a/hw/virtio/virtio-mmio.c > > +++ b/hw/virtio/virtio-mmio.c > > @@ -26,44 +26,11 @@ > > #include "qemu/host-utils.h" > > #include "qemu/module.h" > > #include "sysemu/kvm.h" > > -#include "hw/virtio/virtio-bus.h" > > +#include "virtio-mmio.h" > > > Virtually all the other includes of virtio-xxx.h files in hw/virtio use the > full path - e.g. "hw/virtio/virtio-mmio.h" - maybe do the same to be > consistent. That's for headers under include/. Local ones are ok with a short name. 
> > > #include "qemu/error-report.h" > > #include "qemu/log.h" > > #include "trace.h" > > -/* QOM macros */ > > -/* virtio-mmio-bus */ > > -#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" > > -#define VIRTIO_MMIO_BUS(obj) \ > > - OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) > > -#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ > > - OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) > > -#define VIRTIO_MMIO_BUS_CLASS(klass) \ > > - OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) > > - > > -/* virtio-mmio */ > > -#define TYPE_VIRTIO_MMIO "virtio-mmio" > > -#define VIRTIO_MMIO(obj) \ > > - OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) > > - > > -#define VIRT_MAGIC 0x74726976 /* 'virt' */ > > -#define VIRT_VERSION 1 > > -#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ > > - > > -typedef struct { > > - /* Generic */ > > - SysBusDevice parent_obj; > > - MemoryRegion iomem; > > - qemu_irq irq; > > - /* Guest accessible state needing migration and reset */ > > - uint32_t host_features_sel; > > - uint32_t guest_features_sel; > > - uint32_t guest_page_shift; > > - /* virtio-bus */ > > - VirtioBusState bus; > > - bool format_transport_address; > > -} VirtIOMMIOProxy; > > - > > static bool virtio_mmio_ioeventfd_enabled(DeviceState *d) > > { > > return kvm_eventfds_enabled(); > > diff --git a/hw/virtio/virtio-mmio.h b/hw/virtio/virtio-mmio.h > > new file mode 100644 > > index 0000000000..2f3973f8c7 > > --- /dev/null > > +++ b/hw/virtio/virtio-mmio.h > > @@ -0,0 +1,60 @@ > > +/* > > + * Virtio MMIO bindings > > + * > > + * Copyright (c) 2011 Linaro Limited > > + * > > + * Author: > > + * Peter Maydell <peter.maydell@linaro.org> > > + * > > + * This program is free software; you can redistribute it and/or modify > > + * it under the terms of the GNU General Public License; either version 2 > > + * of the License, or (at your option) any later version. 
> > + * > > + * This program is distributed in the hope that it will be useful, > > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > > + * GNU General Public License for more details. > > + * > > + * You should have received a copy of the GNU General Public License along > > + * with this program; if not, see <http://www.gnu.org/licenses/>. > > + */ > > + > > +#ifndef QEMU_VIRTIO_MMIO_H > > +#define QEMU_VIRTIO_MMIO_H > > + > > +#include "hw/virtio/virtio-bus.h" > > + > > +/* QOM macros */ > > +/* virtio-mmio-bus */ > > +#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus" > > +#define VIRTIO_MMIO_BUS(obj) \ > > + OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS) > > +#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \ > > + OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS) > > +#define VIRTIO_MMIO_BUS_CLASS(klass) \ > > + OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS) > > + > > +/* virtio-mmio */ > > +#define TYPE_VIRTIO_MMIO "virtio-mmio" > > +#define VIRTIO_MMIO(obj) \ > > + OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO) > > + > > +#define VIRT_MAGIC 0x74726976 /* 'virt' */ > > +#define VIRT_VERSION 1 > > +#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */ > > + > > +typedef struct { > > + /* Generic */ > > + SysBusDevice parent_obj; > > + MemoryRegion iomem; > > + qemu_irq irq; > > + /* Guest accessible state needing migration and reset */ > > + uint32_t host_features_sel; > > + uint32_t guest_features_sel; > > + uint32_t guest_page_shift; > > + /* virtio-bus */ > > + VirtioBusState bus; > > + bool format_transport_address; > > +} VirtIOMMIOProxy; I'm repeating myself, but still: if you insist on virtio mmio, please implement virtio 1 and use that with microvm. We can't keep carrying legacy interface into every new machine type. > > + > > +#endif > > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers 2019-07-25 9:58 ` Michael S. Tsirkin @ 2019-07-25 10:03 ` Peter Maydell 2019-07-25 10:36 ` Paolo Bonzini 1 sibling, 0 replies; 68+ messages in thread From: Peter Maydell @ 2019-07-25 10:03 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Eduardo Habkost, Sergio Lopez, maran.wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Richard Henderson, Stefano Garzarella On Thu, 25 Jul 2019 at 10:58, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Thu, Jul 25, 2019 at 10:46:00AM +0100, Liam Merwick wrote: > > On 02/07/2019 13:11, Sergio Lopez wrote: > > > Put QOM and main struct definition in a separate header file, so it > > > can be accesed from other components. > > > > typo: accesed -> accessed > > > > > > > > This is needed for the microvm machine type implementation. > > > > > > Signed-off-by: Sergio Lopez <slp@redhat.com> > > > > One nit below, either way > > > > Reviewed-by: Liam Merwick <liam.merwick@oracle.com> > > > > > --- > > > hw/virtio/virtio-mmio.c | 35 +----------------------- > > > hw/virtio/virtio-mmio.h | 60 +++++++++++++++++++++++++++++++++++++++++ > > > 2 files changed, 61 insertions(+), 34 deletions(-) > > > create mode 100644 hw/virtio/virtio-mmio.h > > > > > > diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c > > > index 97b7f35496..87c7fe4d8d 100644 > > > --- a/hw/virtio/virtio-mmio.c > > > +++ b/hw/virtio/virtio-mmio.c > > > @@ -26,44 +26,11 @@ > > > #include "qemu/host-utils.h" > > > #include "qemu/module.h" > > > #include "sysemu/kvm.h" > > > -#include "hw/virtio/virtio-bus.h" > > > +#include "virtio-mmio.h" > > > > > > Virtually all the other includes of virtio-xxx.h files in hw/virtio use the > > full path - e.g. "hw/virtio/virtio-mmio.h" - maybe do the same to be > > consistent. > > That's for headers under include/. > Local ones are ok with a short name. 
Yes, but we should put this one into include/ as that fits with our usual arrangement of where we put the headers for devices. > I'm repeating myself, but still: if you insist on virtio mmio, please > implement virtio 1 and use that with microvm. We can't keep carrying > legacy interface into every new machine type. Agreed (but we've had this discussion on another thread, as you say). thanks -- PMM ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers 2019-07-25 9:58 ` Michael S. Tsirkin 2019-07-25 10:03 ` Peter Maydell @ 2019-07-25 10:36 ` Paolo Bonzini 1 sibling, 0 replies; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 10:36 UTC (permalink / raw) To: Michael S. Tsirkin, Liam Merwick Cc: ehabkost, Sergio Lopez, maran.wilson, qemu-devel, kraxel, rth, sgarzare On 25/07/19 11:58, Michael S. Tsirkin wrote: > I'm repeating myself, but still: if you insist on virtio mmio, please > implement virtio 1 and use that with microvm. We can't keep carrying > legacy interface into every new machine type. I'd give Sergio the benefit of doubt, since so far he's addressed many other review comments---just, one at a time. :) Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
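Editorial sketch (not part of the series): patch 2/4, which follows in this thread, fills the MPTable checksum fields with the "checksum -= sum" idiom. The underlying rule in the Intel MultiProcessor Specification is that all bytes of a structure, checksum byte included, must add up to zero modulo 256. The helpers below show that rule in isolation on unsigned bytes; byte_sum and seal_checksum are hypothetical names and are not functions from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Sum of all bytes in buf, modulo 256. */
static uint8_t byte_sum(const uint8_t *buf, size_t len)
{
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum = (uint8_t)(sum + buf[i]);
    }
    return sum;
}

/*
 * Store at buf[csum_off] the value that makes the whole buffer
 * sum to zero modulo 256.
 */
static void seal_checksum(uint8_t *buf, size_t len, size_t csum_off)
{
    assert(csum_off < len);
    buf[csum_off] = 0;
    buf[csum_off] = (uint8_t)(0u - byte_sum(buf, len));
}
```

For the 16-byte floating pointer structure (struct mpf_intel), the checksum byte sits at offset 10, after the 4-byte signature, the 4-byte physptr, and the length and specification bytes, so seal_checksum(buf, 16, 10) would be the matching call in this sketch.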
* [Qemu-devel] [PATCH v3 2/4] hw/i386: Add an Intel MPTable generator 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers Sergio Lopez @ 2019-07-02 12:11 ` Sergio Lopez 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 3/4] hw/i386: Factorize PVH related functions Sergio Lopez ` (6 subsequent siblings) 8 siblings, 0 replies; 68+ messages in thread From: Sergio Lopez @ 2019-07-02 12:11 UTC (permalink / raw) To: mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel, Sergio Lopez Add a helper function (mptable_generate) for generating an Intel MPTable according to version 1.4 of the specification. This is needed for the microvm machine type implementation. Signed-off-by: Sergio Lopez <slp@redhat.com> --- hw/i386/mptable.c | 156 +++++++++++++++++ include/hw/i386/mptable.h | 36 ++++ include/standard-headers/linux/mpspec_def.h | 182 ++++++++++++++++++++ 3 files changed, 374 insertions(+) create mode 100644 hw/i386/mptable.c create mode 100644 include/hw/i386/mptable.h create mode 100644 include/standard-headers/linux/mpspec_def.h diff --git a/hw/i386/mptable.c b/hw/i386/mptable.c new file mode 100644 index 0000000000..cf1e0eef3a --- /dev/null +++ b/hw/i386/mptable.c @@ -0,0 +1,156 @@ +/* + * Intel MPTable generator + * + * Copyright (C) 2019 Red Hat, Inc. + * + * Authors: + * Sergio Lopez <slp@redhat.com> + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "hw/i386/mptable.h" +#include "standard-headers/linux/mpspec_def.h" + +static int mptable_checksum(char *buf, int size) +{ + int i; + int checksum = 0; + + for (i = 0; i < size; i++) { + checksum += buf[i]; + } + + return checksum; +} + +/* + * Generate an MPTable for "ncpus". 
"apic_id" must be the next available + * APIC ID (last CPU apic_id + 1). "table_base" is the physical location + * in the Guest where the caller intends to write the table, needed to + * fill the "physptr" field from the "mpf_intel" structure. + * + * On success, return a newly allocated buffer, that must be freed by the + * caller using "g_free" when it's no longer needed, and update + * "mptable_size" with the size of the buffer. + */ +char *mptable_generate(int ncpus, int table_base, int *mptable_size) +{ + struct mpf_intel *mpf; + struct mpc_table *table; + struct mpc_cpu *cpu; + struct mpc_bus *bus; + struct mpc_ioapic *ioapic; + struct mpc_intsrc *intsrc; + struct mpc_lintsrc *lintsrc; + const char mpc_signature[] = MPC_SIGNATURE; + const char smp_magic_ident[] = "_MP_"; + char *mptable; + int checksum = 0; + int offset = 0; + int ssize; + int i; + + ssize = sizeof(struct mpf_intel); + mptable = g_malloc0(ssize); + + mpf = (struct mpf_intel *) mptable; + memcpy(mpf->signature, smp_magic_ident, sizeof(smp_magic_ident) - 1); + mpf->length = 1; + mpf->specification = 4; + mpf->physptr = table_base + ssize; + mpf->checksum -= mptable_checksum((char *) mpf, ssize); + offset = ssize + sizeof(struct mpc_table); + + ssize = sizeof(struct mpc_cpu); + for (i = 0; i < ncpus; i++) { + mptable = g_realloc(mptable, offset + ssize); + cpu = (struct mpc_cpu *) (mptable + offset); + cpu->type = MP_PROCESSOR; + cpu->apicid = i; + cpu->apicver = APIC_VERSION; + cpu->cpuflag = CPU_ENABLED; + if (i == 0) { + cpu->cpuflag |= CPU_BOOTPROCESSOR; + } + cpu->cpufeature = CPU_STEPPING; + cpu->featureflag = CPU_FEATURE_APIC | CPU_FEATURE_FPU; + checksum += mptable_checksum((char *) cpu, ssize); + offset += ssize; + } + + ssize = sizeof(struct mpc_bus); + mptable = g_realloc(mptable, offset + ssize); + bus = (struct mpc_bus *) (mptable + offset); + bus->type = MP_BUS; + bus->busid = 0; + memcpy(bus->bustype, BUS_TYPE_ISA, sizeof(BUS_TYPE_ISA) - 1); + checksum += mptable_checksum((char *) 
bus, ssize); + offset += ssize; + + ssize = sizeof(struct mpc_ioapic); + mptable = g_realloc(mptable, offset + ssize); + ioapic = (struct mpc_ioapic *) (mptable + offset); + ioapic->type = MP_IOAPIC; + ioapic->apicid = ncpus + 1; + ioapic->apicver = APIC_VERSION; + ioapic->flags = MPC_APIC_USABLE; + ioapic->apicaddr = IO_APIC_DEFAULT_PHYS_BASE; + checksum += mptable_checksum((char *) ioapic, ssize); + offset += ssize; + + ssize = sizeof(struct mpc_intsrc); + for (i = 0; i < 16; i++) { + mptable = g_realloc(mptable, offset + ssize); + intsrc = (struct mpc_intsrc *) (mptable + offset); + intsrc->type = MP_INTSRC; + intsrc->irqtype = mp_INT; + intsrc->irqflag = MP_IRQDIR_DEFAULT; + intsrc->srcbus = 0; + intsrc->srcbusirq = i; + intsrc->dstapic = ncpus + 1; + intsrc->dstirq = i; + checksum += mptable_checksum((char *) intsrc, ssize); + offset += ssize; + } + + ssize = sizeof(struct mpc_lintsrc); + mptable = g_realloc(mptable, offset + (ssize * 2)); + lintsrc = (struct mpc_lintsrc *) (mptable + offset); + lintsrc->type = MP_LINTSRC; + lintsrc->irqtype = mp_ExtINT; + lintsrc->irqflag = MP_IRQDIR_DEFAULT; + lintsrc->srcbusid = 0; + lintsrc->srcbusirq = 0; + lintsrc->destapic = 0; + lintsrc->destapiclint = 0; + checksum += mptable_checksum((char *) lintsrc, ssize); + offset += ssize; + + lintsrc = (struct mpc_lintsrc *) (mptable + offset); + lintsrc->type = MP_LINTSRC; + lintsrc->irqtype = mp_NMI; + lintsrc->irqflag = MP_IRQDIR_DEFAULT; + lintsrc->srcbusid = 0; + lintsrc->srcbusirq = 0; + lintsrc->destapic = 0xFF; + lintsrc->destapiclint = 1; + checksum += mptable_checksum((char *) lintsrc, ssize); + offset += ssize; + + ssize = sizeof(struct mpc_table); + table = (struct mpc_table *) (mptable + sizeof(struct mpf_intel)); + memcpy(table->signature, mpc_signature, sizeof(mpc_signature) - 1); + table->length = offset - sizeof(struct mpf_intel); + table->spec = MPC_SPEC; + memcpy(table->oem, MPC_OEM, sizeof(MPC_OEM) - 1); + memcpy(table->productid, MPC_PRODUCT_ID, 
sizeof(MPC_PRODUCT_ID) - 1); + table->lapic = APIC_DEFAULT_PHYS_BASE; + checksum += mptable_checksum((char *) table, ssize); + table->checksum -= checksum; + + *mptable_size = offset; + return mptable; +} diff --git a/include/hw/i386/mptable.h b/include/hw/i386/mptable.h new file mode 100644 index 0000000000..96a9778bba --- /dev/null +++ b/include/hw/i386/mptable.h @@ -0,0 +1,36 @@ +/* + * Intel MPTable generator + * + * Copyright (C) 2019 Red Hat, Inc. + * + * Authors: + * Sergio Lopez <slp@redhat.com> + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#ifndef HW_I386_MPTABLE_H +#define HW_I386_MPTABLE_H + +#define APIC_VERSION 0x14 +#define CPU_STEPPING 0x600 +#define CPU_FEATURE_APIC 0x200 +#define CPU_FEATURE_FPU 0x001 +#define MPC_SPEC 0x4 + +#define MP_IRQDIR_DEFAULT 0 +#define MP_IRQDIR_HIGH 1 +#define MP_IRQDIR_LOW 3 + +static const char MPC_OEM[] = "QEMU "; +static const char MPC_PRODUCT_ID[] = "000000000000"; +static const char BUS_TYPE_ISA[] = "ISA "; + +#define IO_APIC_DEFAULT_PHYS_BASE 0xfec00000 +#define APIC_DEFAULT_PHYS_BASE 0xfee00000 +#define APIC_VERSION 0x14 + +char *mptable_generate(int ncpus, int table_base, int *mptable_size); + +#endif diff --git a/include/standard-headers/linux/mpspec_def.h b/include/standard-headers/linux/mpspec_def.h new file mode 100644 index 0000000000..6fb923a343 --- /dev/null +++ b/include/standard-headers/linux/mpspec_def.h @@ -0,0 +1,182 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_MPSPEC_DEF_H +#define _ASM_X86_MPSPEC_DEF_H + +/* + * Structure definitions for SMP machines following the + * Intel Multiprocessing Specification 1.1 and 1.4. + */ + +/* + * This tag identifies where the SMP configuration + * information is. 
+ */ + +#define SMP_MAGIC_IDENT (('_'<<24) | ('P'<<16) | ('M'<<8) | '_') + +#ifdef CONFIG_X86_32 +# define MAX_MPC_ENTRY 1024 +#endif + +/* Intel MP Floating Pointer Structure */ +struct mpf_intel { + char signature[4]; /* "_MP_" */ + unsigned int physptr; /* Configuration table address */ + unsigned char length; /* Our length (paragraphs) */ + unsigned char specification; /* Specification version */ + unsigned char checksum; /* Checksum (makes sum 0) */ + unsigned char feature1; /* Standard or configuration ? */ + unsigned char feature2; /* Bit7 set for IMCR|PIC */ + unsigned char feature3; /* Unused (0) */ + unsigned char feature4; /* Unused (0) */ + unsigned char feature5; /* Unused (0) */ +}; + +#define MPC_SIGNATURE "PCMP" + +struct mpc_table { + char signature[4]; + unsigned short length; /* Size of table */ + char spec; /* 0x01 */ + char checksum; + char oem[8]; + char productid[12]; + unsigned int oemptr; /* 0 if not present */ + unsigned short oemsize; /* 0 if not present */ + unsigned short oemcount; + unsigned int lapic; /* APIC address */ + unsigned int reserved; +}; + +/* Followed by entries */ + +#define MP_PROCESSOR 0 +#define MP_BUS 1 +#define MP_IOAPIC 2 +#define MP_INTSRC 3 +#define MP_LINTSRC 4 +/* Used by IBM NUMA-Q to describe node locality */ +#define MP_TRANSLATION 192 + +#define CPU_ENABLED 1 /* Processor is available */ +#define CPU_BOOTPROCESSOR 2 /* Processor is the boot CPU */ + +#define CPU_STEPPING_MASK 0x000F +#define CPU_MODEL_MASK 0x00F0 +#define CPU_FAMILY_MASK 0x0F00 + +struct mpc_cpu { + unsigned char type; + unsigned char apicid; /* Local APIC number */ + unsigned char apicver; /* Its versions */ + unsigned char cpuflag; + unsigned int cpufeature; + unsigned int featureflag; /* CPUID feature value */ + unsigned int reserved[2]; +}; + +struct mpc_bus { + unsigned char type; + unsigned char busid; + unsigned char bustype[6]; +}; + +/* List of Bus Type string values, Intel MP Spec. 
*/ +#define BUSTYPE_EISA "EISA" +#define BUSTYPE_ISA "ISA" +#define BUSTYPE_INTERN "INTERN" /* Internal BUS */ +#define BUSTYPE_MCA "MCA" /* Obsolete */ +#define BUSTYPE_VL "VL" /* Local bus */ +#define BUSTYPE_PCI "PCI" +#define BUSTYPE_PCMCIA "PCMCIA" +#define BUSTYPE_CBUS "CBUS" +#define BUSTYPE_CBUSII "CBUSII" +#define BUSTYPE_FUTURE "FUTURE" +#define BUSTYPE_MBI "MBI" +#define BUSTYPE_MBII "MBII" +#define BUSTYPE_MPI "MPI" +#define BUSTYPE_MPSA "MPSA" +#define BUSTYPE_NUBUS "NUBUS" +#define BUSTYPE_TC "TC" +#define BUSTYPE_VME "VME" +#define BUSTYPE_XPRESS "XPRESS" + +#define MPC_APIC_USABLE 0x01 + +struct mpc_ioapic { + unsigned char type; + unsigned char apicid; + unsigned char apicver; + unsigned char flags; + unsigned int apicaddr; +}; + +struct mpc_intsrc { + unsigned char type; + unsigned char irqtype; + unsigned short irqflag; + unsigned char srcbus; + unsigned char srcbusirq; + unsigned char dstapic; + unsigned char dstirq; +}; + +enum mp_irq_source_types { + mp_INT = 0, + mp_NMI = 1, + mp_SMI = 2, + mp_ExtINT = 3 +}; + +#define MP_IRQPOL_DEFAULT 0x0 +#define MP_IRQPOL_ACTIVE_HIGH 0x1 +#define MP_IRQPOL_RESERVED 0x2 +#define MP_IRQPOL_ACTIVE_LOW 0x3 +#define MP_IRQPOL_MASK 0x3 + +#define MP_IRQTRIG_DEFAULT 0x0 +#define MP_IRQTRIG_EDGE 0x4 +#define MP_IRQTRIG_RESERVED 0x8 +#define MP_IRQTRIG_LEVEL 0xc +#define MP_IRQTRIG_MASK 0xc + +#define MP_APIC_ALL 0xFF + +struct mpc_lintsrc { + unsigned char type; + unsigned char irqtype; + unsigned short irqflag; + unsigned char srcbusid; + unsigned char srcbusirq; + unsigned char destapic; + unsigned char destapiclint; +}; + +#define MPC_OEM_SIGNATURE "_OEM" + +struct mpc_oemtable { + char signature[4]; + unsigned short length; /* Size of table */ + char rev; /* 0x01 */ + char checksum; + char mpc[8]; +}; + +/* + * Default configurations + * + * 1 2 CPU ISA 82489DX + * 2 2 CPU EISA 82489DX neither IRQ 0 timer nor IRQ 13 DMA chaining + * 3 2 CPU EISA 82489DX + * 4 2 CPU MCA 82489DX + * 5 2 CPU ISA+PCI + * 6 2 CPU 
EISA+PCI + * 7 2 CPU MCA+PCI + */ + +enum mp_bustype { + MP_BUS_ISA = 1, + MP_BUS_EISA, + MP_BUS_PCI, +}; +#endif /* _ASM_X86_MPSPEC_DEF_H */ -- 2.21.0 ^ permalink raw reply related [flat|nested] 68+ messages in thread
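Editorial aside on the checksum handling in mptable_generate() above: the MP specification requires that all bytes of a structure, including its checksum field, sum to zero modulo 256, which is why the generator subtracts the running sum from table->checksum. A minimal sketch of that rule follows, applied to the 16-byte floating pointer structure. The names byte_sum, demo_mpf and demo_mpf_fill are made up for illustration and are not part of the patch; demo_mpf only mirrors struct mpf_intel with fixed-width types.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Sum all bytes of a buffer modulo 256. */
static uint8_t byte_sum(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint8_t sum = 0;

    for (size_t i = 0; i < len; i++) {
        sum += p[i];
    }
    return sum;
}

/* Hypothetical stand-in for struct mpf_intel (16 bytes, no padding). */
struct demo_mpf {
    char signature[4];      /* "_MP_" */
    uint32_t physptr;       /* configuration table address */
    uint8_t length;         /* length in 16-byte paragraphs */
    uint8_t specification;  /* specification version */
    uint8_t checksum;       /* makes the byte sum 0 */
    uint8_t feature[5];     /* feature1..feature5 */
};

/* Fill the structure, then fix up the checksum as the last step. */
static void demo_mpf_fill(struct demo_mpf *mpf, uint32_t table_addr)
{
    memset(mpf, 0, sizeof(*mpf));
    memcpy(mpf->signature, "_MP_", 4);
    mpf->physptr = table_addr;
    mpf->length = 1;
    mpf->specification = 4;

    /* checksum is still 0 here, so subtracting the sum cancels it out */
    mpf->checksum -= byte_sum(mpf, sizeof(*mpf));
}
```

The checksum has to be computed after every other field is final; changing any byte afterwards invalidates it.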
* [Qemu-devel] [PATCH v3 3/4] hw/i386: Factorize PVH related functions 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 1/4] hw/virtio: Factorize virtio-mmio headers Sergio Lopez 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 2/4] hw/i386: Add an Intel MPTable generator Sergio Lopez @ 2019-07-02 12:11 ` Sergio Lopez 2019-07-23 8:39 ` Liam Merwick 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type Sergio Lopez ` (5 subsequent siblings) 8 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-02 12:11 UTC (permalink / raw) To: mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel, Sergio Lopez Extract PVH related functions from pc.c, and put them in pvh.c, so they can be shared with other components. Signed-off-by: Sergio Lopez <slp@redhat.com> --- hw/i386/Makefile.objs | 1 + hw/i386/pc.c | 120 +++++------------------------------------- hw/i386/pvh.c | 113 +++++++++++++++++++++++++++++++++++++++ hw/i386/pvh.h | 10 ++++ 4 files changed, 136 insertions(+), 108 deletions(-) create mode 100644 hw/i386/pvh.c create mode 100644 hw/i386/pvh.h diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index 5d9c9efd5f..c5f20bbd72 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -1,5 +1,6 @@ obj-$(CONFIG_KVM) += kvm/ obj-y += multiboot.o +obj-y += pvh.o obj-y += pc.o obj-$(CONFIG_I440FX) += pc_piix.o obj-$(CONFIG_Q35) += pc_q35.o diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 3983621f1c..325ec2c1c8 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -42,6 +42,7 @@ #include "hw/loader.h" #include "elf.h" #include "multiboot.h" +#include "pvh.h" #include "hw/timer/mc146818rtc.h" #include "hw/dma/i8257.h" #include "hw/timer/i8254.h" @@ -108,9 +109,6 @@ static struct e820_entry *e820_table; static unsigned e820_entries; struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX}; -/* Physical 
Address of PVH entry point read from kernel ELF NOTE */ -static size_t pvh_start_addr; - GlobalProperty pc_compat_4_0[] = {}; const size_t pc_compat_4_0_len = G_N_ELEMENTS(pc_compat_4_0); @@ -1061,109 +1059,6 @@ struct setup_data { uint8_t data[0]; } __attribute__((packed)); - -/* - * The entry point into the kernel for PVH boot is different from - * the native entry point. The PVH entry is defined by the x86/HVM - * direct boot ABI and is available in an ELFNOTE in the kernel binary. - * - * This function is passed to load_elf() when it is called from - * load_elfboot() which then additionally checks for an ELF Note of - * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to - * parse the PVH entry address from the ELF Note. - * - * Due to trickery in elf_opts.h, load_elf() is actually available as - * load_elf32() or load_elf64() and this routine needs to be able - * to deal with being called as 32 or 64 bit. - * - * The address of the PVH entry point is saved to the 'pvh_start_addr' - * global variable. (although the entry point is 32-bit, the kernel - * binary can be either 32-bit or 64-bit). 
- */ -static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64) -{ - size_t *elf_note_data_addr; - - /* Check if ELF Note header passed in is valid */ - if (arg1 == NULL) { - return 0; - } - - if (is64) { - struct elf64_note *nhdr64 = (struct elf64_note *)arg1; - uint64_t nhdr_size64 = sizeof(struct elf64_note); - uint64_t phdr_align = *(uint64_t *)arg2; - uint64_t nhdr_namesz = nhdr64->n_namesz; - - elf_note_data_addr = - ((void *)nhdr64) + nhdr_size64 + - QEMU_ALIGN_UP(nhdr_namesz, phdr_align); - } else { - struct elf32_note *nhdr32 = (struct elf32_note *)arg1; - uint32_t nhdr_size32 = sizeof(struct elf32_note); - uint32_t phdr_align = *(uint32_t *)arg2; - uint32_t nhdr_namesz = nhdr32->n_namesz; - - elf_note_data_addr = - ((void *)nhdr32) + nhdr_size32 + - QEMU_ALIGN_UP(nhdr_namesz, phdr_align); - } - - pvh_start_addr = *elf_note_data_addr; - - return pvh_start_addr; -} - -static bool load_elfboot(const char *kernel_filename, - int kernel_file_size, - uint8_t *header, - size_t pvh_xen_start_addr, - FWCfgState *fw_cfg) -{ - uint32_t flags = 0; - uint32_t mh_load_addr = 0; - uint32_t elf_kernel_size = 0; - uint64_t elf_entry; - uint64_t elf_low, elf_high; - int kernel_size; - - if (ldl_p(header) != 0x464c457f) { - return false; /* no elfboot */ - } - - bool elf_is64 = header[EI_CLASS] == ELFCLASS64; - flags = elf_is64 ? 
- ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags; - - if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */ - error_report("elfboot unsupported flags = %x", flags); - exit(1); - } - - uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY; - kernel_size = load_elf(kernel_filename, read_pvh_start_addr, - NULL, &elf_note_type, &elf_entry, - &elf_low, &elf_high, 0, I386_ELF_MACHINE, - 0, 0); - - if (kernel_size < 0) { - error_report("Error while loading elf kernel"); - exit(1); - } - mh_load_addr = elf_low; - elf_kernel_size = elf_high - elf_low; - - if (pvh_start_addr == 0) { - error_report("Error loading uncompressed kernel without PVH ELF Note"); - exit(1); - } - fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr); - fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr); - fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size); - - return true; -} - static void load_linux(PCMachineState *pcms, FWCfgState *fw_cfg) { @@ -1203,6 +1098,9 @@ static void load_linux(PCMachineState *pcms, if (ldl_p(header+0x202) == 0x53726448) { protocol = lduw_p(header+0x206); } else { + size_t pvh_start_addr; + uint32_t mh_load_addr = 0; + uint32_t elf_kernel_size = 0; /* * This could be a multiboot kernel. If it is, let's stop treating it * like a Linux kernel. @@ -1220,10 +1118,16 @@ static void load_linux(PCMachineState *pcms, * If load_elfboot() is successful, populate the fw_cfg info. 
*/ if (pcmc->pvh_enabled && - load_elfboot(kernel_filename, kernel_size, - header, pvh_start_addr, fw_cfg)) { + pvh_load_elfboot(kernel_filename, + &mh_load_addr, &elf_kernel_size)) { fclose(f); + pvh_start_addr = pvh_get_start_addr(); + + fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr); + fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr); + fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size); + fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline) + 1); fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline); diff --git a/hw/i386/pvh.c b/hw/i386/pvh.c new file mode 100644 index 0000000000..61623b4533 --- /dev/null +++ b/hw/i386/pvh.c @@ -0,0 +1,113 @@ +/* + * PVH Boot Helper + * + * Copyright (C) 2019 Oracle + * Copyright (C) 2019 Red Hat, Inc + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + * + */ + +#include "qemu/osdep.h" +#include "qemu/units.h" +#include "qemu/error-report.h" +#include "hw/loader.h" +#include "cpu.h" +#include "elf.h" +#include "pvh.h" + +static size_t pvh_start_addr = 0; + +size_t pvh_get_start_addr(void) +{ + return pvh_start_addr; +} + +/* + * The entry point into the kernel for PVH boot is different from + * the native entry point. The PVH entry is defined by the x86/HVM + * direct boot ABI and is available in an ELFNOTE in the kernel binary. + * + * This function is passed to load_elf() when it is called from + * load_elfboot() which then additionally checks for an ELF Note of + * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to + * parse the PVH entry address from the ELF Note. + * + * Due to trickery in elf_opts.h, load_elf() is actually available as + * load_elf32() or load_elf64() and this routine needs to be able + * to deal with being called as 32 or 64 bit. + * + * The address of the PVH entry point is saved to the 'pvh_start_addr' + * global variable. 
(although the entry point is 32-bit, the kernel + * binary can be either 32-bit or 64-bit). + */ + +static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64) +{ + size_t *elf_note_data_addr; + + /* Check if ELF Note header passed in is valid */ + if (arg1 == NULL) { + return 0; + } + + if (is64) { + struct elf64_note *nhdr64 = (struct elf64_note *)arg1; + uint64_t nhdr_size64 = sizeof(struct elf64_note); + uint64_t phdr_align = *(uint64_t *)arg2; + uint64_t nhdr_namesz = nhdr64->n_namesz; + + elf_note_data_addr = + ((void *)nhdr64) + nhdr_size64 + + QEMU_ALIGN_UP(nhdr_namesz, phdr_align); + } else { + struct elf32_note *nhdr32 = (struct elf32_note *)arg1; + uint32_t nhdr_size32 = sizeof(struct elf32_note); + uint32_t phdr_align = *(uint32_t *)arg2; + uint32_t nhdr_namesz = nhdr32->n_namesz; + + elf_note_data_addr = + ((void *)nhdr32) + nhdr_size32 + + QEMU_ALIGN_UP(nhdr_namesz, phdr_align); + } + + pvh_start_addr = *elf_note_data_addr; + + return pvh_start_addr; +} + +bool pvh_load_elfboot(const char *kernel_filename, + uint32_t *mh_load_addr, + uint32_t *elf_kernel_size) +{ + uint64_t elf_entry; + uint64_t elf_low, elf_high; + int kernel_size; + uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY; + + kernel_size = load_elf(kernel_filename, read_pvh_start_addr, + NULL, &elf_note_type, &elf_entry, + &elf_low, &elf_high, 0, I386_ELF_MACHINE, + 0, 0); + + if (kernel_size < 0) { + error_report("Error while loading elf kernel"); + return false; + } + + if (pvh_start_addr == 0) { + error_report("Error loading uncompressed kernel without PVH ELF Note"); + return false; + } + + if (mh_load_addr) { + *mh_load_addr = elf_low; + } + + if (elf_kernel_size) { + *elf_kernel_size = elf_high - elf_low; + } + + return true; +} diff --git a/hw/i386/pvh.h b/hw/i386/pvh.h new file mode 100644 index 0000000000..ada67ff6e8 --- /dev/null +++ b/hw/i386/pvh.h @@ -0,0 +1,10 @@ +#ifndef HW_I386_PVH_H +#define HW_I386_PVH_H + +size_t pvh_get_start_addr(void); + +bool 
pvh_load_elfboot(const char *kernel_filename, + uint32_t *mh_load_addr, + uint32_t *elf_kernel_size); + +#endif -- 2.21.0
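Editorial aside on read_pvh_start_addr() in this patch: it locates the note descriptor by skipping the note header and then the note name, rounded up to the alignment passed in by the loader. The sketch below reproduces only that arithmetic on an in-memory buffer. All demo_* names and align_up are hypothetical helpers, not QEMU API; the note type 18 is assumed to match XEN_ELFNOTE_PHYS32_ENTRY, and the sketch reads exactly 32 bits for the entry address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round n up to the next multiple of align (align must be non-zero). */
static uint64_t align_up(uint64_t n, uint64_t align)
{
    return ((n + align - 1) / align) * align;
}

/* ELF note header; the three 32-bit fields are the same for ELF32 and ELF64. */
struct demo_note_hdr {
    uint32_t n_namesz;  /* size of the name, including the NUL */
    uint32_t n_descsz;  /* size of the descriptor */
    uint32_t n_type;    /* note type */
};

/* Byte offset of the descriptor, measured from the start of the note. */
static uint64_t note_desc_offset(const struct demo_note_hdr *nhdr,
                                 uint64_t align)
{
    return sizeof(*nhdr) + align_up(nhdr->n_namesz, align);
}

/* Build a note in a local buffer and read the entry address back. */
static uint32_t demo_read_entry(void)
{
    uint8_t buf[64] = {0};
    /* 18 is assumed here to be XEN_ELFNOTE_PHYS32_ENTRY */
    struct demo_note_hdr hdr = { .n_namesz = 4, .n_descsz = 4, .n_type = 18 };
    uint32_t entry = 0x1000000;
    uint32_t out;

    memcpy(buf, &hdr, sizeof(hdr));
    memcpy(buf + sizeof(hdr), "Xen", 4);
    memcpy(buf + note_desc_offset(&hdr, 4), &entry, sizeof(entry));

    /* Reader side: header, then aligned name, then the descriptor. */
    memcpy(&out, buf + note_desc_offset(&hdr, 4), sizeof(out));
    return out;
}
```

Using memcpy() for the final read avoids depending on the descriptor's alignment or on the host's pointer width.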
* Re: [Qemu-devel] [PATCH v3 3/4] hw/i386: Factorize PVH related functions 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 3/4] hw/i386: Factorize PVH related functions Sergio Lopez @ 2019-07-23 8:39 ` Liam Merwick 0 siblings, 0 replies; 68+ messages in thread From: Liam Merwick @ 2019-07-23 8:39 UTC (permalink / raw) To: Sergio Lopez, mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel On 02/07/2019 13:11, Sergio Lopez wrote: > Extract PVH related functions from pc.c, and put them in pvh.c, so > they can be shared with other components. > > Signed-off-by: Sergio Lopez <slp@redhat.com> Refactoring LGTM Reviewed-by: Liam Merwick <liam.merwick@oracle.com>
* [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez ` (2 preceding siblings ...) 2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 3/4] hw/i386: Factorize PVH related functions Sergio Lopez @ 2019-07-02 12:11 ` Sergio Lopez 2019-07-02 13:58 ` Gerd Hoffmann 2019-07-25 10:47 ` Paolo Bonzini 2019-07-02 15:01 ` [Qemu-devel] [PATCH v3 0/4] " no-reply ` (4 subsequent siblings) 8 siblings, 2 replies; 68+ messages in thread From: Sergio Lopez @ 2019-07-02 12:11 UTC (permalink / raw) To: mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel, Sergio Lopez Microvm is a machine type inspired by both NEMU and Firecracker, and constructed after the machine model implemented by the latter. Its main purpose is to provide users with a KVM-only machine type with fast boot times, a minimal attack surface (measured as the number of IO ports and MMIO regions exposed to the guest) and a small footprint (especially when combined with the ongoing QEMU modularization effort). Normally, other than the device support provided by KVM itself, microvm only supports virtio-mmio devices. Microvm also includes a legacy mode, which adds an ISA bus with a 16550A serial port, useful for seeing the early boot kernel messages. Microvm only supports booting PVH-enabled Linux ELF images. Booting other PVH-enabled kernels may be possible, but due to the lack of ACPI and firmware, we're relying on the command line for specifying the location of the virtio-mmio transports. If there's interest in using this machine type with other kernels, we'll try to find some kind of middle ground solution. 
Signed-off-by: Sergio Lopez <slp@redhat.com> --- default-configs/i386-softmmu.mak | 1 + hw/i386/Kconfig | 4 + hw/i386/Makefile.objs | 1 + hw/i386/microvm.c | 550 +++++++++++++++++++++++++++++++ include/hw/i386/microvm.h | 82 +++++ 5 files changed, 638 insertions(+) create mode 100644 hw/i386/microvm.c create mode 100644 include/hw/i386/microvm.h diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak index cd5ea391e8..338f07420f 100644 --- a/default-configs/i386-softmmu.mak +++ b/default-configs/i386-softmmu.mak @@ -26,3 +26,4 @@ CONFIG_ISAPC=y CONFIG_I440FX=y CONFIG_Q35=y CONFIG_ACPI_PCI=y +CONFIG_MICROVM=y diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig index 9817888216..94c565d8db 100644 --- a/hw/i386/Kconfig +++ b/hw/i386/Kconfig @@ -87,6 +87,10 @@ config Q35 select VMMOUSE select FW_CFG_DMA +config MICROVM + bool + select VIRTIO_MMIO + config VTD bool diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs index c5f20bbd72..7bffca413e 100644 --- a/hw/i386/Makefile.objs +++ b/hw/i386/Makefile.objs @@ -4,6 +4,7 @@ obj-y += pvh.o obj-y += pc.o obj-$(CONFIG_I440FX) += pc_piix.o obj-$(CONFIG_Q35) += pc_q35.o +obj-$(CONFIG_MICROVM) += mptable.o microvm.o obj-y += fw_cfg.o pc_sysfw.o obj-y += x86-iommu.o obj-$(CONFIG_VTD) += intel_iommu.o diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c new file mode 100644 index 0000000000..b3b367add1 --- /dev/null +++ b/hw/i386/microvm.c @@ -0,0 +1,550 @@ +/* + * Copyright (c) 2018 Intel Corporation + * Copyright (c) 2019 Red Hat, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. 
+ * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. + */ + +#include "qemu/osdep.h" +#include "qemu/error-report.h" +#include "qemu/cutils.h" +#include "qapi/error.h" +#include "qapi/visitor.h" +#include "sysemu/sysemu.h" +#include "sysemu/cpus.h" +#include "sysemu/numa.h" + +#include "hw/loader.h" +#include "hw/nmi.h" +#include "hw/kvm/clock.h" +#include "hw/i386/microvm.h" +#include "hw/i386/pc.h" +#include "target/i386/cpu.h" +#include "hw/timer/i8254.h" +#include "hw/char/serial.h" +#include "hw/i386/topology.h" +#include "hw/virtio/virtio-mmio.h" +#include "hw/i386/mptable.h" + +#include "cpu.h" +#include "elf.h" +#include "pvh.h" +#include "kvm_i386.h" +#include "hw/xen/start_info.h" + +static void microvm_gsi_handler(void *opaque, int n, int level) +{ + qemu_irq *ioapic_irq = opaque; + + qemu_set_irq(ioapic_irq[n], level); +} + +static void microvm_legacy_init(MicrovmMachineState *mms) +{ + ISABus *isa_bus; + GSIState *gsi_state; + qemu_irq *i8259; + int i; + + assert(kvm_irqchip_in_kernel()); + gsi_state = g_malloc0(sizeof(*gsi_state)); + mms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS); + + isa_bus = isa_bus_new(NULL, get_system_memory(), get_system_io(), + &error_abort); + isa_bus_irqs(isa_bus, mms->gsi); + + assert(kvm_pic_in_kernel()); + i8259 = kvm_i8259_init(isa_bus); + + for (i = 0; i < ISA_NUM_IRQS; i++) { + gsi_state->i8259_irq[i] = i8259[i]; + } + + kvm_pit_init(isa_bus, 0x40); + + for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) { + int nirq = VIRTIO_IRQ_BASE + i; + ISADevice *isadev = isa_create(isa_bus, TYPE_ISA_SERIAL); + qemu_irq mmio_irq; + + isa_init_irq(isadev, &mmio_irq, nirq); + sysbus_create_simple("virtio-mmio", + VIRTIO_MMIO_BASE + i * 512, + mms->gsi[VIRTIO_IRQ_BASE + i]); + } + + g_free(i8259); + + serial_hds_isa_init(isa_bus, 0, 1); +} + +static void microvm_ioapic_init(MicrovmMachineState *mms) +{ + qemu_irq *ioapic_irq; + 
DeviceState *ioapic_dev; + SysBusDevice *d; + int i; + + assert(kvm_irqchip_in_kernel()); + ioapic_irq = g_new0(qemu_irq, IOAPIC_NUM_PINS); + kvm_pc_setup_irq_routing(true); + + assert(kvm_ioapic_in_kernel()); + ioapic_dev = qdev_create(NULL, "kvm-ioapic"); + + object_property_add_child(qdev_get_machine(), + "ioapic", OBJECT(ioapic_dev), NULL); + + qdev_init_nofail(ioapic_dev); + d = SYS_BUS_DEVICE(ioapic_dev); + sysbus_mmio_map(d, 0, IO_APIC_DEFAULT_ADDRESS); + + for (i = 0; i < IOAPIC_NUM_PINS; i++) { + ioapic_irq[i] = qdev_get_gpio_in(ioapic_dev, i); + } + + mms->gsi = qemu_allocate_irqs(microvm_gsi_handler, + ioapic_irq, IOAPIC_NUM_PINS); + + for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) { + sysbus_create_simple("virtio-mmio", + VIRTIO_MMIO_BASE + i * 512, + mms->gsi[VIRTIO_IRQ_BASE + i]); + } +} + +static void microvm_memory_init(MicrovmMachineState *mms) +{ + MachineState *machine = MACHINE(mms); + MemoryRegion *ram, *ram_below_4g, *ram_above_4g; + MemoryRegion *system_memory = get_system_memory(); + + if (machine->ram_size > MICROVM_MAX_BELOW_4G) { + mms->above_4g_mem_size = machine->ram_size - MICROVM_MAX_BELOW_4G; + mms->below_4g_mem_size = MICROVM_MAX_BELOW_4G; + } else { + mms->above_4g_mem_size = 0; + mms->below_4g_mem_size = machine->ram_size; + } + + ram = g_malloc(sizeof(*ram)); + memory_region_allocate_system_memory(ram, NULL, "microvm.ram", + machine->ram_size); + + ram_below_4g = g_malloc(sizeof(*ram_below_4g)); + memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram, + 0, mms->below_4g_mem_size); + memory_region_add_subregion(system_memory, 0, ram_below_4g); + + e820_add_entry(0, mms->below_4g_mem_size, E820_RAM); + + if (mms->above_4g_mem_size > 0) { + ram_above_4g = g_malloc(sizeof(*ram_above_4g)); + memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram, + mms->below_4g_mem_size, + mms->above_4g_mem_size); + memory_region_add_subregion(system_memory, 0x100000000ULL, + ram_above_4g); + e820_add_entry(0x100000000ULL, 
mms->above_4g_mem_size, E820_RAM); + } +} + +static void microvm_cpus_init(const char *typename, Error **errp) +{ + int i; + + for (i = 0; i < smp_cpus; i++) { + Object *cpu = NULL; + Error *local_err = NULL; + + cpu = object_new(typename); + + object_property_set_uint(cpu, i, "apic-id", &local_err); + object_property_set_bool(cpu, true, "realized", &local_err); + + object_unref(cpu); + error_propagate(errp, local_err); + } +} + +static void microvm_machine_state_init(MachineState *machine) +{ + MicrovmMachineState *mms = MICROVM_MACHINE(machine); + Error *local_err = NULL; + + if (machine->kernel_filename == NULL) { + error_report("missing kernel image file name, required by microvm"); + exit(1); + } + + microvm_memory_init(mms); + + microvm_cpus_init(machine->cpu_type, &local_err); + if (local_err) { + error_report_err(local_err); + exit(1); + } + + if (mms->legacy) { + microvm_legacy_init(mms); + } else { + microvm_ioapic_init(mms); + } + + kvmclock_create(); + + if (!pvh_load_elfboot(machine->kernel_filename, NULL, NULL)) { + error_report("Error while loading elf kernel"); + exit(1); + } + + if (machine->initrd_filename) { + uint32_t initrd_max; + gsize initrd_size; + gchar *initrd_data; + GError *gerr = NULL; + + if (!g_file_get_contents(machine->initrd_filename, &initrd_data, + &initrd_size, &gerr)) { + error_report("qemu: error reading initrd %s: %s\n", + machine->initrd_filename, gerr->message); + exit(1); + } + + initrd_max = mms->below_4g_mem_size - HIMEM_START; + if (initrd_size >= initrd_max) { + error_report("qemu: initrd is too large, cannot support." 
+ "(max: %"PRIu32", need %"PRId64")\n", + initrd_max, (uint64_t)initrd_size); + exit(1); + } + + address_space_write(&address_space_memory, + HIMEM_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) initrd_data, initrd_size); + + g_free(initrd_data); + + mms->initrd_addr = HIMEM_START; + mms->initrd_size = initrd_size; + } + + mms->elf_entry = pvh_get_start_addr(); +} + +static gchar *microvm_get_mmio_cmdline(gchar *name) +{ + gchar *cmdline; + gchar *separator; + long int index; + int ret; + + separator = g_strrstr(name, "."); + if (!separator) { + return NULL; + } + + if (qemu_strtol(separator + 1, NULL, 10, &index) != 0) { + return NULL; + } + + cmdline = g_malloc0(VIRTIO_CMDLINE_MAXLEN); + ret = g_snprintf(cmdline, VIRTIO_CMDLINE_MAXLEN, + " virtio_mmio.device=512@0x%lx:%ld", + VIRTIO_MMIO_BASE + index * 512, + VIRTIO_IRQ_BASE + index); + if (ret < 0 || ret >= VIRTIO_CMDLINE_MAXLEN) { + g_free(cmdline); + return NULL; + } + + return cmdline; +} + +static void microvm_setup_pvh(MicrovmMachineState *mms, + const gchar *kernel_cmdline) +{ + struct hvm_memmap_table_entry *memmap_table; + struct hvm_start_info *start_info; + BusState *bus; + BusChild *kid; + gchar *cmdline; + int cmdline_len; + int memmap_entries; + int i; + + cmdline = g_strdup(kernel_cmdline); + + /* + * Find MMIO transports with attached devices, and add them to the kernel + * command line. 
+ */ + bus = sysbus_get_default(); + QTAILQ_FOREACH(kid, &bus->children, sibling) { + DeviceState *dev = kid->child; + ObjectClass *class = object_get_class(OBJECT(dev)); + + if (class == object_class_by_name(TYPE_VIRTIO_MMIO)) { + VirtIOMMIOProxy *mmio = VIRTIO_MMIO(OBJECT(dev)); + VirtioBusState *mmio_virtio_bus = &mmio->bus; + BusState *mmio_bus = &mmio_virtio_bus->parent_obj; + + if (!QTAILQ_EMPTY(&mmio_bus->children)) { + gchar *mmio_cmdline = microvm_get_mmio_cmdline(mmio_bus->name); + if (mmio_cmdline) { + char *newcmd = g_strjoin(NULL, cmdline, mmio_cmdline, NULL); + g_free(mmio_cmdline); + g_free(cmdline); + cmdline = newcmd; + } + } + } + } + + cmdline_len = strlen(cmdline); + + address_space_write(&address_space_memory, + KERNEL_CMDLINE_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) cmdline, cmdline_len); + + g_free(cmdline); + + memmap_entries = e820_get_num_entries(); + memmap_table = g_new0(struct hvm_memmap_table_entry, memmap_entries); + for (i = 0; i < memmap_entries; i++) { + uint64_t address, length; + struct hvm_memmap_table_entry *entry = &memmap_table[i]; + + if (e820_get_entry(i, E820_RAM, &address, &length)) { + entry->addr = address; + entry->size = length; + entry->type = E820_RAM; + entry->reserved = 0; + } + } + + address_space_write(&address_space_memory, + MEMMAP_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) memmap_table, + memmap_entries * sizeof(struct hvm_memmap_table_entry)); + + g_free(memmap_table); + + start_info = g_malloc0(sizeof(struct hvm_start_info)); + + start_info->magic = XEN_HVM_START_MAGIC_VALUE; + start_info->version = 1; + + start_info->nr_modules = 0; + start_info->cmdline_paddr = KERNEL_CMDLINE_START; + start_info->memmap_entries = memmap_entries; + start_info->memmap_paddr = MEMMAP_START; + + if (mms->initrd_addr) { + struct hvm_modlist_entry *entry = g_new0(struct hvm_modlist_entry, 1); + + entry->paddr = mms->initrd_addr; + entry->size = mms->initrd_size; + + address_space_write(&address_space_memory, + 
MODLIST_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) entry, + sizeof(struct hvm_modlist_entry)); + g_free(entry); + + start_info->nr_modules = 1; + start_info->modlist_paddr = MODLIST_START; + } else { + start_info->nr_modules = 0; + } + + address_space_write(&address_space_memory, + PVH_START_INFO, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) start_info, + sizeof(struct hvm_start_info)); + + g_free(start_info); +} + +static void microvm_init_page_tables(void) +{ + uint64_t val = 0; + int i; + + val = PDPTE_START | 0x03; + address_space_write(&address_space_memory, + PML4_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) &val, 8); + val = PDE_START | 0x03; + address_space_write(&address_space_memory, + PDPTE_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) &val, 8); + + for (i = 0; i < 512; i++) { + val = (i << 21) + 0x83; + address_space_write(&address_space_memory, + PDE_START + (i * 8), MEMTXATTRS_UNSPECIFIED, + (uint8_t *) &val, 8); + } +} + +static void microvm_cpu_reset(CPUState *cs, uint64_t elf_entry) +{ + X86CPU *cpu = X86_CPU(cs); + CPUX86State *env = &cpu->env; + struct SegmentCache seg_code = { .selector = 0x8, + .base = 0x0, + .limit = 0xffffffff, + .flags = 0xc09b00 }; + struct SegmentCache seg_data = { .selector = 0x10, + .base = 0x0, + .limit = 0xffffffff, + .flags = 0xc09300 }; + struct SegmentCache seg_tr = { .selector = 0x18, + .base = 0x0, + .limit = 0xffff, + .flags = 0x8b00 }; + + memcpy(&env->segs[R_CS], &seg_code, sizeof(struct SegmentCache)); + memcpy(&env->segs[R_DS], &seg_data, sizeof(struct SegmentCache)); + memcpy(&env->segs[R_ES], &seg_data, sizeof(struct SegmentCache)); + memcpy(&env->segs[R_FS], &seg_data, sizeof(struct SegmentCache)); + memcpy(&env->segs[R_GS], &seg_data, sizeof(struct SegmentCache)); + memcpy(&env->segs[R_SS], &seg_data, sizeof(struct SegmentCache)); + memcpy(&env->tr, &seg_tr, sizeof(struct SegmentCache)); + + env->regs[R_EBX] = PVH_START_INFO; + + cpu_set_pc(cs, elf_entry); + cpu_x86_update_cr3(env, 0); + cpu_x86_update_cr4(env, 
0); + cpu_x86_update_cr0(env, CR0_PE_MASK); + + x86_update_hflags(env); +} + +static void microvm_mptable_setup(MicrovmMachineState *mms) +{ + char *mptable; + int size; + + mptable = mptable_generate(smp_cpus, EBDA_START, &size); + address_space_write(&address_space_memory, + EBDA_START, MEMTXATTRS_UNSPECIFIED, + (uint8_t *) mptable, size); + g_free(mptable); +} + +static bool microvm_machine_get_legacy(Object *obj, Error **errp) +{ + MicrovmMachineState *mms = MICROVM_MACHINE(obj); + + return mms->legacy; +} + +static void microvm_machine_set_legacy(Object *obj, bool value, Error **errp) +{ + MicrovmMachineState *mms = MICROVM_MACHINE(obj); + + mms->legacy = value; +} + +static void microvm_machine_reset(void) +{ + MachineState *machine = MACHINE(qdev_get_machine()); + MicrovmMachineState *mms = MICROVM_MACHINE(machine); + CPUState *cs; + X86CPU *cpu; + + qemu_devices_reset(); + + microvm_mptable_setup(mms); + microvm_setup_pvh(mms, machine->kernel_cmdline); + microvm_init_page_tables(); + + CPU_FOREACH(cs) { + cpu = X86_CPU(cs); + + if (cpu->apic_state) { + device_reset(cpu->apic_state); + } + + microvm_cpu_reset(cs, mms->elf_entry); + } +} + +static void x86_nmi(NMIState *n, int cpu_index, Error **errp) +{ + CPUState *cs; + + CPU_FOREACH(cs) { + X86CPU *cpu = X86_CPU(cs); + + if (!cpu->apic_state) { + cpu_interrupt(cs, CPU_INTERRUPT_NMI); + } else { + apic_deliver_nmi(cpu->apic_state); + } + } +} + +static void microvm_class_init(ObjectClass *oc, void *data) +{ + MachineClass *mc = MACHINE_CLASS(oc); + NMIClass *nc = NMI_CLASS(oc); + + mc->init = microvm_machine_state_init; + + mc->family = "microvm_i386"; + mc->desc = "Microvm (i386)"; + mc->units_per_default_bus = 1; + mc->no_floppy = 1; + machine_class_allow_dynamic_sysbus_dev(mc, "sysbus-debugcon"); + machine_class_allow_dynamic_sysbus_dev(mc, "sysbus-debugexit"); + mc->max_cpus = 288; + mc->has_hotpluggable_cpus = false; + mc->auto_enable_numa_with_memhp = false; + mc->default_cpu_type = 
X86_CPU_TYPE_NAME("host"); + mc->nvdimm_supported = false; + mc->default_machine_opts = "accel=kvm"; + + /* Machine class handlers */ + mc->reset = microvm_machine_reset; + + /* NMI handler */ + nc->nmi_monitor_handler = x86_nmi; + + object_class_property_add_bool(oc, MICROVM_MACHINE_LEGACY, + microvm_machine_get_legacy, + microvm_machine_set_legacy, + &error_abort); +} + +static const TypeInfo microvm_machine_info = { + .name = TYPE_MICROVM_MACHINE, + .parent = TYPE_MACHINE, + .instance_size = sizeof(MicrovmMachineState), + .class_size = sizeof(MicrovmMachineClass), + .class_init = microvm_class_init, + .interfaces = (InterfaceInfo[]) { + { TYPE_NMI }, + { } + }, +}; + +static void microvm_machine_init(void) +{ + type_register_static(&microvm_machine_info); +} +type_init(microvm_machine_init); diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h new file mode 100644 index 0000000000..fd6f370997 --- /dev/null +++ b/include/hw/i386/microvm.h @@ -0,0 +1,82 @@ +/* + * Copyright (c) 2018 Intel Corporation + * Copyright (c) 2019 Red Hat, Inc. + * + * This program is free software; you can redistribute it and/or modify it + * under the terms and conditions of the GNU General Public License, + * version 2 or later, as published by the Free Software Foundation. + * + * This program is distributed in the hope it will be useful, but WITHOUT + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or + * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for + * more details. + * + * You should have received a copy of the GNU General Public License along with + * this program. If not, see <http://www.gnu.org/licenses/>. 
+ */ + +#ifndef HW_I386_MICROVM_H +#define HW_I386_MICROVM_H + +#include "qemu-common.h" +#include "exec/hwaddr.h" +#include "qemu/notify.h" + +#include "hw/boards.h" + +/* Microvm memory layout */ +#define PVH_START_INFO 0x6000 +#define MEMMAP_START 0x7000 +#define MODLIST_START 0x7800 +#define BOOT_STACK_POINTER 0x8ff0 +#define PML4_START 0x9000 +#define PDPTE_START 0xa000 +#define PDE_START 0xb000 +#define KERNEL_CMDLINE_START 0x20000 +#define EBDA_START 0x9fc00 +#define HIMEM_START 0x100000 +#define MICROVM_MAX_BELOW_4G 0xe0000000 + +/* Platform virtio definitions */ +#define VIRTIO_MMIO_BASE 0xd0000000 +#define VIRTIO_IRQ_BASE 5 +#define VIRTIO_NUM_TRANSPORTS 8 +#define VIRTIO_CMDLINE_MAXLEN 64 + +/* Machine type options */ +#define MICROVM_MACHINE_LEGACY "legacy" + +typedef struct { + MachineClass parent; + HotplugHandler *(*orig_hotplug_handler)(MachineState *machine, + DeviceState *dev); +} MicrovmMachineClass; + +typedef struct { + MachineState parent; + qemu_irq *gsi; + + /* RAM size */ + ram_addr_t below_4g_mem_size; + ram_addr_t above_4g_mem_size; + + /* Kernel ELF entry. On reset, vCPUs RIP will be set to this */ + uint64_t elf_entry; + + /* Optional initrd start address and size */ + uint64_t initrd_addr; + uint32_t initrd_size; + + /* Legacy mode based on an ISA bus. Useful for debugging */ + bool legacy; +} MicrovmMachineState; + +#define TYPE_MICROVM_MACHINE MACHINE_TYPE_NAME("microvm") +#define MICROVM_MACHINE(obj) \ + OBJECT_CHECK(MicrovmMachineState, (obj), TYPE_MICROVM_MACHINE) +#define MICROVM_MACHINE_GET_CLASS(obj) \ + OBJECT_GET_CLASS(MicrovmMachineClass, obj, TYPE_MICROVM_MACHINE) +#define MICROVM_MACHINE_CLASS(class) \ + OBJECT_CLASS_CHECK(MicrovmMachineClass, class, TYPE_MICROVM_MACHINE) + +#endif -- 2.21.0 ^ permalink raw reply related [flat|nested] 68+ messages in thread
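As a reading aid for microvm_get_mmio_cmdline() in the patch above: each virtio-mmio transport that has a device attached is announced to the guest through a virtio_mmio.device=<size>@<base>:<irq> kernel parameter. The following is a minimal standalone sketch of that computation, assuming the constants from microvm.h. The helper name format_virtio_mmio_param and the VIRTIO_MMIO_SIZE macro are made up for illustration and are not part of the patch or of QEMU.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Constants copied from include/hw/i386/microvm.h in this patch. */
#define VIRTIO_MMIO_BASE      0xd0000000ULL
#define VIRTIO_IRQ_BASE       5
#define VIRTIO_NUM_TRANSPORTS 8
/* Assumption: named here for readability; the patch hard-codes 512. */
#define VIRTIO_MMIO_SIZE      512

/*
 * Hypothetical helper (illustration only): write the
 * " virtio_mmio.device=<size>@<base>:<irq>" fragment for transport
 * `index` into `buf`. Returns the string length, or -1 if the index is
 * out of range or the buffer is too small.
 */
static int format_virtio_mmio_param(unsigned index, char *buf, size_t len)
{
    uint64_t base;
    int ret;

    if (index >= VIRTIO_NUM_TRANSPORTS) {
        return -1;
    }

    /* Transports are laid out back to back, 512 bytes apart. */
    base = VIRTIO_MMIO_BASE + (uint64_t)index * VIRTIO_MMIO_SIZE;

    ret = snprintf(buf, len, " virtio_mmio.device=%d@0x%" PRIx64 ":%u",
                   VIRTIO_MMIO_SIZE, base, VIRTIO_IRQ_BASE + index);
    if (ret < 0 || (size_t)ret >= len) {
        return -1;
    }
    return ret;
}
```

With the values in this patch, transport 0 maps to 512@0xd0000000:5 and transport 7 to 512@0xd0000e00:12, which lines up with the MMIO regions listed in the cover letter.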
* Re: [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type
  2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type Sergio Lopez
@ 2019-07-02 13:58   ` Gerd Hoffmann
  2019-07-25 10:47   ` Paolo Bonzini
  1 sibling, 0 replies; 68+ messages in thread
From: Gerd Hoffmann @ 2019-07-02 13:58 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, maran.wilson, mst, qemu-devel, pbonzini, sgarzare, rth

  Hi,

> +#define MICROVM_MAX_BELOW_4G 0xe0000000
> +
> +/* Platform virtio definitions */
> +#define VIRTIO_MMIO_BASE 0xd0000000

That isn't going to fly ...

I'd also suggest adding a microvm.txt file to docs/ with a specification
(I/O memory, I/O ports, memory layout in PVH mode, in firmware mode, ...)
and usage information. Cutting and pasting the bits sprinkled all over
the commit messages and cover letter would be a good start.

cheers,
  Gerd

^ permalink raw reply [flat|nested] 68+ messages in thread
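One plausible reading of the objection to the two quoted defines, which the message itself does not spell out, is that low RAM may extend up to 0xe0000000 while the virtio-mmio window starts at 0xd0000000, so a large enough guest would have RAM and MMIO at the same addresses. The sketch below reuses the split rule from microvm_memory_init() to show when that happens. The function names are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Constants copied from include/hw/i386/microvm.h in this patch. */
#define MICROVM_MAX_BELOW_4G 0xe0000000ULL
#define VIRTIO_MMIO_BASE     0xd0000000ULL

/*
 * Same rule as microvm_memory_init(): RAM below 4G is capped at
 * MICROVM_MAX_BELOW_4G, the remainder is mapped above 4G.
 */
static uint64_t below_4g_size(uint64_t ram_size)
{
    return ram_size > MICROVM_MAX_BELOW_4G ? MICROVM_MAX_BELOW_4G : ram_size;
}

/*
 * Hypothetical check (illustration only): low RAM covers [0, below_4g),
 * so it intersects the virtio-mmio window exactly when it ends past
 * VIRTIO_MMIO_BASE.
 */
static bool ram_overlaps_virtio_mmio(uint64_t ram_size)
{
    return below_4g_size(ram_size) > VIRTIO_MMIO_BASE;
}
```

Under these assumptions a 512 MiB guest is unaffected, while any guest with more than 0xd0000000 bytes (3.25 GiB) of RAM would overlap the first transport.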
* Re: [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type
  2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type Sergio Lopez
  2019-07-02 13:58   ` Gerd Hoffmann
@ 2019-07-25 10:47   ` Paolo Bonzini
  1 sibling, 0 replies; 68+ messages in thread
From: Paolo Bonzini @ 2019-07-25 10:47 UTC (permalink / raw)
  To: Sergio Lopez, mst, marcel.apfelbaum, rth, ehabkost, maran.wilson,
	sgarzare, kraxel
  Cc: qemu-devel

On 02/07/19 14:11, Sergio Lopez wrote:
> +static void microvm_ioapic_init(MicrovmMachineState *mms)
> +{
> +    qemu_irq *ioapic_irq;
> +    DeviceState *ioapic_dev;
> +    SysBusDevice *d;
> +    int i;
> +
> +    assert(kvm_irqchip_in_kernel());
> +    ioapic_irq = g_new0(qemu_irq, IOAPIC_NUM_PINS);
> +    kvm_pc_setup_irq_routing(true);
> +
> +    assert(kvm_ioapic_in_kernel());
> +    ioapic_dev = qdev_create(NULL, "kvm-ioapic");
> +
> +    object_property_add_child(qdev_get_machine(),
> +                              "ioapic", OBJECT(ioapic_dev), NULL);

Please use the userspace IOAPIC instead; using the kernel one is just
sweeping the attack surface under the rug.

You are also missing the LAPIC device; things are working only because
KVM is helpfully creating one for you, but it's better to be precise in
the description of the hardware.

I'd like to have support for non-KVM accelerators (TCG, HAX, HVF), and
they should all come for free if you support "-machine kernel_irqchip=off".

Finally, I think we can agree that legacy mode can go away. At the same
time I think we should always have:

1) an ISA bus, even if it's mostly empty, since we have fw_cfg on it now

2) an optional RTC accessible via "-machine rtc=on|off", so that the
guest can know the current time even if it is running under an
accelerator other than KVM (or doesn't have access to kvmclock).

3) possibly, a fake "keyboard controller" device to support reset via
port 64h

Thanks!
Paolo

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
  2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez
                   ` (3 preceding siblings ...)
  2019-07-02 12:11 ` [Qemu-devel] [PATCH v3 4/4] hw/i386: Introduce the microvm machine type Sergio Lopez
@ 2019-07-02 15:01 ` no-reply
  2019-07-02 15:23 ` Peter Maydell
                   ` (3 subsequent siblings)
  8 siblings, 0 replies; 68+ messages in thread
From: no-reply @ 2019-07-02 15:01 UTC (permalink / raw)
  To: slp
  Cc: ehabkost, slp, maran.wilson, mst, qemu-devel, kraxel, pbonzini,
	sgarzare, rth

Patchew URL: https://patchew.org/QEMU/20190702121106.28374-1-slp@redhat.com/

Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Subject: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
Message-id: 20190702121106.28374-1-slp@redhat.com

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 - [tag update]      patchew/20190702113414.6896-1-armbru@redhat.com -> patchew/20190702113414.6896-1-armbru@redhat.com
Switched to a new branch 'test'
8ebe540 hw/i386: Introduce the microvm machine type
ac71c2a hw/i386: Factorize PVH related functions
faeccbd hw/i386: Add an Intel MPTable generator
7540b93 hw/virtio: Factorize virtio-mmio headers

=== OUTPUT BEGIN ===
1/4 Checking commit 7540b9358a0f (hw/virtio: Factorize virtio-mmio headers)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#66:
new file mode 100644

total: 0 errors, 1 warnings, 105 lines checked

Patch 1/4 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/4 Checking commit faeccbd2c589 (hw/i386: Add an Intel MPTable generator)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#16:
new file mode 100644

total: 0 errors, 1 warnings, 374 lines checked

Patch 2/4 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
3/4 Checking commit ac71c2af3972 (hw/i386: Factorize PVH related functions)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#186:
new file mode 100644

ERROR: do not initialise statics to 0 or NULL
#210: FILE: hw/i386/pvh.c:20:
+static size_t pvh_start_addr = 0;

total: 1 errors, 1 warnings, 281 lines checked

Patch 3/4 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
4/4 Checking commit 8ebe540c4430 (hw/i386: Introduce the microvm machine type)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#67:
new file mode 100644

ERROR: Error messages should not contain newlines
#291: FILE: hw/i386/microvm.c:220:
+            error_report("qemu: error reading initrd %s: %s\n",

ERROR: Error messages should not contain newlines
#299: FILE: hw/i386/microvm.c:228:
+                         "(max: %"PRIu32", need %"PRId64")\n",

total: 2 errors, 1 warnings, 653 lines checked

Patch 4/4 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1

The full log is available at
http://patchew.org/logs/20190702121106.28374-1-slp@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply [flat|nested] 68+ messages in thread
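For readers unfamiliar with the two checkpatch ERROR classes reported above, here is a small hedged sketch of what they are about. Objects with static storage duration are zero-initialised by the C language, so the explicit "= 0" flagged in hw/i386/pvh.c is redundant. QEMU's error_report() appends its own newline, so a trailing "\n" in the format string is redundant as well. The helper message_has_newline is hypothetical and only mimics the kind of check the script performs; it is not how checkpatch.pl is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Flagged form:   static size_t pvh_start_addr = 0;
 * Accepted form:  no initialiser; the C standard guarantees that a
 * static object without one starts out as zero.
 */
static size_t pvh_start_addr;

/*
 * Hypothetical helper (illustration only): returns non-zero if an error
 * message format string contains a newline, which is the pattern the
 * "Error messages should not contain newlines" rule rejects.
 */
static int message_has_newline(const char *fmt)
{
    return strchr(fmt, '\n') != NULL;
}
```

Applied to the patch, dropping the "\n" from the two error_report() format strings in hw/i386/microvm.c and the "= 0" in hw/i386/pvh.c would presumably be enough to clear the three errors; the MAINTAINERS warnings are a separate matter.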
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez ` (4 preceding siblings ...) 2019-07-02 15:01 ` [Qemu-devel] [PATCH v3 0/4] " no-reply @ 2019-07-02 15:23 ` Peter Maydell 2019-07-02 17:34 ` Sergio Lopez 2019-07-02 15:30 ` no-reply ` (2 subsequent siblings) 8 siblings, 1 reply; 68+ messages in thread From: Peter Maydell @ 2019-07-02 15:23 UTC (permalink / raw) To: Sergio Lopez Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Tue, 2 Jul 2019 at 13:14, Sergio Lopez <slp@redhat.com> wrote: > > Microvm is a machine type inspired by both NEMU and Firecracker, and > constructed after the machine model implemented by the latter. > > It's main purpose is providing users a KVM-only machine type with fast > boot times, minimal attack surface (measured as the number of IO ports > and MMIO regions exposed to the Guest) and small footprint (specially > when combined with the ongoing QEMU modularization effort). > > Normally, other than the device support provided by KVM itself, > microvm only supports virtio-mmio devices. Microvm also includes a > legacy mode, which adds an ISA bus with a 16550A serial port, useful > for being able to see the early boot kernel messages. Could we use virtio-pci instead of virtio-mmio? virtio-mmio is a bit deprecated and tends not to support all the features that virtio-pci does. It was introduced mostly as a stopgap while we didn't have pci support in the aarch64 virt machine, and remains for legacy "we don't like to break existing working setups" rather than as a recommended config for new systems. thanks -- PMM ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Sergio Lopez @ 2019-07-02 17:34 UTC
To: Peter Maydell
Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson

Peter Maydell <peter.maydell@linaro.org> writes:

> On Tue, 2 Jul 2019 at 13:14, Sergio Lopez <slp@redhat.com> wrote:
>>
>> Microvm is a machine type inspired by both NEMU and Firecracker, and
>> constructed after the machine model implemented by the latter.
>>
>> It's main purpose is providing users a KVM-only machine type with fast
>> boot times, minimal attack surface (measured as the number of IO ports
>> and MMIO regions exposed to the Guest) and small footprint (specially
>> when combined with the ongoing QEMU modularization effort).
>>
>> Normally, other than the device support provided by KVM itself,
>> microvm only supports virtio-mmio devices. Microvm also includes a
>> legacy mode, which adds an ISA bus with a 16550A serial port, useful
>> for being able to see the early boot kernel messages.
>
> Could we use virtio-pci instead of virtio-mmio? virtio-mmio is
> a bit deprecated and tends not to support all the features that
> virtio-pci does. It was introduced mostly as a stopgap while we
> didn't have pci support in the aarch64 virt machine, and remains
> for legacy "we don't like to break existing working setups" rather
> than as a recommended config for new systems.

Using virtio-pci implies keeping PCI and ACPI support, defeating a
significant part of microvm's purpose.

What are the issues with the current state of virtio-mmio? Is there a
way I can help to improve the situation?

Sergio.
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Peter Maydell @ 2019-07-02 18:04 UTC
To: Sergio Lopez
Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson

On Tue, 2 Jul 2019 at 18:34, Sergio Lopez <slp@redhat.com> wrote:
> Peter Maydell <peter.maydell@linaro.org> writes:
> > Could we use virtio-pci instead of virtio-mmio? virtio-mmio is
> > a bit deprecated and tends not to support all the features that
> > virtio-pci does. It was introduced mostly as a stopgap while we
> > didn't have pci support in the aarch64 virt machine, and remains
> > for legacy "we don't like to break existing working setups" rather
> > than as a recommended config for new systems.
>
> Using virtio-pci implies keeping PCI and ACPI support, defeating a
> significant part of microvm's purpose.
>
> What are the issues with the current state of virtio-mmio? Is there a
> way I can help to improve the situation?

Off the top of my head:
 * limitations on numbers of devices
 * no hotplug support
 * unlike PCI, it's not probeable, so you have to tell the
   guest where all the transports are using device tree or
   some similar mechanism
 * you need one IRQ line per transport, which restricts how
   many you can have
 * it's only virtio-0.9, it doesn't support any of the new
   virtio-1.0 functionality
 * it is broadly not really maintained in QEMU (and I think
   not really in the kernel either? not sure), because we'd
   rather not have to maintain two mechanisms for doing virtio
   when virtio-pci is clearly better than virtio-mmio

thanks
-- PMM
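The "not probeable" and "one IRQ line per transport" points above can be made concrete with the Linux kernel's `virtio_mmio.device=<size>@<baseaddr>:<irq>` command-line parameter, which is how a guest without device tree or ACPI is told where each transport lives. The sketch below builds such a string. The base address (0xd0000000) and per-transport size (0x200) are taken from the memory map in the cover letter; the first IRQ number and the idea that IRQs are assigned consecutively are assumptions made for illustration, and build_virtio_mmio_cmdline() is a hypothetical helper, not a function from the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* From the memory map in the cover letter. */
#define VIRTIO_MMIO_BASE 0xd0000000u
#define VIRTIO_MMIO_SIZE 0x200u      /* 512 bytes per transport */

/* ASSUMPTION: first IRQ line; one consecutive line per transport. */
#define VIRTIO_IRQ_BASE  5u

/*
 * Write "virtio_mmio.device=<size>@<baseaddr>:<irq>" for each of the
 * num transports into buf, separated by spaces.
 * Returns the number of characters written, or -1 if buf is too small.
 */
int build_virtio_mmio_cmdline(char *buf, size_t buflen, unsigned num)
{
    size_t used = 0;
    unsigned i;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';

    for (i = 0; i < num; i++) {
        uint32_t base = VIRTIO_MMIO_BASE + i * VIRTIO_MMIO_SIZE;
        int n = snprintf(buf + used, buflen - used,
                         "%svirtio_mmio.device=%u@0x%x:%u",
                         i ? " " : "", VIRTIO_MMIO_SIZE,
                         (unsigned)base, VIRTIO_IRQ_BASE + i);

        /* snprintf reports the length it wanted; detect truncation. */
        if (n < 0 || (size_t)n >= buflen - used) {
            return -1;
        }
        used += (size_t)n;
    }
    return (int)used;
}
```

Every transport consumes one entry on the command line and one interrupt line, which is where the limit on the number of devices comes from. The guest kernel also has to be built with support for command-line virtio-mmio devices for the parameter to be honoured.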
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 18:04 ` Peter Maydell @ 2019-07-02 22:04 ` Sergio Lopez 2019-07-25 9:59 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-02 22:04 UTC (permalink / raw) To: Peter Maydell Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Tue, Jul 02, 2019 at 07:04:15PM +0100, Peter Maydell wrote: > On Tue, 2 Jul 2019 at 18:34, Sergio Lopez <slp@redhat.com> wrote: > > Peter Maydell <peter.maydell@linaro.org> writes: > > > Could we use virtio-pci instead of virtio-mmio? virtio-mmio is > > > a bit deprecated and tends not to support all the features that > > > virtio-pci does. It was introduced mostly as a stopgap while we > > > didn't have pci support in the aarch64 virt machine, and remains > > > for legacy "we don't like to break existing working setups" rather > > > than as a recommended config for new systems. > > > > Using virtio-pci implies keeping PCI and ACPI support, defeating a > > significant part of microvm's purpose. > > > > What are the issues with the current state of virtio-mmio? Is there a > > way I can help to improve the situation? > > Off the top of my head: > * limitations on numbers of devices > * no hotplug support > * unlike PCI, it's not probeable, so you have to tell the > guest where all the transports are using device tree or > some similar mechanism > * you need one IRQ line per transport, which restricts how > many you can have > * it's only virtio-0.9, it doesn't support any of the new > virtio-1.0 functionality > * it is broadly not really maintained in QEMU (and I think > not really in the kernel either? not sure), because we'd > rather not have to maintain two mechanisms for doing virtio > when virtio-pci is clearly better than virtio-mmio Some of these are design issues, but others can be improved with a bit of work. 
As for the maintenance burden, I volunteer myself to help with that, so it won't have an impact on other developers and/or projects. Sergio. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 22:04 ` Sergio Lopez @ 2019-07-25 9:59 ` Michael S. Tsirkin 2019-07-25 10:05 ` Peter Maydell 0 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 9:59 UTC (permalink / raw) To: Sergio Lopez Cc: Peter Maydell, Eduardo Habkost, maran.wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Wed, Jul 03, 2019 at 12:04:00AM +0200, Sergio Lopez wrote: > On Tue, Jul 02, 2019 at 07:04:15PM +0100, Peter Maydell wrote: > > On Tue, 2 Jul 2019 at 18:34, Sergio Lopez <slp@redhat.com> wrote: > > > Peter Maydell <peter.maydell@linaro.org> writes: > > > > Could we use virtio-pci instead of virtio-mmio? virtio-mmio is > > > > a bit deprecated and tends not to support all the features that > > > > virtio-pci does. It was introduced mostly as a stopgap while we > > > > didn't have pci support in the aarch64 virt machine, and remains > > > > for legacy "we don't like to break existing working setups" rather > > > > than as a recommended config for new systems. > > > > > > Using virtio-pci implies keeping PCI and ACPI support, defeating a > > > significant part of microvm's purpose. > > > > > > What are the issues with the current state of virtio-mmio? Is there a > > > way I can help to improve the situation? > > > > Off the top of my head: > > * limitations on numbers of devices > > * no hotplug support > > * unlike PCI, it's not probeable, so you have to tell the > > guest where all the transports are using device tree or > > some similar mechanism > > * you need one IRQ line per transport, which restricts how > > many you can have > > * it's only virtio-0.9, it doesn't support any of the new > > virtio-1.0 functionality > > * it is broadly not really maintained in QEMU (and I think > > not really in the kernel either? 
not sure), because we'd > > rather not have to maintain two mechanisms for doing virtio > > when virtio-pci is clearly better than virtio-mmio > > Some of these are design issues, but others can be improved with a bit > of work. > > As for the maintenance burden, I volunteer myself to help with that, so > it won't have an impact on other developers and/or projects. > > Sergio. OK so please start with adding virtio 1 support. Guest bits have been ready for years now. -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 9:59 ` Michael S. Tsirkin @ 2019-07-25 10:05 ` Peter Maydell 2019-07-25 10:10 ` Michael S. Tsirkin 2019-07-25 10:42 ` Sergio Lopez 0 siblings, 2 replies; 68+ messages in thread From: Peter Maydell @ 2019-07-25 10:05 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Eduardo Habkost, Sergio Lopez, maran.wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: > OK so please start with adding virtio 1 support. Guest bits > have been ready for years now. I'd still rather we just used pci virtio. If pci isn't fast enough at startup, do something to make it faster... thanks -- PMM ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 10:05 ` Peter Maydell @ 2019-07-25 10:10 ` Michael S. Tsirkin 2019-07-25 14:52 ` Sergio Lopez 2019-07-25 10:42 ` Sergio Lopez 1 sibling, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 10:10 UTC (permalink / raw) To: Peter Maydell Cc: Eduardo Habkost, Sergio Lopez, maran.wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 11:05:05AM +0100, Peter Maydell wrote: > On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: > > OK so please start with adding virtio 1 support. Guest bits > > have been ready for years now. > > I'd still rather we just used pci virtio. If pci isn't > fast enough at startup, do something to make it faster... > > thanks > -- PMM Oh that's putting microvm aside - if we have a maintainer for virtio mmio that's great because it does need a maintainer, and virtio 1 would be the thing to fix before adding features ;) -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Sergio Lopez @ 2019-07-25 14:52 UTC
To: Michael S. Tsirkin
Cc: Peter Maydell, Eduardo Habkost, maran.wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson

Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Jul 25, 2019 at 11:05:05AM +0100, Peter Maydell wrote:
>> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote:
>> > OK so please start with adding virtio 1 support. Guest bits
>> > have been ready for years now.
>>
>> I'd still rather we just used pci virtio. If pci isn't
>> fast enough at startup, do something to make it faster...
>>
>> thanks
>> -- PMM
>
> Oh that's putting microvm aside - if we have a maintainer for
> virtio mmio that's great because it does need a maintainer,
> and virtio 1 would be the thing to fix before adding features ;)

There seems to be a general consensus that virtio-mmio needs some care,
and looking at the specs, implementing virtio-mmio v2/virtio v1
shouldn't be too time consuming, so I'm going to give it a try.

Cheers,
Sergio.
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Sergio Lopez @ 2019-07-25 10:42 UTC
To: Peter Maydell
Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson

Peter Maydell <peter.maydell@linaro.org> writes:

> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote:
>> OK so please start with adding virtio 1 support. Guest bits
>> have been ready for years now.
>
> I'd still rather we just used pci virtio. If pci isn't
> fast enough at startup, do something to make it faster...

Actually, removing PCI (and ACPI), is one of the main ways microvm has
to reduce not only boot time, but also the exposed surface and the
general footprint.

I think we need to discuss and settle whether using virtio-mmio (even if
maintained and upgraded to virtio 1) for a new machine type is
acceptable or not. Because if it isn't, we should probably just ditch
the whole microvm idea and move to something else.

Sergio.
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 10:42 ` Sergio Lopez @ 2019-07-25 11:23 ` Paolo Bonzini 2019-07-25 12:01 ` Stefan Hajnoczi 0 siblings, 1 reply; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 11:23 UTC (permalink / raw) To: Sergio Lopez, Peter Maydell Cc: Eduardo Habkost, maran.wilson, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Richard Henderson, Stefano Garzarella On 25/07/19 12:42, Sergio Lopez wrote: > > Peter Maydell <peter.maydell@linaro.org> writes: > >> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: >>> OK so please start with adding virtio 1 support. Guest bits >>> have been ready for years now. >> >> I'd still rather we just used pci virtio. If pci isn't >> fast enough at startup, do something to make it faster... > > Actually, removing PCI (and ACPI), is one of the main ways microvm has > to reduce not only boot time, but also the exposed surface and the > general footprint. > > I think we need to discuss and settle whether using virtio-mmio (even if > maintained and upgraded to virtio 1) for a new machine type is > acceptable or not. Because if it isn't, we should probably just ditch > the whole microvm idea and move to something else. I agree. IMNSHO the reduced attack surface from removing PCI is (mostly) security theater, however the boot time numbers that Sergio showed for microvm are quite extreme and I don't think there is any hope of getting even close with a PCI-based virtual machine. So I'd even go a step further: if using virtio-mmio for a new machine type is not acceptable, we should admit that boot time optimization in QEMU is basically as good as it can get---low-hanging fruit has been picked with PVH and mmap is the logical next step, but all that's left is optimizing the guest or something else. 
I must say that -M microvm took a while to grow on me, but I think it's a great example of how the infrastructure provided by QEMU provides useful features for free, even for the simplest emulated hardware. For example, in v3 microvm could only boot from PVH kernels, but the next firmware-enabled version reuses more of the PC code and thus supports all of vmlinuz, multiboot and PVH. Again: Sergio has been very receptive to feedback and has provided numbers to back the design choices, and we should reciprocate or at least be very clear on the constraints. Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 11:23 ` Paolo Bonzini @ 2019-07-25 12:01 ` Stefan Hajnoczi 2019-07-25 12:10 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Stefan Hajnoczi @ 2019-07-25 12:01 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Michael S. Tsirkin, Maran Wilson, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 12:23 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > On 25/07/19 12:42, Sergio Lopez wrote: > > Peter Maydell <peter.maydell@linaro.org> writes: > >> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: > >>> OK so please start with adding virtio 1 support. Guest bits > >>> have been ready for years now. > >> > >> I'd still rather we just used pci virtio. If pci isn't > >> fast enough at startup, do something to make it faster... > > > > Actually, removing PCI (and ACPI), is one of the main ways microvm has > > to reduce not only boot time, but also the exposed surface and the > > general footprint. > > > > I think we need to discuss and settle whether using virtio-mmio (even if > > maintained and upgraded to virtio 1) for a new machine type is > > acceptable or not. Because if it isn't, we should probably just ditch > > the whole microvm idea and move to something else. > > I agree. IMNSHO the reduced attack surface from removing PCI is > (mostly) security theater, however the boot time numbers that Sergio > showed for microvm are quite extreme and I don't think there is any hope > of getting even close with a PCI-based virtual machine. > > So I'd even go a step further: if using virtio-mmio for a new machine > type is not acceptable, we should admit that boot time optimization in > QEMU is basically as good as it can get---low-hanging fruit has been > picked with PVH and mmap is the logical next step, but all that's left > is optimizing the guest or something else. 

I haven't seen enough analysis to declare boot time optimization done.
QEMU startup can be profiled and improved.

The numbers show that removing PCI and ACPI makes things faster but
this doesn't justify removing them. Understanding of why they are
slow is what justifies removing them. Otherwise it could just be a
misconfiguration, inefficient implementation, etc. and we've seen there
is low-hanging fruit.

How much time is spent doing PCI initialization? Is the vmexit
pattern for PCI initialization as good as the hardware interface
allows?

Without an analysis of why things are slow it's not possible to come to
an informed decision.

Stefan
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 12:01 ` Stefan Hajnoczi @ 2019-07-25 12:10 ` Michael S. Tsirkin 2019-07-25 13:26 ` Stefan Hajnoczi 0 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 12:10 UTC (permalink / raw) To: Stefan Hajnoczi Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 01:01:29PM +0100, Stefan Hajnoczi wrote: > On Thu, Jul 25, 2019 at 12:23 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > > On 25/07/19 12:42, Sergio Lopez wrote: > > > Peter Maydell <peter.maydell@linaro.org> writes: > > >> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: > > >>> OK so please start with adding virtio 1 support. Guest bits > > >>> have been ready for years now. > > >> > > >> I'd still rather we just used pci virtio. If pci isn't > > >> fast enough at startup, do something to make it faster... > > > > > > Actually, removing PCI (and ACPI), is one of the main ways microvm has > > > to reduce not only boot time, but also the exposed surface and the > > > general footprint. > > > > > > I think we need to discuss and settle whether using virtio-mmio (even if > > > maintained and upgraded to virtio 1) for a new machine type is > > > acceptable or not. Because if it isn't, we should probably just ditch > > > the whole microvm idea and move to something else. > > > > I agree. IMNSHO the reduced attack surface from removing PCI is > > (mostly) security theater, however the boot time numbers that Sergio > > showed for microvm are quite extreme and I don't think there is any hope > > of getting even close with a PCI-based virtual machine. 
> >
> > So I'd even go a step further: if using virtio-mmio for a new machine
> > type is not acceptable, we should admit that boot time optimization in
> > QEMU is basically as good as it can get---low-hanging fruit has been
> > picked with PVH and mmap is the logical next step, but all that's left
> > is optimizing the guest or something else.
>
> I haven't seen enough analysis to declare boot time optimization done.
> QEMU startup can be profiled and improved.

Right, and that will always stay the case.

OTOH imho microvm is non-intrusive enough, and small enough, that we'd
just put it upstream after addressing low-level comments. This will
allow more contributions from people interested in boot time. With no
cross-version migration support, or maybe migration disabled completely,
the maintenance burden should not be too high.

Not everyone wants to hack on pci/acpi specifically.

> The numbers show that removing PCI and ACPI makes things faster but
> this doesn't justify removing them. Understanding of why they are
> slow is what justifies removing them. Otherwise it could just be a
> misconfiguration, inefficient implementation, etc and we've seen there
> is low-hanging fruit.
>
> How much time is spent doing PCI initialization? Is the vmexit
> pattern for PCI initialization as good as the hardware interface
> allows?

I know in the bios we have wanted to use memory-mapped pci config
accesses for a very long time now. This makes each vmexit slower but
cuts the number of exits by half.

Only affects seabios though.

> Without an analysis of why things are slow it's not possible come to
> an informed decision.
>
> Stefan
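The "cuts the number of exits by half" remark above follows from how the two PCI configuration access methods work: the legacy port-based mechanism needs a write to the address port (0xcf8) followed by an access to the data port (0xcfc), while a memory-mapped (ECAM/MMCONFIG) access reaches the register in a single load or store. The sketch below shows the address encoding for both and the resulting access counts. It assumes that every such port or MMIO access traps to the hypervisor, which is a simplification; the function names are illustrative and not taken from SeaBIOS or QEMU.

```c
#include <assert.h>
#include <stdint.h>

#define PCI_CONFIG_ADDRESS_PORT 0xcf8u  /* address register */
#define PCI_CONFIG_DATA_PORT    0xcfcu  /* data window */

/*
 * Value written to port 0xcf8 to select a config register before the
 * data is read or written through port 0xcfc (two I/O accesses in total).
 */
uint32_t pci_cf8_address(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 256);
    return 0x80000000u | (bus << 16) | (dev << 11) | (fn << 8) | (reg & 0xfcu);
}

/*
 * Byte offset of the same register inside a memory-mapped (ECAM) window:
 * one MMIO access reaches the register directly.
 */
uint32_t pci_ecam_offset(unsigned bus, unsigned dev, unsigned fn, unsigned reg)
{
    assert(bus < 256 && dev < 32 && fn < 8 && reg < 4096);
    return (bus << 20) | (dev << 15) | (fn << 12) | reg;
}

/* ASSUMPTION: each port or MMIO access below causes exactly one exit. */
unsigned exits_port_io(unsigned accesses)
{
    return 2 * accesses;    /* address write + data access */
}

unsigned exits_ecam(unsigned accesses)
{
    return accesses;        /* single memory-mapped access */
}
```

As the message notes, each memory-mapped exit is more expensive to handle than a port I/O exit, so halving the count does not by itself halve the time spent.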
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 12:10 ` Michael S. Tsirkin @ 2019-07-25 13:26 ` Stefan Hajnoczi 2019-07-25 13:43 ` Paolo Bonzini 2019-07-25 13:48 ` Michael S. Tsirkin 0 siblings, 2 replies; 68+ messages in thread From: Stefan Hajnoczi @ 2019-07-25 13:26 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 1:10 PM Michael S. Tsirkin <mst@redhat.com> wrote: > On Thu, Jul 25, 2019 at 01:01:29PM +0100, Stefan Hajnoczi wrote: > > On Thu, Jul 25, 2019 at 12:23 PM Paolo Bonzini <pbonzini@redhat.com> wrote: > > > On 25/07/19 12:42, Sergio Lopez wrote: > > > > Peter Maydell <peter.maydell@linaro.org> writes: > > > >> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote: > > > >>> OK so please start with adding virtio 1 support. Guest bits > > > >>> have been ready for years now. > > > >> > > > >> I'd still rather we just used pci virtio. If pci isn't > > > >> fast enough at startup, do something to make it faster... > > > > > > > > Actually, removing PCI (and ACPI), is one of the main ways microvm has > > > > to reduce not only boot time, but also the exposed surface and the > > > > general footprint. > > > > > > > > I think we need to discuss and settle whether using virtio-mmio (even if > > > > maintained and upgraded to virtio 1) for a new machine type is > > > > acceptable or not. Because if it isn't, we should probably just ditch > > > > the whole microvm idea and move to something else. > > > > > > I agree. IMNSHO the reduced attack surface from removing PCI is > > > (mostly) security theater, however the boot time numbers that Sergio > > > showed for microvm are quite extreme and I don't think there is any hope > > > of getting even close with a PCI-based virtual machine. 
> > > > > > So I'd even go a step further: if using virtio-mmio for a new machine > > > type is not acceptable, we should admit that boot time optimization in > > > QEMU is basically as good as it can get---low-hanging fruit has been > > > picked with PVH and mmap is the logical next step, but all that's left > > > is optimizing the guest or something else. > > > > I haven't seen enough analysis to declare boot time optimization done. > > QEMU startup can be profiled and improved. > > Right, and that will always stay the case. The microvm design has a premise and it can be answered definitively through performance analysis. If I had to explain to someone why PCI or ACPI significantly slows things down, I couldn't honestly do so. I say significantly because PCI init definitely requires more vmexits but can it be a small number? For ACPI I have no idea why it would consume significant amounts of time. Until we have this knowledge, the premise of microvm is unproven and merging it would be premature because maybe we can get into the same ballpark by optimizing existing code. I'm sorry for being a pain. I actually think the analysis will support microvm, but it still needs to be done in order to justify it. Stefan ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 13:26 ` Stefan Hajnoczi @ 2019-07-25 13:43 ` Paolo Bonzini 2019-07-25 13:54 ` Michael S. Tsirkin ` (2 more replies) 2019-07-25 13:48 ` Michael S. Tsirkin 1 sibling, 3 replies; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 13:43 UTC (permalink / raw) To: Stefan Hajnoczi, Michael S. Tsirkin Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On 25/07/19 15:26, Stefan Hajnoczi wrote: > The microvm design has a premise and it can be answered definitively > through performance analysis. > > If I had to explain to someone why PCI or ACPI significantly slows > things down, I couldn't honestly do so. I say significantly because > PCI init definitely requires more vmexits but can it be a small > number? For ACPI I have no idea why it would consume significant > amounts of time. My guess is that it's just a lot of code that has to run. :( > Until we have this knowledge, the premise of microvm is unproven and > merging it would be premature because maybe we can get into the same > ballpark by optimizing existing code. > > I'm sorry for being a pain. I actually think the analysis will > support microvm, but it still needs to be done in order to justify it. No, you're not a pain, you're explaining your reasoning and that helps. To me *maintainability is the biggest consideration* when introducing a new feature. "We can do just as well with q35" is a good reason to deprecate and delete microvm, but not a good reason to reject it now as long as microvm is good enough in terms of maintainability. Keeping it out of tree only makes it harder to do this kind of experiment. virtio 1 seems to be the biggest remaining blocker and I think it'd be a good thing to have even for the ARM virt machine type. FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) and ~25 ms in the kernel. 
I must say that's pretty good, but it's still
30% of the whole boot time and reducing it is the hardest part. If
having microvm in tree can help reduce it, good. Yes, it will get
users, but most likely they will have to support pc or q35 as a fallback
so we could still delete microvm at any time with the due deprecation
period if it turns out to be a failed experiment.

Whether to use qboot or SeaBIOS for microvm is another story, but it's
an implementation detail as long as the ROM size doesn't change and/or
we don't do versioned machine types. So we can switch from one to the
other at any time; we can also include qboot directly in QEMU's tree,
without going through a submodule, which also reduces the infrastructure
needed (mirrors, etc.) and makes it easier to delete it.

Paolo

(*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the
last write to 0xcf8. I suspect part of qboot's 10ms boot time actually
ends up measured as PCI in SeaBIOS, due to different init order, so the
real firmware cost of PAM and PCI initialization should be 5ms for qboot
and 10ms for SeaBIOS.
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 13:43 ` Paolo Bonzini @ 2019-07-25 13:54 ` Michael S. Tsirkin 2019-07-25 14:13 ` Paolo Bonzini 2019-07-25 14:04 ` Peter Maydell 2019-07-25 14:42 ` Sergio Lopez 2 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 13:54 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 03:43:12PM +0200, Paolo Bonzini wrote: > On 25/07/19 15:26, Stefan Hajnoczi wrote: > > The microvm design has a premise and it can be answered definitively > > through performance analysis. > > > > If I had to explain to someone why PCI or ACPI significantly slows > > things down, I couldn't honestly do so. I say significantly because > > PCI init definitely requires more vmexits but can it be a small > > number? For ACPI I have no idea why it would consume significant > > amounts of time. > > My guess is that it's just a lot of code that has to run. :( > > > Until we have this knowledge, the premise of microvm is unproven and > > merging it would be premature because maybe we can get into the same > > ballpark by optimizing existing code. > > > > I'm sorry for being a pain. I actually think the analysis will > > support microvm, but it still needs to be done in order to justify it. > > No, you're not a pain, you're explaining your reasoning and that helps. > > To me *maintainability is the biggest consideration* when introducing a > new feature. "We can do just as well with q35" is a good reason to > deprecate and delete microvm, but not a good reason to reject it now as > long as microvm is good enough in terms of maintainability. Keeping it > out of tree only makes it harder to do this kind of experiment. 
virtio > 1 seems to be the biggest remaining blocker and I think it'd be a good > thing to have even for the ARM virt machine type. Yep. E.g. virtio-iommu guys wanted that too. > FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) > and ~25 ms in the kernel. How did you measure the qemu time btw? > I must say that's pretty good, but it's still > 30% of the whole boot time and reducing it is the hardest part. If > having microvm in tree can help reducing it, good. Yes, it will get > users, but most likely they will have to support pc or q35 as a fallback > so we could still delete microvm at any time with the due deprecation > period if it turns out to be a failed experiment. > > Whether to use qboot or SeaBIOS for microvm is another story, but it's > an implementation detail as long as the ROM size doesn't change and/or > we don't do versioned machine types. So we can switch from one to the > other at any time; we can also include qboot directly in QEMU's tree, > without going through a submodule, which also reduces the infrastructure > needed (mirrors, etc.) and makes it easier to delete it. > > Paolo > > (*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the > last write to 0xcf8. I suspect part of qboot's 10ms boot time actually > end up measured as PCI in SeaBIOS, due to different init order, so the > real firmware cost of PAM and PCI initialization should be 5ms for qboot > and 10ms for SeaBIOS. ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 13:54 ` Michael S. Tsirkin @ 2019-07-25 14:13 ` Paolo Bonzini 2019-07-25 14:42 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 14:13 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On 25/07/19 15:54, Michael S. Tsirkin wrote: >> FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) >> and ~25 ms in the kernel. > How did you measure the qemu time btw? > It's QEMU startup, but not QEMU altogether. For example the time spent in memory.c when a BAR is programmed is not part of those 10 ms. So I just computed q35 qemu startup - microvm qemu startup, it's 65 vs 65 ms. Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:13 ` Paolo Bonzini @ 2019-07-25 14:42 ` Michael S. Tsirkin 0 siblings, 0 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 14:42 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 04:13:13PM +0200, Paolo Bonzini wrote: > On 25/07/19 15:54, Michael S. Tsirkin wrote: > >> FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) > >> and ~25 ms in the kernel. > > How did you measure the qemu time btw? > > > > It's QEMU startup, but not QEMU altogether. For example the time spent > in memory.c when a BAR is programmed is not part of those 10 ms. > > So I just computed q35 qemu startup - microvm qemu startup, it's 65 vs > 65 ms. > > Paolo Oh so it could be eventfd or whatever, just as well. I actually wonder whether we spend much time within synchronize_* calls. eventfd triggers this a lot of times. How about ioeventfd=off? Does this speed up things? -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 13:43 ` Paolo Bonzini 2019-07-25 13:54 ` Michael S. Tsirkin @ 2019-07-25 14:04 ` Peter Maydell 2019-07-25 14:26 ` Paolo Bonzini 2019-07-25 14:42 ` Sergio Lopez 2 siblings, 1 reply; 68+ messages in thread From: Peter Maydell @ 2019-07-25 14:04 UTC (permalink / raw) To: Paolo Bonzini Cc: Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, 25 Jul 2019 at 14:43, Paolo Bonzini <pbonzini@redhat.com> wrote: > To me *maintainability is the biggest consideration* when introducing a > new feature. "We can do just as well with q35" is a good reason to > deprecate and delete microvm, but not a good reason to reject it now as > long as microvm is good enough in terms of maintainability. I think maintainability matters, but also important is "are we going in the right direction in the first place?". virtio-mmio is (variously deliberately and accidentally) quite a long way behind virtio-pci, and certain kinds of things (hotplug, extensibility beyond a certain number of endpoints) are not going to be possible (either ever, or without a lot of extra design and implementation work to reimplement stuff we have already today with PCI). Are we sure we're not going to end up with a stream of "oh, now we need to implement X for virtio-mmio (that virtio-pci already has)", "users want Y now (that virtio-pci already has)", etc? The other thing is that once we've introduced something we're stuck with whatever it does, because we don't like breaking backwards compatibility. So I think getting the virtio-legacy vs virtio-1 story sorted out before we land microvm is important, at least to the point where we know we haven't backed ourselves into a corner or required a lot of extra effort on transitional-device support that we could have avoided. 
Which isn't to say that I'm against the microvm approach; just that I'd like us to consider and make a decision on these issues before landing it, rather than just saying "the patches in themselves look good, let's merge it". thanks -- PMM ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:04 ` Peter Maydell @ 2019-07-25 14:26 ` Paolo Bonzini 2019-07-25 14:35 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 14:26 UTC (permalink / raw) To: Peter Maydell Cc: Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On 25/07/19 16:04, Peter Maydell wrote: > On Thu, 25 Jul 2019 at 14:43, Paolo Bonzini <pbonzini@redhat.com> wrote: >> To me *maintainability is the biggest consideration* when introducing a >> new feature. "We can do just as well with q35" is a good reason to >> deprecate and delete microvm, but not a good reason to reject it now as >> long as microvm is good enough in terms of maintainability. > > I think maintainability matters, but also important is "are > we going in the right direction in the first place?". > virtio-mmio is (variously deliberately and accidentally) > quite a long way behind virtio-pci, and certain kinds of things > (hotplug, extensibility beyond a certain number of endpoints) > are not going to be possible (either ever, or without a lot > of extra design and implementation work to reimplement stuff > we have already today with PCI). Are we sure we're not going > to end up with a stream of "oh, now we need to implement X for > virtio-mmio (that virtio-pci already has)", "users want Y now > (that virtio-pci already has)", etc? I think this is part of maintainability in a wider sense. For every missing feature there should be a good reason why it's not needed. And if there is already code to do that in QEMU, then there should be an excellent reason why it's not being used. (This was the essence of the firmware debate). So for microvm you could do without hotplug because the idea is that you just tear down the VM and restart it. 
Lack of MSI is actually what worries me the most, but we could say that microvm clients generally have little multiprocessing so it's not common to have multiple network flows at the same time and so you don't need multiqueue. For microvm in particular there are two reasons why we can take some shortcuts (but with care): - we won't support versioned machine types for microvm. microvm guests die every time you upgrade QEMU, by design. So this is not another QED, which implemented more features than qcow2 but did so at the wrong place of the stack. In fact it's exactly the opposite (it implements less features, so that the implementation of e.g. q35 or PCI is untouched and does not need one-off boot time optimization hacks) - we know that Amazon is using something very similar to microvm in production, with virtio-mmio, so the feature set is at least usable for something. > The other thing is that once we've introduced something we're > stuck with whatever it does, because we don't like breaking > backwards compatibility. So I think getting the virtio-legacy > vs virtio-1 story sorted out before we land microvm is > important, at least to the point where we know we haven't > backed ourselves into a corner or required a lot of extra > effort on transitional-device support that we could have > avoided. Even though we won't support versioned machine types, I think there is agreement that virtio 0.9 is a bad idea and should be fixed. Paolo > Which isn't to say that I'm against the microvm approach; > just that I'd like us to consider and make a decision on > these issues before landing it, rather than just saying > "the patches in themselves look good, let's merge it". > > thanks > -- PMM > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:26 ` Paolo Bonzini @ 2019-07-25 14:35 ` Michael S. Tsirkin 0 siblings, 0 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 14:35 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 04:26:42PM +0200, Paolo Bonzini wrote: > On 25/07/19 16:04, Peter Maydell wrote: > > On Thu, 25 Jul 2019 at 14:43, Paolo Bonzini <pbonzini@redhat.com> wrote: > >> To me *maintainability is the biggest consideration* when introducing a > >> new feature. "We can do just as well with q35" is a good reason to > >> deprecate and delete microvm, but not a good reason to reject it now as > >> long as microvm is good enough in terms of maintainability. > > > > I think maintainability matters, but also important is "are > > we going in the right direction in the first place?". > > virtio-mmio is (variously deliberately and accidentally) > > quite a long way behind virtio-pci, and certain kinds of things > > (hotplug, extensibility beyond a certain number of endpoints) > > are not going to be possible (either ever, or without a lot > > of extra design and implementation work to reimplement stuff > > we have already today with PCI). Are we sure we're not going > > to end up with a stream of "oh, now we need to implement X for > > virtio-mmio (that virtio-pci already has)", "users want Y now > > (that virtio-pci already has)", etc? > > I think this is part of maintainability in a wider sense. For every > missing feature there should be a good reason why it's not needed. And > if there is already code to do that in QEMU, then there should be an > excellent reason why it's not being used. (This was the essence of the > firmware debate). 
> > So for microvm you could do without hotplug because the idea is that you > just tear down the VM and restart it. Lack of MSI is actually what > worries me the most, but we could say that microvm clients generally > have little multiprocessing so it's not common to have multiple network > flows at the same time and so you don't need multiqueue. Me too, and in fact someone just posted virtio-mmio: support multiple interrupt vectors > For microvm in particular there are two reasons why we can take some > shortcuts (but with care): > > - we won't support versioned machine types for microvm. microvm guests > die every time you upgrade QEMU, by design. So this is not another QED, > which implemented more features than qcow2 but did so at the wrong place > of the stack. In fact it's exactly the opposite (it implements less > features, so that the implementation of e.g. q35 or PCI is untouched and > does not need one-off boot time optimization hacks) > > - we know that Amazon is using something very similar to microvm in > production, with virtio-mmio, so the feature set is at least usable for > something. > > > The other thing is that once we've introduced something we're > > stuck with whatever it does, because we don't like breaking > > backwards compatibility. So I think getting the virtio-legacy > > vs virtio-1 story sorted out before we land microvm is > > important, at least to the point where we know we haven't > > backed ourselves into a corner or required a lot of extra > > effort on transitional-device support that we could have > > avoided. > > Even though we won't support versioned machine types, I think there is > agreement that virtio 0.9 is a bad idea and should be fixed. > > Paolo Right, for the simple reason that mmio does not support transitional devices, only transitional drivers. So if we commit to supporting old guests, we won't be able to back out of that. 
> > Which isn't to say that I'm against the microvm approach; > > just that I'd like us to consider and make a decision on > > these issues before landing it, rather than just saying > > "the patches in themselves look good, let's merge it". > > > > thanks > > -- PMM > > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 13:43 ` Paolo Bonzini 2019-07-25 13:54 ` Michael S. Tsirkin 2019-07-25 14:04 ` Peter Maydell @ 2019-07-25 14:42 ` Sergio Lopez 2019-07-25 14:58 ` Michael S. Tsirkin 2 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-25 14:42 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Maran Wilson, Stefan Hajnoczi, Michael S. Tsirkin, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson [-- Attachment #1: Type: text/plain, Size: 3235 bytes --] Paolo Bonzini <pbonzini@redhat.com> writes: > On 25/07/19 15:26, Stefan Hajnoczi wrote: >> The microvm design has a premise and it can be answered definitively >> through performance analysis. >> >> If I had to explain to someone why PCI or ACPI significantly slows >> things down, I couldn't honestly do so. I say significantly because >> PCI init definitely requires more vmexits but can it be a small >> number? For ACPI I have no idea why it would consume significant >> amounts of time. > > My guess is that it's just a lot of code that has to run. :( I don't think I've shared any numbers about ACPI yet. I don't have details about where exactly the time is spent, but compiling a guest kernel without ACPI decreases the average boot time by ~12ms, and the kernel's unstripped ELF binary size goes down by a whopping ~300KiB. On the other hand, removing ACPI from QEMU decreases its initialization time by ~5ms, and the binary size is ~183KiB smaller. IMHO, those are pretty relevant savings on both fronts. >> Until we have this knowledge, the premise of microvm is unproven and >> merging it would be premature because maybe we can get into the same >> ballpark by optimizing existing code. >> >> I'm sorry for being a pain. I actually think the analysis will >> support microvm, but it still needs to be done in order to justify it. > > No, you're not a pain, you're explaining your reasoning and that helps. 
> > To me *maintainability is the biggest consideration* when introducing a > new feature. "We can do just as well with q35" is a good reason to > deprecate and delete microvm, but not a good reason to reject it now as > long as microvm is good enough in terms of maintainability. Keeping it > out of tree only makes it harder to do this kind of experiment. virtio > 1 seems to be the biggest remaining blocker and I think it'd be a good > thing to have even for the ARM virt machine type. > > FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) > and ~25 ms in the kernel. I must say that's pretty good, but it's still > 30% of the whole boot time and reducing it is the hardest part. If > having microvm in tree can help reducing it, good. Yes, it will get > users, but most likely they will have to support pc or q35 as a fallback > so we could still delete microvm at any time with the due deprecation > period if it turns out to be a failed experiment. > > Whether to use qboot or SeaBIOS for microvm is another story, but it's > an implementation detail as long as the ROM size doesn't change and/or > we don't do versioned machine types. So we can switch from one to the > other at any time; we can also include qboot directly in QEMU's tree, > without going through a submodule, which also reduces the infrastructure > needed (mirrors, etc.) and makes it easier to delete it. > > Paolo > > (*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the > last write to 0xcf8. I suspect part of qboot's 10ms boot time actually > end up measured as PCI in SeaBIOS, due to different init order, so the > real firmware cost of PAM and PCI initialization should be 5ms for qboot > and 10ms for SeaBIOS. [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:42 ` Sergio Lopez @ 2019-07-25 14:58 ` Michael S. Tsirkin 2019-07-25 15:01 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 14:58 UTC (permalink / raw) To: Sergio Lopez Cc: Peter Maydell, Eduardo Habkost, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 04:42:42PM +0200, Sergio Lopez wrote: > > Paolo Bonzini <pbonzini@redhat.com> writes: > > > On 25/07/19 15:26, Stefan Hajnoczi wrote: > >> The microvm design has a premise and it can be answered definitively > >> through performance analysis. > >> > >> If I had to explain to someone why PCI or ACPI significantly slows > >> things down, I couldn't honestly do so. I say significantly because > >> PCI init definitely requires more vmexits but can it be a small > >> number? For ACPI I have no idea why it would consume significant > >> amounts of time. > > > > My guess is that it's just a lot of code that has to run. :( > > I think I haven't shared any numbers about ACPI. > > I don't have details about where exactly the time is spent, but > compiling a guest kernel without ACPI decreases the average boot time in > ~12ms, and the kernel's unstripped ELF binary size goes down in a > whooping ~300KiB. At least the binary size is hardly surprising. I'm guessing you built in lots of drivers. It would be educational to try to enable ACPI core but disable all optional features. > On the other hand, removing ACPI from QEMU decreases its initialization > time in ~5ms, and the binary size is ~183KiB smaller. Yes - ACPI generation uses a ton of allocations and data copies. Need to play with pre-allocation strategies. 
Maybe something as simple as:

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index f3fdfefcd5..24becc069e 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -2629,8 +2629,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
     acpi_get_pci_holes(&pci_hole, &pci_hole64);
     acpi_get_slic_oem(&slic_oem);
 
-    table_offsets = g_array_new(false, true /* clear */,
-                                        sizeof(uint32_t));
+#define DEFAULT_ARRAY_SIZE 16
+    table_offsets = g_array_sized_new(false, true /* clear */,
+                                      sizeof(uint32_t),
+                                      DEFAULT_ARRAY_SIZE);
     ACPI_BUILD_DPRINTF("init ACPI tables\n");
 
     bios_linker_loader_alloc(tables->linker,

will already help a bit.

> > IMHO, those are pretty relevant savings on both fronts. > > >> Until we have this knowledge, the premise of microvm is unproven and > >> merging it would be premature because maybe we can get into the same > >> ballpark by optimizing existing code. > >> > >> I'm sorry for being a pain. I actually think the analysis will > >> support microvm, but it still needs to be done in order to justify it. > > > > No, you're not a pain, you're explaining your reasoning and that helps. > > > > To me *maintainability is the biggest consideration* when introducing a > > new feature. "We can do just as well with q35" is a good reason to > > deprecate and delete microvm, but not a good reason to reject it now as > > long as microvm is good enough in terms of maintainability. Keeping it > > out of tree only makes it harder to do this kind of experiment. virtio > > 1 seems to be the biggest remaining blocker and I think it'd be a good > > thing to have even for the ARM virt machine type. > > > > FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) > > and ~25 ms in the kernel. I must say that's pretty good, but it's still > > 30% of the whole boot time and reducing it is the hardest part. If > > having microvm in tree can help reducing it, good. 
Yes, it will get > > users, but most likely they will have to support pc or q35 as a fallback > > so we could still delete microvm at any time with the due deprecation > > period if it turns out to be a failed experiment. > > > > Whether to use qboot or SeaBIOS for microvm is another story, but it's > > an implementation detail as long as the ROM size doesn't change and/or > > we don't do versioned machine types. So we can switch from one to the > > other at any time; we can also include qboot directly in QEMU's tree, > > without going through a submodule, which also reduces the infrastructure > > needed (mirrors, etc.) and makes it easier to delete it. > > > > Paolo > > > > (*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the > > last write to 0xcf8. I suspect part of qboot's 10ms boot time actually > > end up measured as PCI in SeaBIOS, due to different init order, so the > > real firmware cost of PAM and PCI initialization should be 5ms for qboot > > and 10ms for SeaBIOS. > ^ permalink raw reply related [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:58 ` Michael S. Tsirkin @ 2019-07-25 15:01 ` Michael S. Tsirkin 2019-07-25 15:39 ` Paolo Bonzini 2019-07-25 15:49 ` Sergio Lopez 0 siblings, 2 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 15:01 UTC (permalink / raw) To: Sergio Lopez Cc: Peter Maydell, Eduardo Habkost, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 10:58:22AM -0400, Michael S. Tsirkin wrote: > On Thu, Jul 25, 2019 at 04:42:42PM +0200, Sergio Lopez wrote: > > > > Paolo Bonzini <pbonzini@redhat.com> writes: > > > > > On 25/07/19 15:26, Stefan Hajnoczi wrote: > > >> The microvm design has a premise and it can be answered definitively > > >> through performance analysis. > > >> > > >> If I had to explain to someone why PCI or ACPI significantly slows > > >> things down, I couldn't honestly do so. I say significantly because > > >> PCI init definitely requires more vmexits but can it be a small > > >> number? For ACPI I have no idea why it would consume significant > > >> amounts of time. > > > > > > My guess is that it's just a lot of code that has to run. :( > > > > I think I haven't shared any numbers about ACPI. > > > > I don't have details about where exactly the time is spent, but > > compiling a guest kernel without ACPI decreases the average boot time in > > ~12ms, and the kernel's unstripped ELF binary size goes down in a > > whooping ~300KiB. > > At least the binary size is hardly surprising. > > I'm guessing you built in lots of drivers. > > It would be educational to try to enable ACPI core but disable all > optional features. Trying with ACPI_REDUCED_HARDWARE_ONLY would also be educational. > > > On the other hand, removing ACPI from QEMU decreases its initialization > > time in ~5ms, and the binary size is ~183KiB smaller. > > Yes - ACPI generation uses a ton of allocations and data copies. 
> > Need to play with pre-allocation strategies. Maybe something > as simple as: > > diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c > index f3fdfefcd5..24becc069e 100644 > --- a/hw/i386/acpi-build.c > +++ b/hw/i386/acpi-build.c > @@ -2629,8 +2629,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine) > acpi_get_pci_holes(&pci_hole, &pci_hole64); > acpi_get_slic_oem(&slic_oem); > > +#define DEFAULT_ARRAY_SIZE 16 > table_offsets = g_array_new(false, true /* clear */, > - sizeof(uint32_t)); > + sizeof(uint32_t), > + DEFAULT_ARRAY_SIZE); > ACPI_BUILD_DPRINTF("init ACPI tables\n"); > > bios_linker_loader_alloc(tables->linker, > > will already help a bit. > > > > > IMHO, those are pretty relevant savings on both fronts. > > > > >> Until we have this knowledge, the premise of microvm is unproven and > > >> merging it would be premature because maybe we can get into the same > > >> ballpark by optimizing existing code. > > >> > > >> I'm sorry for being a pain. I actually think the analysis will > > >> support microvm, but it still needs to be done in order to justify it. > > > > > > No, you're not a pain, you're explaining your reasoning and that helps. > > > > > > To me *maintainability is the biggest consideration* when introducing a > > > new feature. "We can do just as well with q35" is a good reason to > > > deprecate and delete microvm, but not a good reason to reject it now as > > > long as microvm is good enough in terms of maintainability. Keeping it > > > out of tree only makes it harder to do this kind of experiment. virtio > > > 1 seems to be the biggest remaining blocker and I think it'd be a good > > > thing to have even for the ARM virt machine type. > > > > > > FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*) > > > and ~25 ms in the kernel. I must say that's pretty good, but it's still > > > 30% of the whole boot time and reducing it is the hardest part. 
If > > > having microvm in tree can help reducing it, good. Yes, it will get > > > users, but most likely they will have to support pc or q35 as a fallback > > > so we could still delete microvm at any time with the due deprecation > > > period if it turns out to be a failed experiment. > > > > > > Whether to use qboot or SeaBIOS for microvm is another story, but it's > > > an implementation detail as long as the ROM size doesn't change and/or > > > we don't do versioned machine types. So we can switch from one to the > > > other at any time; we can also include qboot directly in QEMU's tree, > > > without going through a submodule, which also reduces the infrastructure > > > needed (mirrors, etc.) and makes it easier to delete it. > > > > > > Paolo > > > > > > (*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the > > > last write to 0xcf8. I suspect part of qboot's 10ms boot time actually > > > end up measured as PCI in SeaBIOS, due to different init order, so the > > > real firmware cost of PAM and PCI initialization should be 5ms for qboot > > > and 10ms for SeaBIOS. > > > > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 15:01 ` Michael S. Tsirkin @ 2019-07-25 15:39 ` Paolo Bonzini 2019-07-25 17:38 ` Michael S. Tsirkin 2019-07-25 15:49 ` Sergio Lopez 1 sibling, 1 reply; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 15:39 UTC (permalink / raw) To: Michael S. Tsirkin, Sergio Lopez Cc: Peter Maydell, Eduardo Habkost, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On 25/07/19 17:01, Michael S. Tsirkin wrote: >> It would be educational to try to enable ACPI core but disable all >> optional features. A lot of them are select'ed so it's not easy. > Trying with ACPI_REDUCED_HARDWARE_ONLY would also be educational. That's what the NEMU guys experimented with. It's not supported by our DSDT since it uses ACPI GPE, and the reduction in code size is small (about 15000 lines of code in ACPICA, perhaps 100k if you're lucky?). Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 15:39 ` Paolo Bonzini @ 2019-07-25 17:38 ` Michael S. Tsirkin 2019-07-26 12:46 ` Igor Mammedov 0 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 17:38 UTC (permalink / raw) To: Paolo Bonzini Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Stefano Garzarella, Richard Henderson On Thu, Jul 25, 2019 at 05:39:39PM +0200, Paolo Bonzini wrote: > On 25/07/19 17:01, Michael S. Tsirkin wrote: > >> It would be educational to try to enable ACPI core but disable all > >> optional features. > > A lot of them are select'ed so it's not easy. > > > Trying with ACPI_REDUCED_HARDWARE_ONLY would also be educational. > > That's what the NEMU guys experimented with. It's not supported by our > DSDT since it uses ACPI GPE, Well, there are two GPE blocks in FADT. We could just switch to these if necessary, I think. > and the reduction in code size is small > (about 15000 lines of code in ACPICA, perhaps 100k if you're lucky?). > > Paolo Well, ACPI is 150k loc I think, right? linux]$ wc -l `find drivers/acpi/ -name '*.c' `|tail -1 145926 total So 100k wouldn't be too shabby. -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 17:38 ` Michael S. Tsirkin @ 2019-07-26 12:46 ` Igor Mammedov 0 siblings, 0 replies; 68+ messages in thread From: Igor Mammedov @ 2019-07-26 12:46 UTC (permalink / raw) To: Michael S. Tsirkin Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson, Stefan Hajnoczi, QEMU Developers, Gerd Hoffmann, Paolo Bonzini, Richard Henderson, Stefano Garzarella On Thu, 25 Jul 2019 13:38:48 -0400 "Michael S. Tsirkin" <mst@redhat.com> wrote: > On Thu, Jul 25, 2019 at 05:39:39PM +0200, Paolo Bonzini wrote: > > On 25/07/19 17:01, Michael S. Tsirkin wrote: > > >> It would be educational to try to enable ACPI core but disable all > > >> optional features. > > > > A lot of them are select'ed so it's not easy. > > > > > Trying with ACPI_REDUCED_HARDWARE_ONLY would also be educational. > > > > That's what the NEMU guys experimented with. It's not supported by our > > DSDT since it uses ACPI GPE, > > Well there are two GPE blocks in FADT. We could just switch to > these if necessary I think. If it's a simplistic VM, we could build a dedicated DSDT (or a whole set of tables) for it and use the reduced profile like the arm-virt machine does (just a newer version of FADT with the needed flags set). That would probably cut the ACPI cost on the QEMU side. > > and the reduction in code size is small > > (about 15000 lines of code in ACPICA, perhaps 100k if you're lucky?). > > > > Paolo > > Well ACPI is 150k loc I think, right? > > linux]$ wc -l `find drivers/acpi/ -name '*.c' `|tail -1 > 145926 total > > So 100k wouldn't be too shabby. > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
  2019-07-25 15:01 ` Michael S. Tsirkin
  2019-07-25 15:39 ` Paolo Bonzini
@ 2019-07-25 15:49 ` Sergio Lopez
  1 sibling, 0 replies; 68+ messages in thread
From: Sergio Lopez @ 2019-07-25 15:49 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Peter Maydell, Eduardo Habkost, Maran Wilson, Stefan Hajnoczi,
      QEMU Developers, Gerd Hoffmann, Paolo Bonzini,
      Stefano Garzarella, Richard Henderson

[-- Attachment #1: Type: text/plain, Size: 5230 bytes --]

Michael S. Tsirkin <mst@redhat.com> writes:

> On Thu, Jul 25, 2019 at 10:58:22AM -0400, Michael S. Tsirkin wrote:
>> On Thu, Jul 25, 2019 at 04:42:42PM +0200, Sergio Lopez wrote:
>> >
>> > Paolo Bonzini <pbonzini@redhat.com> writes:
>> >
>> > > On 25/07/19 15:26, Stefan Hajnoczi wrote:
>> > >> The microvm design has a premise and it can be answered definitively
>> > >> through performance analysis.
>> > >>
>> > >> If I had to explain to someone why PCI or ACPI significantly slows
>> > >> things down, I couldn't honestly do so. I say significantly because
>> > >> PCI init definitely requires more vmexits but can it be a small
>> > >> number? For ACPI I have no idea why it would consume significant
>> > >> amounts of time.
>> > >
>> > > My guess is that it's just a lot of code that has to run. :(
>> >
>> > I think I haven't shared any numbers about ACPI.
>> >
>> > I don't have details about where exactly the time is spent, but
>> > compiling a guest kernel without ACPI decreases the average boot time in
>> > ~12ms, and the kernel's unstripped ELF binary size goes down in a
>> > whopping ~300KiB.
>>
>> At least the binary size is hardly surprising.
>>
>> I'm guessing you built in lots of drivers.
>>
>> It would be educational to try to enable ACPI core but disable all
>> optional features.

I just tried disabling everything that menuconfig allowed me to. It
saves ~27KiB and doesn't improve boot time.

> Trying with ACPI_REDUCED_HARDWARE_ONLY would also be educational.

I also tried enabling this one in my original config. It saves
~11.5KiB, and has no impact on boot time either.

>>
>> > On the other hand, removing ACPI from QEMU decreases its initialization
>> > time in ~5ms, and the binary size is ~183KiB smaller.
>>
>> Yes - ACPI generation uses a ton of allocations and data copies.
>>
>> Need to play with pre-allocation strategies. Maybe something
>> as simple as:
>>
>> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
>> index f3fdfefcd5..24becc069e 100644
>> --- a/hw/i386/acpi-build.c
>> +++ b/hw/i386/acpi-build.c
>> @@ -2629,8 +2629,10 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>>      acpi_get_pci_holes(&pci_hole, &pci_hole64);
>>      acpi_get_slic_oem(&slic_oem);
>>
>> +#define DEFAULT_ARRAY_SIZE 16
>>      table_offsets = g_array_new(false, true /* clear */,
>> -                                        sizeof(uint32_t));
>> +                                        sizeof(uint32_t),
>> +                                        DEFAULT_ARRAY_SIZE);
>>      ACPI_BUILD_DPRINTF("init ACPI tables\n");
>>
>>      bios_linker_loader_alloc(tables->linker,
>>
>> will already help a bit.
>>
>> >
>> > IMHO, those are pretty relevant savings on both fronts.
>> >
>> > >> Until we have this knowledge, the premise of microvm is unproven and
>> > >> merging it would be premature because maybe we can get into the same
>> > >> ballpark by optimizing existing code.
>> > >>
>> > >> I'm sorry for being a pain. I actually think the analysis will
>> > >> support microvm, but it still needs to be done in order to justify it.
>> > >
>> > > No, you're not a pain, you're explaining your reasoning and that helps.
>> > >
>> > > To me *maintainability is the biggest consideration* when introducing a
>> > > new feature. "We can do just as well with q35" is a good reason to
>> > > deprecate and delete microvm, but not a good reason to reject it now as
>> > > long as microvm is good enough in terms of maintainability. Keeping it
>> > > out of tree only makes it harder to do this kind of experiment. virtio
>> > > 1 seems to be the biggest remaining blocker and I think it'd be a good
>> > > thing to have even for the ARM virt machine type.
>> > >
>> > > FWIW the "PCI tax" seems to be ~10 ms in QEMU, ~10 ms in the firmware(*)
>> > > and ~25 ms in the kernel. I must say that's pretty good, but it's still
>> > > 30% of the whole boot time and reducing it is the hardest part. If
>> > > having microvm in tree can help reducing it, good. Yes, it will get
>> > > users, but most likely they will have to support pc or q35 as a fallback
>> > > so we could still delete microvm at any time with the due deprecation
>> > > period if it turns out to be a failed experiment.
>> > >
>> > > Whether to use qboot or SeaBIOS for microvm is another story, but it's
>> > > an implementation detail as long as the ROM size doesn't change and/or
>> > > we don't do versioned machine types. So we can switch from one to the
>> > > other at any time; we can also include qboot directly in QEMU's tree,
>> > > without going through a submodule, which also reduces the infrastructure
>> > > needed (mirrors, etc.) and makes it easier to delete it.
>> > >
>> > > Paolo
>> > >
>> > > (*) I measured 15ms in SeaBIOS and 5ms in qboot from the first to the
>> > > last write to 0xcf8. I suspect part of qboot's 10ms boot time actually
>> > > end up measured as PCI in SeaBIOS, due to different init order, so the
>> > > real firmware cost of PAM and PCI initialization should be 5ms for qboot
>> > > and 10ms for SeaBIOS.
>> >
>>
>>

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply [flat|nested] 68+ messages in thread
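The timing figures quoted in the message above can be tallied in a few lines. What follows is a rough Python sketch and not part of the original thread: the numbers are the approximate values given by Sergio and Paolo, the names are invented for illustration, and adding the guest-side and QEMU-side ACPI savings together assumes the two phases run one after the other.

```python
# Approximate figures quoted in the thread, in milliseconds.
ACPI_SAVINGS_MS = {
    "guest kernel built without ACPI": 12,  # average boot time reduction
    "QEMU built without ACPI": 5,           # initialization time reduction
}
PCI_TAX_MS = {"QEMU": 10, "firmware": 10, "kernel": 25}
PCI_SHARE_PERCENT = 30  # Paolo: the PCI tax is "30% of the whole boot time"


def total_ms(parts):
    """Sum per-component costs (assumption: they are sequential, so additive)."""
    return sum(parts.values())


def implied_boot_time_ms(cost_ms, share_percent):
    """Whole boot time implied by a cost that makes up share_percent of it."""
    return cost_ms * 100 / share_percent


acpi_total = total_ms(ACPI_SAVINGS_MS)
pci_total = total_ms(PCI_TAX_MS)
boot_total = implied_boot_time_ms(pci_total, PCI_SHARE_PERCENT)

print(f"ACPI savings: ~{acpi_total} ms")
print(f"PCI tax: ~{pci_total} ms of an implied ~{boot_total:.0f} ms boot")
```

Read this way, the quoted PCI figures imply a total boot time in the region of 150 ms, which puts the ~17 ms of combined ACPI savings at roughly a tenth of it. These are estimates derived from rounded numbers, not measurements.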
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
  2019-07-25 13:26 ` Stefan Hajnoczi
  2019-07-25 13:43 ` Paolo Bonzini
@ 2019-07-25 13:48 ` Michael S. Tsirkin
  1 sibling, 0 replies; 68+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 13:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Peter Maydell, Eduardo Habkost, Sergio Lopez, Maran Wilson,
      QEMU Developers, Gerd Hoffmann, Paolo Bonzini,
      Stefano Garzarella, Richard Henderson

On Thu, Jul 25, 2019 at 02:26:12PM +0100, Stefan Hajnoczi wrote:
> On Thu, Jul 25, 2019 at 1:10 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > On Thu, Jul 25, 2019 at 01:01:29PM +0100, Stefan Hajnoczi wrote:
> > > On Thu, Jul 25, 2019 at 12:23 PM Paolo Bonzini <pbonzini@redhat.com> wrote:
> > > > On 25/07/19 12:42, Sergio Lopez wrote:
> > > > > Peter Maydell <peter.maydell@linaro.org> writes:
> > > > >> On Thu, 25 Jul 2019 at 10:59, Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > >>> OK so please start with adding virtio 1 support. Guest bits
> > > > >>> have been ready for years now.
> > > > >>
> > > > >> I'd still rather we just used pci virtio. If pci isn't
> > > > >> fast enough at startup, do something to make it faster...
> > > > >
> > > > > Actually, removing PCI (and ACPI), is one of the main ways microvm has
> > > > > to reduce not only boot time, but also the exposed surface and the
> > > > > general footprint.
> > > > >
> > > > > I think we need to discuss and settle whether using virtio-mmio (even if
> > > > > maintained and upgraded to virtio 1) for a new machine type is
> > > > > acceptable or not. Because if it isn't, we should probably just ditch
> > > > > the whole microvm idea and move to something else.
> > > >
> > > > I agree. IMNSHO the reduced attack surface from removing PCI is
> > > > (mostly) security theater, however the boot time numbers that Sergio
> > > > showed for microvm are quite extreme and I don't think there is any hope
> > > > of getting even close with a PCI-based virtual machine.
> > > >
> > > > So I'd even go a step further: if using virtio-mmio for a new machine
> > > > type is not acceptable, we should admit that boot time optimization in
> > > > QEMU is basically as good as it can get---low-hanging fruit has been
> > > > picked with PVH and mmap is the logical next step, but all that's left
> > > > is optimizing the guest or something else.
> > >
> > > I haven't seen enough analysis to declare boot time optimization done.
> > > QEMU startup can be profiled and improved.
> >
> > Right, and that will always stay the case.
>
> The microvm design has a premise and it can be answered definitively
> through performance analysis.
>
> If I had to explain to someone why PCI or ACPI significantly slows
> things down, I couldn't honestly do so.

Well, with PCI each device describes itself. You normally read this
description dword by dword. A typical description is 20-50 words. If
both the BIOS and Linux do this, that's twice the amount. The BIOS also
uses two vmexits for each access. There's also the resource allocation
game. I would say up to 200 exits per device is reasonable.

> I say significantly because
> PCI init definitely requires more vmexits but can it be a small
> number?

Each bus is scanned for devices: 32 accesses, 256 bus numbers (that's
the lastbus thing). Paolo posted a hack just for the root bus, but
whenever we have a bridge the problem will just resurface.

PCIe is actually link based, so downstream buses do not need to be
scanned outside device 0 unless we see a multifunction bit set. I don't
think Linux implements this optimization atm. But that is still the
case for internal buses.

> For ACPI I have no idea why it would consume significant
> amounts of time.

Me neither. I suspect it's not vmexit related at all. Is the ACPI
driver in Linux just slow? It's not been designed to be on any data
path...

I'd love to know. I don't feel it's fair to ask someone interested in
writing new performant code to necessarily optimize the old
non-performant one.

> Until we have this knowledge, the premise of microvm is unproven and
> merging it would be premature because maybe we can get into the same
> ballpark by optimizing existing code.

Maybe, but who is working on this right now? If it's possible to make
PC faster but not enough people know how to do it, and enough people
know how to make microvm faster, then it does not matter what's
possible in theory.

>
> I'm sorry for being a pain. I actually think the analysis will
> support microvm, but it still needs to be done in order to justify it.
>
> Stefan

At some level it would be great to have someone do detailed performance
profiling. But it is a lot of work, which also needs to be justified
given there's working code, and it's not bad code at that.

Yes, speeding up PC would be nice, but if everyone's gut feeling is it
won't get us what microvm is trying to achieve, why spend cycles making
sure?

--
MST

^ permalink raw reply [flat|nested] 68+ messages in thread
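The vmexit estimate in the message above can be reproduced with some simple arithmetic. This is a minimal Python sketch and not part of the original thread: the constants come from the figures quoted in that message, the names are made up for illustration, and one vmexit per access on the Linux side is an assumption (the message only states that the BIOS uses two).

```python
# Figures taken from the message above.
SLOT_PROBES_PER_BUS = 32        # "32 accesses" per bus
BUS_NUMBERS = 256               # "256 bus numbers"
DESCRIPTION_WORDS = (20, 50)    # "typical description is 20-50 words"
BIOS_EXITS_PER_ACCESS = 2       # "bios also uses two vmexits for each access"
LINUX_EXITS_PER_ACCESS = 1      # assumption: not stated in the thread


def full_scan_probes(buses=BUS_NUMBERS, slots=SLOT_PROBES_PER_BUS):
    """Probe accesses needed to look at every slot on every bus number."""
    return buses * slots


def description_exits(words):
    """Exits spent reading one device's description in both BIOS and Linux."""
    return words * (BIOS_EXITS_PER_ACCESS + LINUX_EXITS_PER_ACCESS)


low, high = (description_exits(w) for w in DESCRIPTION_WORDS)
print(f"probe accesses for a full scan: {full_scan_probes()}")
print(f"description reads per device: {low} to {high} exits")
```

Under these assumptions, reading the description alone accounts for 60 to 150 exits per device, which leaves room for resource allocation within the "up to 200 exits per device" estimate. Scanning only the root bus, as in the hack mentioned in the message, would correspond to `full_scan_probes(buses=1)`.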
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez ` (5 preceding siblings ...) 2019-07-02 15:23 ` Peter Maydell @ 2019-07-02 15:30 ` no-reply 2019-07-03 9:58 ` Stefan Hajnoczi 2019-08-29 9:02 ` Jing Liu 8 siblings, 0 replies; 68+ messages in thread From: no-reply @ 2019-07-02 15:30 UTC (permalink / raw) To: slp Cc: ehabkost, slp, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth Patchew URL: https://patchew.org/QEMU/20190702121106.28374-1-slp@redhat.com/ Hi, This series failed the asan build test. Please find the testing commands and their output below. If you have Docker installed, you can probably reproduce it locally. === TEST SCRIPT BEGIN === #!/bin/bash make docker-image-fedora V=1 NETWORK=1 time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1 === TEST SCRIPT END === PASS 2 fdc-test /x86_64/fdc/no_media_on_start PASS 3 fdc-test /x86_64/fdc/read_without_media MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/check-qlit -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="check-qlit" ==7808==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 fdc-test /x86_64/fdc/media_change PASS 5 fdc-test /x86_64/fdc/sense_interrupt PASS 6 fdc-test /x86_64/fdc/relative_seek --- PASS 32 test-opts-visitor /visitor/opts/range/beyond PASS 33 test-opts-visitor /visitor/opts/dict/unvisited MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-coroutine -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-coroutine" ==7851==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
==7851==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc0ad0a000; bottom 0x7fa44def8000; size: 0x0057bce12000 (376831025152) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 1 test-coroutine /basic/no-dangling-access --- PASS 11 test-aio /aio/event/wait PASS 12 test-aio /aio/event/flush PASS 13 test-aio /aio/event/wait/no-flush-cb ==7866==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 14 test-aio /aio/timer/schedule PASS 15 test-aio /aio/coroutine/queue-chaining PASS 16 test-aio /aio-gsource/flush --- PASS 28 test-aio /aio-gsource/timer/schedule PASS 13 fdc-test /x86_64/fdc/fuzz-registers MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-aio-multithread -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-aio-multithread" ==7873==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-aio-multithread /aio/multi/lifecycle MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/ide-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="ide-test" ==7890==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 test-aio-multithread /aio/multi/schedule PASS 1 ide-test /x86_64/ide/identify PASS 3 test-aio-multithread /aio/multi/mutex/contended ==7901==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 ide-test /x86_64/ide/flush ==7912==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 3 ide-test /x86_64/ide/bmdma/simple_rw ==7918==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 test-aio-multithread /aio/multi/mutex/handoff PASS 4 ide-test /x86_64/ide/bmdma/trim ==7929==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 5 test-aio-multithread /aio/multi/mutex/mcs PASS 5 ide-test /x86_64/ide/bmdma/short_prdt ==7940==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 6 test-aio-multithread /aio/multi/mutex/pthread MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-throttle -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-throttle" PASS 6 ide-test /x86_64/ide/bmdma/one_sector_short_prdt ==7948==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-throttle /throttle/leak_bucket PASS 2 test-throttle /throttle/compute_wait PASS 3 test-throttle /throttle/init --- PASS 14 test-throttle /throttle/config/max PASS 15 test-throttle /throttle/config/iops_size MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-thread-pool -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-thread-pool" ==7951==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==7955==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-thread-pool /thread-pool/submit PASS 2 test-thread-pool /thread-pool/submit-aio PASS 3 test-thread-pool /thread-pool/submit-co PASS 4 test-thread-pool /thread-pool/submit-many PASS 7 ide-test /x86_64/ide/bmdma/long_prdt ==8027==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8027==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffd45d06000; bottom 0x7f83e57fe000; size: 0x007960508000 (521306931200) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 8 ide-test /x86_64/ide/bmdma/no_busmaster PASS 5 test-thread-pool /thread-pool/cancel PASS 9 ide-test /x86_64/ide/flush/nodev ==8038==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 10 ide-test /x86_64/ide/flush/empty_drive PASS 6 test-thread-pool /thread-pool/cancel-async ==8043==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-hbitmap -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-hbitmap" PASS 1 test-hbitmap /hbitmap/granularity PASS 2 test-hbitmap /hbitmap/size/0 --- PASS 4 test-hbitmap /hbitmap/iter/empty PASS 11 ide-test /x86_64/ide/flush/retry_pci PASS 5 test-hbitmap /hbitmap/iter/partial ==8054==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 6 test-hbitmap /hbitmap/iter/granularity PASS 7 test-hbitmap /hbitmap/iter/iter_and_reset PASS 8 test-hbitmap /hbitmap/get/all --- PASS 14 test-hbitmap /hbitmap/set/twice PASS 15 test-hbitmap /hbitmap/set/overlap PASS 16 test-hbitmap /hbitmap/reset/empty ==8060==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 17 test-hbitmap /hbitmap/reset/general PASS 13 ide-test /x86_64/ide/cdrom/pio PASS 18 test-hbitmap /hbitmap/reset/all --- PASS 28 test-hbitmap /hbitmap/truncate/shrink/medium PASS 29 test-hbitmap /hbitmap/truncate/shrink/large PASS 30 test-hbitmap /hbitmap/meta/zero ==8066==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 14 ide-test /x86_64/ide/cdrom/pio_large ==8072==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 15 ide-test /x86_64/ide/cdrom/dma MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/ahci-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="ahci-test" ==8086==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 31 test-hbitmap /hbitmap/meta/one PASS 32 test-hbitmap /hbitmap/meta/byte PASS 33 test-hbitmap /hbitmap/meta/word PASS 1 ahci-test /x86_64/ahci/sanity ==8092==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 ahci-test /x86_64/ahci/pci_spec PASS 34 test-hbitmap /hbitmap/meta/sector PASS 35 test-hbitmap /hbitmap/serialize/align ==8098==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 3 ahci-test /x86_64/ahci/pci_enable ==8104==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 36 test-hbitmap /hbitmap/serialize/basic PASS 37 test-hbitmap /hbitmap/serialize/part PASS 38 test-hbitmap /hbitmap/serialize/zeroes --- PASS 4 ahci-test /x86_64/ahci/hba_spec PASS 43 test-hbitmap /hbitmap/next_dirty_area/next_dirty_area_4 MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-bdrv-drain -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bdrv-drain" ==8113==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-bdrv-drain /bdrv-drain/nested PASS 2 test-bdrv-drain /bdrv-drain/multiparent PASS 3 test-bdrv-drain /bdrv-drain/set_aio_context --- PASS 20 test-bdrv-drain /bdrv-drain/iothread/drain_subtree PASS 21 test-bdrv-drain /bdrv-drain/blockjob/drain_all PASS 22 test-bdrv-drain /bdrv-drain/blockjob/drain ==8110==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 23 test-bdrv-drain /bdrv-drain/blockjob/drain_subtree PASS 24 test-bdrv-drain /bdrv-drain/blockjob/error/drain_all PASS 25 test-bdrv-drain /bdrv-drain/blockjob/error/drain --- PASS 39 test-bdrv-drain /bdrv-drain/attach/drain PASS 5 ahci-test /x86_64/ahci/hba_enable MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-bdrv-graph-mod -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bdrv-graph-mod" ==8159==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-bdrv-graph-mod /bdrv-graph-mod/update-perm-tree PASS 2 test-bdrv-graph-mod /bdrv-graph-mod/should-update-child ==8157==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-blockjob -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-blockjob" ==8168==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-blockjob /blockjob/ids PASS 2 test-blockjob /blockjob/cancel/created PASS 3 test-blockjob /blockjob/cancel/running --- PASS 8 test-blockjob /blockjob/cancel/concluded MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-blockjob-txn -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-blockjob-txn" PASS 6 ahci-test /x86_64/ahci/identify ==8174==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-blockjob-txn /single/success PASS 2 test-blockjob-txn /single/failure PASS 3 test-blockjob-txn /single/cancel --- PASS 6 test-blockjob-txn /pair/cancel PASS 7 test-blockjob-txn /pair/fail-cancel-race MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-block-backend -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-block-backend" ==8176==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8181==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-block-backend /block-backend/drain_aio_error PASS 2 test-block-backend /block-backend/drain_all_aio_error MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-block-iothread -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-block-iothread" PASS 7 ahci-test /x86_64/ahci/max ==8190==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-block-iothread /sync-op/pread PASS 2 test-block-iothread /sync-op/pwrite PASS 3 test-block-iothread /sync-op/load_vmstate --- PASS 15 test-block-iothread /propagate/diamond PASS 16 test-block-iothread /propagate/mirror MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-image-locking -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-image-locking" ==8192==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8212==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-image-locking /image-locking/basic PASS 2 test-image-locking /image-locking/set-perm-abort MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-x86-cpuid -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-x86-cpuid" --- PASS 4 test-xbzrle /xbzrle/encode_decode_1_byte PASS 5 test-xbzrle /xbzrle/encode_decode_overflow PASS 8 ahci-test /x86_64/ahci/reset ==8228==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8228==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc98ab7000; bottom 0x7f6a659fe000; size: 0x0092330b9000 (627921620992) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 6 test-xbzrle /xbzrle/encode_decode --- PASS 133 test-cutils /cutils/strtosz/erange PASS 134 test-cutils /cutils/strtosz/metric MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-shift128 -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-shift128" ==8240==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-shift128 /host-utils/test_lshift PASS 2 test-shift128 /host-utils/test_rshift MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-mul64 -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-mul64" ==8240==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffd869e8000; bottom 0x7f71117fe000; size: 0x008c751ea000 (603260362752) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 1 test-mul64 /host-utils/mulu64 --- PASS 9 test-int128 /int128/int128_gt PASS 10 test-int128 /int128/int128_rshift MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/rcutorture -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="rcutorture" ==8262==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8262==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fffd5dde000; bottom 0x7f7850bfe000; size: 0x0087851e0000 (582053920768) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 1 rcutorture /rcu/torture/1reader PASS 11 ahci-test /x86_64/ahci/io/pio/lba28/simple/high ==8295==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 2 rcutorture /rcu/torture/10readers MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-rcu-list -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-rcu-list" ==8295==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffe6e62e000; bottom 0x7f1b1fbfe000; size: 0x00e34ea30000 (976276881408) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 12 ahci-test /x86_64/ahci/io/pio/lba28/double/zero ==8308==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-rcu-list /rcu/qlist/single-threaded ==8308==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc54a9f000; bottom 0x7f5c1bdfe000; size: 0x00a038ca1000 (688147533824) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 2 test-rcu-list /rcu/qlist/short-few PASS 13 ahci-test /x86_64/ahci/io/pio/lba28/double/low ==8341==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8341==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc4b8b5000; bottom 0x7f782c7fe000; size: 0x00841f0b7000 (567456526336) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 14 ahci-test /x86_64/ahci/io/pio/lba28/double/high ==8347==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
==8347==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffeb2bc8000; bottom 0x7fd572124000; size: 0x002940aa4000 (177178558464) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 3 test-rcu-list /rcu/qlist/long-many MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-rcu-simpleq -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-rcu-simpleq" PASS 15 ahci-test /x86_64/ahci/io/pio/lba28/long/zero PASS 1 test-rcu-simpleq /rcu/qsimpleq/single-threaded ==8360==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8360==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc5ebf2000; bottom 0x7f8d6cdfe000; size: 0x006ef1df4000 (476504342528) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 2 test-rcu-simpleq /rcu/qsimpleq/short-few PASS 16 ahci-test /x86_64/ahci/io/pio/lba28/long/low ==8393==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8393==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc1e90d000; bottom 0x7fef47124000; size: 0x000cd77e9000 (55155003392) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 17 ahci-test /x86_64/ahci/io/pio/lba28/long/high PASS 3 test-rcu-simpleq /rcu/qsimpleq/long-many MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-rcu-tailq -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-rcu-tailq" ==8399==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 18 ahci-test /x86_64/ahci/io/pio/lba28/short/zero PASS 1 test-rcu-tailq /rcu/qtailq/single-threaded ==8412==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 test-rcu-tailq /rcu/qtailq/short-few PASS 19 ahci-test /x86_64/ahci/io/pio/lba28/short/low ==8445==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 20 ahci-test /x86_64/ahci/io/pio/lba28/short/high PASS 3 test-rcu-tailq /rcu/qtailq/long-many MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-qdist -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-qdist" ==8451==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-qdist /qdist/none PASS 2 test-qdist /qdist/pr PASS 3 test-qdist /qdist/single/empty --- PASS 7 test-qdist /qdist/binning/expand PASS 8 test-qdist /qdist/binning/shrink MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-qht -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-qht" ==8451==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffcd1fb1000; bottom 0x7f8bae7fe000; size: 0x0071237b3000 (485926580224) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 21 ahci-test /x86_64/ahci/io/pio/lba48/simple/zero ==8466==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
==8466==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffd0b06f000; bottom 0x7fd8d85fe000; size: 0x002432a71000 (155468632064) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 22 ahci-test /x86_64/ahci/io/pio/lba48/simple/low ==8472==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8472==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffe2c664000; bottom 0x7f11299fe000; size: 0x00ed02c66000 (1017953804288) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 23 ahci-test /x86_64/ahci/io/pio/lba48/simple/high ==8478==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8478==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffdb1ded000; bottom 0x7f37fd1fe000; size: 0x00c5b4bef000 (849140969472) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 24 ahci-test /x86_64/ahci/io/pio/lba48/double/zero ==8484==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8484==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc4f4ff000; bottom 0x7ff9595fe000; size: 0x0002f5f01000 (12716085248) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 25 ahci-test /x86_64/ahci/io/pio/lba48/double/low ==8490==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
==8490==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffdb07bb000; bottom 0x7ffbc8dfe000; size: 0x0001e79bd000 (8180715520) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 26 ahci-test /x86_64/ahci/io/pio/lba48/double/high ==8496==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8496==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7fff207e2000; bottom 0x7fb6ffdfe000; size: 0x0048209e4000 (309784887296) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 27 ahci-test /x86_64/ahci/io/pio/lba48/long/zero ==8502==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8502==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc7d92d000; bottom 0x7f0b65b7c000; size: 0x00f117db1000 (1035487350784) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 28 ahci-test /x86_64/ahci/io/pio/lba48/long/low ==8508==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8508==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffe6de73000; bottom 0x7fc79a9fe000; size: 0x0036d3475000 (235472900096) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 29 ahci-test /x86_64/ahci/io/pio/lba48/long/high ==8514==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 30 ahci-test /x86_64/ahci/io/pio/lba48/short/zero ==8520==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-qht /qht/mode/default PASS 31 ahci-test /x86_64/ahci/io/pio/lba48/short/low PASS 2 test-qht /qht/mode/resize MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-qht-par -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-qht-par" ==8526==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 32 ahci-test /x86_64/ahci/io/pio/lba48/short/high PASS 1 test-qht-par /qht/parallel/2threads-0%updates-1s ==8542==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 33 ahci-test /x86_64/ahci/io/dma/lba28/fragmented PASS 2 test-qht-par /qht/parallel/2threads-20%updates-1s MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-bitops -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bitops" ==8555==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-bitops /bitops/sextract32 PASS 2 test-bitops /bitops/sextract64 PASS 3 test-bitops /bitops/half_shuffle32 --- PASS 1 check-qom-interface /qom/interface/direct_impl PASS 2 check-qom-interface /qom/interface/intermediate_impl MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/check-qom-proplist -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="check-qom-proplist" ==8580==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 check-qom-proplist /qom/proplist/createlist PASS 2 check-qom-proplist /qom/proplist/createv PASS 3 check-qom-proplist /qom/proplist/createcmdline --- PASS 4 test-write-threshold /write-threshold/not-trigger PASS 5 test-write-threshold /write-threshold/trigger MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-crypto-hash -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-crypto-hash" ==8607==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-crypto-hash /crypto/hash/iov PASS 2 test-crypto-hash /crypto/hash/alloc PASS 3 test-crypto-hash /crypto/hash/prealloc --- PASS 15 test-crypto-secret /crypto/secret/crypt/missingiv PASS 16 test-crypto-secret /crypto/secret/crypt/badiv MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-crypto-tlscredsx509 -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-crypto-tlscredsx509" ==8630==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 37 ahci-test /x86_64/ahci/io/dma/lba28/simple/high PASS 1 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/perfectserver PASS 2 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/perfectclient PASS 3 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodca1 ==8645==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodca2 PASS 38 ahci-test /x86_64/ahci/io/dma/lba28/double/zero PASS 5 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodca3 PASS 6 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/badca1 PASS 7 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/badca2 PASS 8 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/badca3 ==8651==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 9 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver1 PASS 39 ahci-test /x86_64/ahci/io/dma/lba28/double/low ==8657==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 40 ahci-test /x86_64/ahci/io/dma/lba28/double/high PASS 10 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver2 ==8663==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 11 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver3 PASS 41 ahci-test /x86_64/ahci/io/dma/lba28/long/zero PASS 12 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver4 ==8669==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 42 ahci-test /x86_64/ahci/io/dma/lba28/long/low PASS 13 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver5 PASS 14 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/goodserver6 --- PASS 32 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/inactive1 PASS 33 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/inactive2 PASS 34 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/inactive3 ==8675==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 35 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/chain1 PASS 36 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/chain2 PASS 37 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/missingca --- PASS 39 test-crypto-tlscredsx509 /qcrypto/tlscredsx509/missingclient PASS 43 ahci-test /x86_64/ahci/io/dma/lba28/long/high MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-crypto-tlssession -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-crypto-tlssession" ==8682==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-crypto-tlssession /qcrypto/tlssession/psk PASS 44 ahci-test /x86_64/ahci/io/dma/lba28/short/zero PASS 2 test-crypto-tlssession /qcrypto/tlssession/basicca ==8692==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 3 test-crypto-tlssession /qcrypto/tlssession/differentca PASS 45 ahci-test /x86_64/ahci/io/dma/lba28/short/low ==8698==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 test-crypto-tlssession /qcrypto/tlssession/altname1 PASS 46 ahci-test /x86_64/ahci/io/dma/lba28/short/high ==8704==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 5 test-crypto-tlssession /qcrypto/tlssession/altname2 PASS 47 ahci-test /x86_64/ahci/io/dma/lba48/simple/zero PASS 6 test-crypto-tlssession /qcrypto/tlssession/altname3 PASS 7 test-crypto-tlssession /qcrypto/tlssession/altname4 ==8710==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 8 test-crypto-tlssession /qcrypto/tlssession/altname5 PASS 48 ahci-test /x86_64/ahci/io/dma/lba48/simple/low PASS 9 test-crypto-tlssession /qcrypto/tlssession/altname6 ==8716==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 49 ahci-test /x86_64/ahci/io/dma/lba48/simple/high PASS 10 test-crypto-tlssession /qcrypto/tlssession/wildcard1 ==8722==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 11 test-crypto-tlssession /qcrypto/tlssession/wildcard2 PASS 12 test-crypto-tlssession /qcrypto/tlssession/wildcard3 PASS 50 ahci-test /x86_64/ahci/io/dma/lba48/double/zero ==8729==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 51 ahci-test /x86_64/ahci/io/dma/lba48/double/low ==8735==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 13 test-crypto-tlssession /qcrypto/tlssession/wildcard4 PASS 52 ahci-test /x86_64/ahci/io/dma/lba48/double/high ==8741==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 14 test-crypto-tlssession /qcrypto/tlssession/wildcard5 PASS 15 test-crypto-tlssession /qcrypto/tlssession/wildcard6 PASS 16 test-crypto-tlssession /qcrypto/tlssession/cachain PASS 53 ahci-test /x86_64/ahci/io/dma/lba48/long/zero MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-qga -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-qga" ==8748==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-qga /qga/sync-delimited PASS 2 test-qga /qga/sync PASS 3 test-qga /qga/ping --- PASS 16 test-qga /qga/invalid-args PASS 17 test-qga /qga/fsfreeze-status PASS 54 ahci-test /x86_64/ahci/io/dma/lba48/long/low ==8760==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 18 test-qga /qga/blacklist PASS 19 test-qga /qga/config PASS 20 test-qga /qga/guest-exec PASS 21 test-qga /qga/guest-exec-invalid PASS 55 ahci-test /x86_64/ahci/io/dma/lba48/long/high ==8773==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 22 test-qga /qga/guest-get-osinfo PASS 23 test-qga /qga/guest-get-host-name PASS 24 test-qga /qga/guest-get-timezone --- PASS 56 ahci-test /x86_64/ahci/io/dma/lba48/short/zero PASS 1 test-util-filemonitor /util/filemonitor MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-util-sockets -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-util-sockets" ==8790==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-util-sockets /util/socket/is-socket/bad PASS 2 test-util-sockets /util/socket/is-socket/good PASS 3 test-util-sockets /socket/fd-pass/name/good --- PASS 4 test-authz-listfile /auth/list/explicit/deny PASS 5 test-authz-listfile /auth/list/explicit/allow MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-io-task -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-io-task" ==8818==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-io-task /crypto/task/complete PASS 2 test-io-task /crypto/task/datafree PASS 3 test-io-task /crypto/task/failure --- PASS 5 test-io-channel-file /io/channel/pipe/async MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-io-channel-tls -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-io-channel-tls" PASS 58 ahci-test /x86_64/ahci/io/dma/lba48/short/high ==8885==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 1 test-io-channel-tls /qio/channel/tls/basic MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-io-channel-command -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-io-channel-command" PASS 1 test-io-channel-command /io/channel/command/fifo/sync --- PASS 17 test-crypto-pbkdf /crypto/pbkdf/nonrfc/sha384/iter1200 PASS 18 test-crypto-pbkdf /crypto/pbkdf/nonrfc/ripemd160/iter1200 MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-crypto-ivgen -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-crypto-ivgen" ==8906==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-crypto-ivgen /crypto/ivgen/plain/1 PASS 2 test-crypto-ivgen /crypto/ivgen/plain/1f2e3d4c PASS 3 test-crypto-ivgen /crypto/ivgen/plain/1f2e3d4c5b6a7988 --- PASS 1 test-logging /logging/parse_range PASS 2 test-logging /logging/parse_path MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-replication -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-replication" ==8947==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8945==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 test-replication /replication/primary/read PASS 2 test-replication /replication/primary/write PASS 61 ahci-test /x86_64/ahci/flush/simple ==8956==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 3 test-replication /replication/primary/start PASS 4 test-replication /replication/primary/stop PASS 5 test-replication /replication/primary/do_checkpoint PASS 6 test-replication /replication/primary/get_error_all PASS 62 ahci-test /x86_64/ahci/flush/retry ==8962==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 7 test-replication /replication/secondary/read ==8967==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 8 test-replication /replication/secondary/write PASS 63 ahci-test /x86_64/ahci/flush/migrate ==8976==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8981==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==8947==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffc17022000; bottom 0x7fa4f2cfc000; size: 0x005724326000 (374269435904) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 9 test-replication /replication/secondary/start PASS 64 ahci-test /x86_64/ahci/migrate/sanity ==9008==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==9013==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 10 test-replication /replication/secondary/stop PASS 65 ahci-test /x86_64/ahci/migrate/dma/simple ==9022==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==9027==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 11 test-replication /replication/secondary/do_checkpoint PASS 12 test-replication /replication/secondary/get_error_all MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} tests/test-bufferiszero -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-bufferiszero" PASS 66 ahci-test /x86_64/ahci/migrate/dma/halted ==9040==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==9045==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 67 ahci-test /x86_64/ahci/migrate/ncq/simple ==9054==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! ==9059==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 68 ahci-test /x86_64/ahci/migrate/ncq/halted ==9068==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 69 ahci-test /x86_64/ahci/cdrom/eject ==9073==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 70 ahci-test /x86_64/ahci/cdrom/dma/single ==9079==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 71 ahci-test /x86_64/ahci/cdrom/dma/multi ==9085==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 72 ahci-test /x86_64/ahci/cdrom/pio/single ==9091==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
==9091==WARNING: ASan is ignoring requested __asan_handle_no_return: stack top: 0x7ffdd7f93000; bottom 0x7f75251fe000; size: 0x0088b2d95000 (587116138496) False positive error reports may follow For details see https://github.com/google/sanitizers/issues/189 PASS 73 ahci-test /x86_64/ahci/cdrom/pio/multi ==9097==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 74 ahci-test /x86_64/ahci/cdrom/pio/bcl MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/hd-geo-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="hd-geo-test" PASS 1 hd-geo-test /x86_64/hd-geo/ide/none ==9111==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 hd-geo-test /x86_64/hd-geo/ide/drive/cd_0 ==9117==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 3 hd-geo-test /x86_64/hd-geo/ide/drive/mbr/blank ==9123==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 hd-geo-test /x86_64/hd-geo/ide/drive/mbr/lba ==9129==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 5 hd-geo-test /x86_64/hd-geo/ide/drive/mbr/chs ==9135==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 6 hd-geo-test /x86_64/hd-geo/ide/device/mbr/blank ==9141==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 7 hd-geo-test /x86_64/hd-geo/ide/device/mbr/lba ==9147==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 8 hd-geo-test /x86_64/hd-geo/ide/device/mbr/chs ==9153==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 9 hd-geo-test /x86_64/hd-geo/ide/device/user/chs ==9158==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 10 hd-geo-test /x86_64/hd-geo/ide/device/user/chst MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/boot-order-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="boot-order-test" PASS 1 test-bufferiszero /cutils/bufferiszero --- Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9243==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 bios-tables-test /x86_64/acpi/piix4 Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9249==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 bios-tables-test /x86_64/acpi/q35 Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9255==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 3 bios-tables-test /x86_64/acpi/piix4/bridge Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9261==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 bios-tables-test /x86_64/acpi/piix4/ipmi Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9267==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 5 bios-tables-test /x86_64/acpi/piix4/cpuhp Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9274==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 6 bios-tables-test /x86_64/acpi/piix4/memhp Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9280==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 7 bios-tables-test /x86_64/acpi/piix4/numamem Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9286==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 8 bios-tables-test /x86_64/acpi/piix4/dimmpxm Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9295==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 9 bios-tables-test /x86_64/acpi/q35/bridge Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9301==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 10 bios-tables-test /x86_64/acpi/q35/mmio64 Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9307==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 11 bios-tables-test /x86_64/acpi/q35/ipmi Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9313==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 12 bios-tables-test /x86_64/acpi/q35/cpuhp Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9320==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 13 bios-tables-test /x86_64/acpi/q35/memhp Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9326==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 14 bios-tables-test /x86_64/acpi/q35/numamem Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9332==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 15 bios-tables-test /x86_64/acpi/q35/dimmpxm MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/boot-serial-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="boot-serial-test" PASS 1 boot-serial-test /x86_64/boot-serial/isapc --- PASS 1 i440fx-test /x86_64/i440fx/defaults PASS 2 i440fx-test /x86_64/i440fx/pam PASS 3 i440fx-test /x86_64/i440fx/firmware/bios ==9416==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 4 i440fx-test /x86_64/i440fx/firmware/pflash MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/fw_cfg-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="fw_cfg-test" PASS 1 fw_cfg-test /x86_64/fw_cfg/signature --- MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/drive_del-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="drive_del-test" PASS 1 drive_del-test /x86_64/drive_del/without-dev PASS 2 drive_del-test /x86_64/drive_del/after_failed_device_add ==9504==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 3 drive_del-test /x86_64/blockdev/drive_del_device_del MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/wdt_ib700-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="wdt_ib700-test" PASS 1 wdt_ib700-test /x86_64/wdt_ib700/pause --- PASS 1 usb-hcd-uhci-test /x86_64/uhci/pci/init PASS 2 usb-hcd-uhci-test /x86_64/uhci/pci/port1 PASS 3 usb-hcd-uhci-test /x86_64/uhci/pci/hotplug ==9699==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 4 usb-hcd-uhci-test /x86_64/uhci/pci/hotplug/usb-storage MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/usb-hcd-xhci-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="usb-hcd-xhci-test" PASS 1 usb-hcd-xhci-test /x86_64/xhci/pci/init PASS 2 usb-hcd-xhci-test /x86_64/xhci/pci/hotplug ==9708==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 3 usb-hcd-xhci-test /x86_64/xhci/pci/hotplug/usb-uas PASS 4 usb-hcd-xhci-test /x86_64/xhci/pci/hotplug/usb-ccid MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/cpu-plug-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="cpu-plug-test" --- Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9814==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 1 vmgenid-test /x86_64/vmgenid/vmgenid/set-guid Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9820==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 2 vmgenid-test /x86_64/vmgenid/vmgenid/set-guid-auto Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9826==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 3 vmgenid-test /x86_64/vmgenid/vmgenid/query-monitor MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/tpm-crb-swtpm-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="tpm-crb-swtpm-test" SKIP 1 tpm-crb-swtpm-test /x86_64/tpm/crb-swtpm/test # SKIP swtpm not in PATH or missing --tpm2 support --- Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9931==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9936==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 3 migration-test /x86_64/migration/fd_proto Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9944==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9949==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
PASS 4 migration-test /x86_64/migration/postcopy/unix PASS 5 migration-test /x86_64/migration/postcopy/recovery Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9979==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9984==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 6 migration-test /x86_64/migration/precopy/unix Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9993==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==9998==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! PASS 7 migration-test /x86_64/migration/precopy/tcp Could not access KVM kernel module: No such file or directory qemu-system-x86_64: failed to initialize KVM: No such file or directory qemu-system-x86_64: Back to tcg accelerator ==10007==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases! 
Could not access KVM kernel module: No such file or directory
qemu-system-x86_64: failed to initialize KVM: No such file or directory
qemu-system-x86_64: Back to tcg accelerator
==10012==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 8 migration-test /x86_64/migration/xbzrle/unix
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/test-x86-cpuid-compat -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="test-x86-cpuid-compat"
PASS 1 test-x86-cpuid-compat /x86/cpuid/parsing-plus-minus
---
PASS 6 numa-test /x86_64/numa/pc/dynamic/cpu
MALLOC_PERTURB_=${MALLOC_PERTURB_:-$(( ${RANDOM:-0} % 255 + 1))} QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 QTEST_QEMU_IMG=qemu-img tests/qmp-test -m=quick -k --tap < /dev/null | ./scripts/tap-driver.pl --test-name="qmp-test"
PASS 1 qmp-test /x86_64/qmp/protocol
==10341==WARNING: ASan doesn't fully support makecontext/swapcontext functions and may produce false positives in some cases!
PASS 2 qmp-test /x86_64/qmp/oob
PASS 3 qmp-test /x86_64/qmp/preconfig
PASS 4 qmp-test /x86_64/qmp/missing-any-arg
---
PASS 5 device-introspect-test /x86_64/device/introspect/abstract-interfaces
=================================================================
==10589==ERROR: LeakSanitizer: detected memory leaks

Direct leak of 32 byte(s) in 1 object(s) allocated from:
    #0 0x561de4fecb2e in calloc (/tmp/qemu-test/build/x86_64-softmmu/qemu-system-x86_64+0x19fdb2e)
---
SUMMARY: AddressSanitizer: 64 byte(s) leaked in 2 allocation(s).
/tmp/qemu-test/src/tests/libqtest.c:137: kill_qemu() tried to terminate QEMU process but encountered exit status 1
ERROR - too few tests run (expected 6, got 5)
make: *** [/tmp/qemu-test/src/tests/Makefile.include:894: check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):

The full log is available at
http://patchew.org/logs/20190702121106.28374-1-slp@redhat.com/testing.asan/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez ` (6 preceding siblings ...) 2019-07-02 15:30 ` no-reply @ 2019-07-03 9:58 ` Stefan Hajnoczi 2019-07-18 15:21 ` Sergio Lopez 2019-08-29 9:02 ` Jing Liu 8 siblings, 1 reply; 68+ messages in thread From: Stefan Hajnoczi @ 2019-07-03 9:58 UTC (permalink / raw) To: Sergio Lopez Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth [-- Attachment #1: Type: text/plain, Size: 3602 bytes --] On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > Microvm is a machine type inspired by both NEMU and Firecracker, and > constructed after the machine model implemented by the latter. > > It's main purpose is providing users a KVM-only machine type with fast > boot times, minimal attack surface (measured as the number of IO ports > and MMIO regions exposed to the Guest) and small footprint (specially > when combined with the ongoing QEMU modularization effort). > > Normally, other than the device support provided by KVM itself, > microvm only supports virtio-mmio devices. Microvm also includes a > legacy mode, which adds an ISA bus with a 16550A serial port, useful > for being able to see the early boot kernel messages. > > Microvm only supports booting PVH-enabled Linux ELF images. Booting > other PVH-enabled kernels may be possible, but due to the lack of ACPI > and firmware, we're relying on the command line for specifying the > location of the virtio-mmio transports. If there's an interest on > using this machine type with other kernels, we'll try to find some > kind of middle ground solution. 
> > This is the list of the exposed IO ports and MMIO regions when running > in non-legacy mode: > > address-space: memory > 00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio > 00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio > 00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio > 00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio > 00000000d0000800-00000000d00009ff (prio 0, i/o): virtio-mmio > 00000000d0000a00-00000000d0000bff (prio 0, i/o): virtio-mmio > 00000000d0000c00-00000000d0000dff (prio 0, i/o): virtio-mmio > 00000000d0000e00-00000000d0000fff (prio 0, i/o): virtio-mmio > 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi > > address-space: I/O > 0000000000000000-000000000000ffff (prio 0, i/o): io > 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic > 0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit > 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic > 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic > 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr > 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr > > A QEMU instance with the microvm machine type can be invoked this way: > > - Normal mode: > > qemu-system-x86_64 -M microvm -m 512m -smp 2 \ > -kernel vmlinux -append "console=hvc0 root=/dev/vda" \ > -nodefaults -no-user-config \ > -chardev pty,id=virtiocon0,server \ > -device virtio-serial-device \ > -device virtconsole,chardev=virtiocon0 \ > -drive id=test,file=test.img,format=raw,if=none \ > -device virtio-blk-device,drive=test \ > -netdev tap,id=tap0,script=no,downscript=no \ > -device virtio-net-device,netdev=tap0 > > - Legacy mode: > > qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \ > -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \ > -nodefaults -no-user-config \ > -drive id=test,file=test.img,format=raw,if=none \ > -device virtio-blk-device,drive=test \ > -netdev tap,id=tap0,script=no,downscript=no \ > -device 
virtio-net-device,netdev=tap0 \
> -serial stdio

Please post metrics that compare this against a minimal Q35.

With qboot it was later found that SeaBIOS can achieve comparable boot
times, so it wasn't worth maintaining qboot.

Data is needed to show that microvm is really a significant improvement
over a minimal Q35.

Stefan
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-03 9:58 ` Stefan Hajnoczi @ 2019-07-18 15:21 ` Sergio Lopez 2019-07-19 10:29 ` Stefan Hajnoczi 0 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-18 15:21 UTC (permalink / raw) To: Stefan Hajnoczi Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth [-- Attachment #1: Type: text/plain, Size: 30017 bytes --] Stefan Hajnoczi <stefanha@gmail.com> writes: > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: >> Microvm is a machine type inspired by both NEMU and Firecracker, and >> constructed after the machine model implemented by the latter. >> >> It's main purpose is providing users a KVM-only machine type with fast >> boot times, minimal attack surface (measured as the number of IO ports >> and MMIO regions exposed to the Guest) and small footprint (specially >> when combined with the ongoing QEMU modularization effort). >> >> Normally, other than the device support provided by KVM itself, >> microvm only supports virtio-mmio devices. Microvm also includes a >> legacy mode, which adds an ISA bus with a 16550A serial port, useful >> for being able to see the early boot kernel messages. >> >> Microvm only supports booting PVH-enabled Linux ELF images. Booting >> other PVH-enabled kernels may be possible, but due to the lack of ACPI >> and firmware, we're relying on the command line for specifying the >> location of the virtio-mmio transports. If there's an interest on >> using this machine type with other kernels, we'll try to find some >> kind of middle ground solution. 
>> >> This is the list of the exposed IO ports and MMIO regions when running >> in non-legacy mode: >> >> address-space: memory >> 00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio >> 00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio >> 00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio >> 00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio >> 00000000d0000800-00000000d00009ff (prio 0, i/o): virtio-mmio >> 00000000d0000a00-00000000d0000bff (prio 0, i/o): virtio-mmio >> 00000000d0000c00-00000000d0000dff (prio 0, i/o): virtio-mmio >> 00000000d0000e00-00000000d0000fff (prio 0, i/o): virtio-mmio >> 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi >> >> address-space: I/O >> 0000000000000000-000000000000ffff (prio 0, i/o): io >> 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic >> 0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit >> 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic >> 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic >> 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr >> 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr >> >> A QEMU instance with the microvm machine type can be invoked this way: >> >> - Normal mode: >> >> qemu-system-x86_64 -M microvm -m 512m -smp 2 \ >> -kernel vmlinux -append "console=hvc0 root=/dev/vda" \ >> -nodefaults -no-user-config \ >> -chardev pty,id=virtiocon0,server \ >> -device virtio-serial-device \ >> -device virtconsole,chardev=virtiocon0 \ >> -drive id=test,file=test.img,format=raw,if=none \ >> -device virtio-blk-device,drive=test \ >> -netdev tap,id=tap0,script=no,downscript=no \ >> -device virtio-net-device,netdev=tap0 >> >> - Legacy mode: >> >> qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \ >> -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \ >> -nodefaults -no-user-config \ >> -drive id=test,file=test.img,format=raw,if=none \ >> -device virtio-blk-device,drive=test \ >> -netdev 
tap,id=tap0,script=no,downscript=no \
>> -device virtio-net-device,netdev=tap0 \
>> -serial stdio
>
> Please post metrics that compare this against a minimal Q35.
>
> With qboot it was later found that SeaBIOS can achieve comparable boot
> times, so it wasn't worth maintaining qboot.
>
> Data is needed to show that microvm is really a significant improvement
> over a minimal Q35.

I've just run some numbers using Stefano Garzarella's qemu-boot-time
scripts [1] on a server with 2x Intel Xeon Silver 4114 2.20GHz, using
upstream QEMU (474f3938d79ab36b9231c9ad3b5a9314c2aeacde) built with
minimal features [2]. The VM boots a minimal kernel [3] without initrd,
using a kata container image as root via virtio-blk (though this isn't
really relevant, as we're only taking measurements until the kernel is
about to exec init).

To make the comparison as fair as possible, I've used a minimal q35
machine with as few devices as possible. Disabling HPET and PIT at the
same time caused the kernel to get stuck on boot, so I ran two
iterations: one with HPET only (PIT disabled) and the other with PIT
only (HPET disabled):

-----------------
| Q35 with HPET |
-----------------

Command line:

./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm \
  -M q35,smbus=off,nvdimm=off,pit=off,vmport=off,sata=off,usb=off,graphics=off \
  -kernel /root/src/images/vmlinux-5.2 \
  -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" \
  -smp 1 -nodefaults -no-user-config \
  -chardev pty,id=virtiocon0,server \
  -device virtio-serial \
  -device virtconsole,chardev=virtiocon0 \
  -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none \
  -device virtio-blk,drive=test

Average boot times after 10 consecutive runs:

qemu_init_end: 77.637936
linux_start_kernel: 117.082526 (+39.44459)
linux_start_user: 364.629972 (+247.547446)

Memory tree:

address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @pc.ram 0000000000000000-000000001fffffff
0000000000000000-ffffffffffffffff (prio -1, i/o): pci 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom 00000000000e0000-00000000000fffff (prio 1, i/o): alias isa-bios @pc.bios 0000000000020000-000000000003ffff 00000000febf4000-00000000febf7fff (prio 1, i/o): virtio-pci 00000000febf4000-00000000febf4fff (prio 0, i/o): virtio-pci-common 00000000febf5000-00000000febf5fff (prio 0, i/o): virtio-pci-isr 00000000febf6000-00000000febf6fff (prio 0, i/o): virtio-pci-device 00000000febf7000-00000000febf7fff (prio 0, i/o): virtio-pci-notify 00000000febf8000-00000000febfbfff (prio 1, i/o): virtio-pci 00000000febf8000-00000000febf8fff (prio 0, i/o): virtio-pci-common 00000000febf9000-00000000febf9fff (prio 0, i/o): virtio-pci-isr 00000000febfa000-00000000febfafff (prio 0, i/o): virtio-pci-device 00000000febfb000-00000000febfbfff (prio 0, i/o): virtio-pci-notify 00000000febfe000-00000000febfefff (prio 1, i/o): virtio-serial-pci-msix 00000000febfe000-00000000febfe01f (prio 0, i/o): msix-table 00000000febfe800-00000000febfe807 (prio 0, i/o): msix-pba 00000000febff000-00000000febfffff (prio 1, i/o): virtio-blk-pci-msix 00000000febff000-00000000febff01f (prio 0, i/o): msix-table 00000000febff800-00000000febff807 (prio 0, i/o): msix-pba 00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff 00000000000c0000-00000000000c2fff (prio 1000, i/o): alias kvmvapic-rom @pc.ram 00000000000c0000-00000000000c2fff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c0000-00000000000c3fff [disabled] 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c0000-00000000000c3fff [disabled] 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff [disabled] 
00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c4000-00000000000c7fff [disabled] 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c4000-00000000000c7fff [disabled] 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c8000-00000000000cbfff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c8000-00000000000cbfff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pci 00000000000c8000-00000000000cbfff [disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000cc000-00000000000cffff [disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000cc000-00000000000cffff [disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pci 00000000000cc000-00000000000cffff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d0000-00000000000d3fff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d0000-00000000000d3fff 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pci 00000000000d0000-00000000000d3fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pc.ram 
00000000000d4000-00000000000d7fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d4000-00000000000d7fff 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pci 00000000000d4000-00000000000d7fff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d8000-00000000000dbfff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d8000-00000000000dbfff 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pci 00000000000d8000-00000000000dbfff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000dc000-00000000000dffff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000dc000-00000000000dffff 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pci 00000000000dc000-00000000000dffff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e0000-00000000000e3fff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e0000-00000000000e3fff 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pci 00000000000e0000-00000000000e3fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e4000-00000000000e7fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e4000-00000000000e7fff 00000000000e4000-00000000000e7fff (prio 1, i/o): 
alias pam-pci @pci 00000000000e4000-00000000000e7fff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e8000-00000000000ebfff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e8000-00000000000ebfff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pci 00000000000e8000-00000000000ebfff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-ram @pc.ram 00000000000ec000-00000000000effff 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pc.ram 00000000000ec000-00000000000effff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-rom @pc.ram 00000000000ec000-00000000000effff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pci 00000000000ec000-00000000000effff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000f0000-00000000000fffff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000f0000-00000000000fffff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pci 00000000000f0000-00000000000fffff [disabled] 0000000020000000-0000000020000000 (prio 1, i/o): tseg-blackhole [disabled] 00000000b0000000-00000000bfffffff (prio 0, i/o): pcie-mmcfg-mmio 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic 00000000fed00000-00000000fed003ff (prio 0, i/o): hpet 00000000fed1c000-00000000fed1ffff (prio 1, i/o): lpc-rcrb-mmio 00000000feda0000-00000000fedbffff (prio 1, i/o): alias smram-open-high @pc.ram 00000000000a0000-00000000000bffff [disabled] 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi address-space: I/O 0000000000000000-000000000000ffff (prio 0, i/o): io 
0000000000000000-0000000000000007 (prio 0, i/o): dma-chan 0000000000000008-000000000000000f (prio 0, i/o): dma-cont 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic 0000000000000060-0000000000000060 (prio 0, i/o): i8042-data 0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd 0000000000000070-0000000000000071 (prio 0, i/o): rtc 0000000000000070-0000000000000070 (prio 0, i/o): rtc-index 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic 0000000000000080-0000000000000080 (prio 0, i/o): ioport80 0000000000000081-0000000000000083 (prio 0, i/o): dma-page 0000000000000087-0000000000000087 (prio 0, i/o): dma-page 0000000000000089-000000000000008b (prio 0, i/o): dma-page 000000000000008f-000000000000008f (prio 0, i/o): dma-page 0000000000000092-0000000000000092 (prio 0, i/o): port92 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic 00000000000000b2-00000000000000b3 (prio 0, i/o): apm-io 00000000000000c0-00000000000000cf (prio 0, i/o): dma-chan 00000000000000d0-00000000000000df (prio 0, i/o): dma-cont 00000000000000f0-00000000000000f0 (prio 0, i/o): ioportF0 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr 0000000000000510-0000000000000511 (prio 0, i/o): fwcfg 0000000000000514-000000000000051b (prio 0, i/o): fwcfg.dma 0000000000000600-000000000000067f (prio 0, i/o): ich9-pm 0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt 0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt 0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr 0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0 0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi 0000000000000660-000000000000067f (prio 0, i/o): sm-tco 0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-mem-hotplug 0000000000000cf8-0000000000000cfb (prio 0, i/o): pci-conf-idx 0000000000000cf9-0000000000000cf9 (prio 1, i/o): lpc-reset-control 0000000000000cfc-0000000000000cff (prio 0, i/o): pci-conf-data 
000000000000c000-000000000000c07f (prio 1, i/o): virtio-pci
000000000000c080-000000000000c0bf (prio 1, i/o): virtio-pci

----------------
| Q35 with PIT |
----------------

Command line:

./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm \
  -M q35,smbus=off,nvdimm=off,pit=on,vmport=off,sata=off,usb=off,graphics=off \
  -no-hpet \
  -kernel /root/src/images/vmlinux-5.2 \
  -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" \
  -smp 1 -nodefaults -no-user-config \
  -chardev pty,id=virtiocon0,server \
  -device virtio-serial \
  -device virtconsole,chardev=virtiocon0 \
  -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none \
  -device virtio-blk,drive=test

Average boot times after 10 consecutive runs:

qemu_init_end: 77.467852
linux_start_kernel: 116.688472 (+39.22062)
linux_start_user: 363.033365 (+246.344893)

Memory tree:

address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @pc.ram 0000000000000000-000000001fffffff
0000000000000000-ffffffffffffffff (prio -1, i/o): pci
00000000000c0000-00000000000dffff (prio 1, rom): pc.rom
00000000000e0000-00000000000fffff (prio 1, i/o): alias isa-bios @pc.bios 0000000000020000-000000000003ffff
00000000febf4000-00000000febf7fff (prio 1, i/o): virtio-pci
00000000febf4000-00000000febf4fff (prio 0, i/o): virtio-pci-common
00000000febf5000-00000000febf5fff (prio 0, i/o): virtio-pci-isr
00000000febf6000-00000000febf6fff (prio 0, i/o): virtio-pci-device
00000000febf7000-00000000febf7fff (prio 0, i/o): virtio-pci-notify
00000000febf8000-00000000febfbfff (prio 1, i/o): virtio-pci
00000000febf8000-00000000febf8fff (prio 0, i/o): virtio-pci-common
00000000febf9000-00000000febf9fff (prio 0, i/o): virtio-pci-isr
00000000febfa000-00000000febfafff (prio 0, i/o): virtio-pci-device
00000000febfb000-00000000febfbfff (prio 0, i/o): virtio-pci-notify
00000000febfe000-00000000febfefff (prio 1, i/o): virtio-serial-pci-msix
00000000febfe000-00000000febfe01f (prio 0,
i/o): msix-table 00000000febfe800-00000000febfe807 (prio 0, i/o): msix-pba 00000000febff000-00000000febfffff (prio 1, i/o): virtio-blk-pci-msix 00000000febff000-00000000febff01f (prio 0, i/o): msix-table 00000000febff800-00000000febff807 (prio 0, i/o): msix-pba 00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff 00000000000c0000-00000000000c2fff (prio 1000, i/o): alias kvmvapic-rom @pc.ram 00000000000c0000-00000000000c2fff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c0000-00000000000c3fff [disabled] 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c0000-00000000000c3fff [disabled] 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff [disabled] 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c4000-00000000000c7fff [disabled] 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c4000-00000000000c7fff [disabled] 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c8000-00000000000cbfff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c8000-00000000000cbfff [disabled] 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pci 00000000000c8000-00000000000cbfff [disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000cc000-00000000000cffff 
[disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000cc000-00000000000cffff [disabled] 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pci 00000000000cc000-00000000000cffff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d0000-00000000000d3fff [disabled] 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d0000-00000000000d3fff 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pci 00000000000d0000-00000000000d3fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d4000-00000000000d7fff [disabled] 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d4000-00000000000d7fff 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pci 00000000000d4000-00000000000d7fff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d8000-00000000000dbfff [disabled] 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d8000-00000000000dbfff 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pci 00000000000d8000-00000000000dbfff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000dc000-00000000000dffff [disabled] 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-rom @pc.ram 
00000000000dc000-00000000000dffff 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pci 00000000000dc000-00000000000dffff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e0000-00000000000e3fff [disabled] 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e0000-00000000000e3fff 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pci 00000000000e0000-00000000000e3fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e4000-00000000000e7fff [disabled] 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e4000-00000000000e7fff 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pci 00000000000e4000-00000000000e7fff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e8000-00000000000ebfff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e8000-00000000000ebfff [disabled] 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pci 00000000000e8000-00000000000ebfff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-ram @pc.ram 00000000000ec000-00000000000effff 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pc.ram 00000000000ec000-00000000000effff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-rom @pc.ram 00000000000ec000-00000000000effff [disabled] 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pci 00000000000ec000-00000000000effff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): 
alias pam-ram @pc.ram 00000000000f0000-00000000000fffff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000f0000-00000000000fffff [disabled] 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pci 00000000000f0000-00000000000fffff [disabled] 0000000020000000-0000000020000000 (prio 1, i/o): tseg-blackhole [disabled] 00000000b0000000-00000000bfffffff (prio 0, i/o): pcie-mmcfg-mmio 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic 00000000fed1c000-00000000fed1ffff (prio 1, i/o): lpc-rcrb-mmio 00000000feda0000-00000000fedbffff (prio 1, i/o): alias smram-open-high @pc.ram 00000000000a0000-00000000000bffff [disabled] 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi address-space: I/O 0000000000000000-000000000000ffff (prio 0, i/o): io 0000000000000000-0000000000000007 (prio 0, i/o): dma-chan 0000000000000008-000000000000000f (prio 0, i/o): dma-cont 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic 0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit 0000000000000060-0000000000000060 (prio 0, i/o): i8042-data 0000000000000061-0000000000000061 (prio 0, i/o): pcspk 0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd 0000000000000070-0000000000000071 (prio 0, i/o): rtc 0000000000000070-0000000000000070 (prio 0, i/o): rtc-index 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic 0000000000000080-0000000000000080 (prio 0, i/o): ioport80 0000000000000081-0000000000000083 (prio 0, i/o): dma-page 0000000000000087-0000000000000087 (prio 0, i/o): dma-page 0000000000000089-000000000000008b (prio 0, i/o): dma-page 000000000000008f-000000000000008f (prio 0, i/o): dma-page 0000000000000092-0000000000000092 (prio 0, i/o): port92 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic 00000000000000b2-00000000000000b3 (prio 0, i/o): apm-io 00000000000000c0-00000000000000cf (prio 0, 
i/o): dma-chan
00000000000000d0-00000000000000df (prio 0, i/o): dma-cont
00000000000000f0-00000000000000f0 (prio 0, i/o): ioportF0
00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr
00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr
0000000000000510-0000000000000511 (prio 0, i/o): fwcfg
0000000000000514-000000000000051b (prio 0, i/o): fwcfg.dma
0000000000000600-000000000000067f (prio 0, i/o): ich9-pm
0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt
0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt
0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr
0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0
0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi
0000000000000660-000000000000067f (prio 0, i/o): sm-tco
0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-mem-hotplug
0000000000000cf8-0000000000000cfb (prio 0, i/o): pci-conf-idx
0000000000000cf9-0000000000000cf9 (prio 1, i/o): lpc-reset-control
0000000000000cfc-0000000000000cff (prio 0, i/o): pci-conf-data
000000000000c000-000000000000c07f (prio 1, i/o): virtio-pci
000000000000c080-000000000000c0bf (prio 1, i/o): virtio-pci

-----------
| microvm |
-----------

Command line:

./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm \
  -M microvm \
  -kernel /root/src/images/vmlinux-5.2 \
  -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" \
  -smp 1 -nodefaults -no-user-config \
  -chardev pty,id=virtiocon0,server \
  -device virtio-serial-device \
  -device virtconsole,chardev=virtiocon0 \
  -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none \
  -device virtio-blk-device,drive=test

Average boot times after 10 consecutive runs:

qemu_init_end: 64.043264
linux_start_kernel: 65.481782 (+1.438518)
linux_start_user: 114.938353 (+49.456571)

Memory tree:

address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @microvm.ram 0000000000000000-000000001fffffff
00000000d0000000-00000000d00001ff
(prio 0, i/o): virtio-mmio 00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio 00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio 00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi address-space: I/O 0000000000000000-000000000000ffff (prio 0, i/o): io 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic -------------- | Conclusion | -------------- The average boot time of microvm is a third of Q35's (115ms vs. 363ms), and is smaller on all sections (QEMU initialization, firmware overhead and kernel start-to-user). Microvm's memory tree is also visibly simpler, significantly reducing the exposed surface to the guest. While we can certainly work on making Q35 smaller, I definitely think it's better (and way safer!) having a specialized machine type for a specific use case, than a minimal Q35 whose behavior significantly diverges from a conventional Q35. Sergio. [1] https://github.com/stefano-garzarella/qemu-boot-time [2] https://paste.fedoraproject.org/paste/YZ9Ok-dJtQrc0xxctFm-nw [3] https://paste.fedoraproject.org/paste/sck0jfioAJdMq51HH6wkmA [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
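As a rough way to put a number on "visibly simpler", the sketch below counts the entries per address space in memory-tree text like the dumps in this message. The function name `count_regions`, the regular expression, and the `skip_disabled` parameter are illustrative assumptions and not part of QEMU. The sample input is the microvm tree reported above, and container entries such as `system` and `io` are counted like any other entry.

```python
import re
from collections import Counter

# Hypothetical pattern for one flat memory-tree entry, for example:
#   "000000000000007e-000000000000007f (prio 0, i/o): kvmvapic"
ENTRY = re.compile(r"^[0-9a-f]+-[0-9a-f]+ \(prio -?\d+, [^)]+\): .+$")


def count_regions(mtree_text, skip_disabled=True):
    """Count memory-tree entries per address space (illustrative helper)."""
    counts = Counter()
    space = None
    for raw in mtree_text.splitlines():
        line = raw.strip()
        if line.startswith("address-space:"):
            # Start a new address space, e.g. "memory" or "I/O".
            space = line.split(":", 1)[1].strip()
            counts[space] = 0
        elif space is not None and ENTRY.match(line):
            # Entries marked [disabled] are not reachable by the guest.
            if skip_disabled and line.endswith("[disabled]"):
                continue
            counts[space] += 1
    return dict(counts)


# The microvm memory tree as posted in this message.
MICROVM_MTREE = """\
address-space: memory
0000000000000000-ffffffffffffffff (prio 0, i/o): system
0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @microvm.ram 0000000000000000-000000001fffffff
00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio
00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio
00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio
00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio
00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic
00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi

address-space: I/O
0000000000000000-000000000000ffff (prio 0, i/o): io
000000000000007e-000000000000007f (prio 0, i/o): kvmvapic
"""

if __name__ == "__main__":
    print(count_regions(MICROVM_MTREE))
```

The Q35 dumps can be fed through the same helper for a side-by-side count. A raw entry count is only a proxy for attack surface, since it says nothing about how much device code sits behind each region.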
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Stefan Hajnoczi @ 2019-07-19 10:29 UTC
To: Sergio Lopez
Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth

On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
>
> Stefan Hajnoczi <stefanha@gmail.com> writes:
>
> > Please post metrics that compare this against a minimal Q35.
> >
> > With qboot it was later found that SeaBIOS can achieve comparable boot
> > times, so it wasn't worth maintaining qboot.
> >
> > Data is needed to show that microvm is really a significant improvement
> > over a minimal Q35.
>
> [...]
>
> --------------
> | Conclusion |
> --------------
>
> The average boot time of microvm is a third of Q35's (115ms vs. 363ms),
> and is smaller on all sections (QEMU initialization, firmware overhead
> and kernel start-to-user).
>
> Microvm's memory tree is also visibly simpler, significantly reducing
> the exposed surface to the guest.
>
> While we can certainly work on making Q35 smaller, I definitely think
> it's better (and way safer!) having a specialized machine type for a
> specific use case, than a minimal Q35 whose behavior significantly
> diverges from a conventional Q35.

Interesting, so not a 10x difference! This might be amenable to
optimization.

My concern with microvm is that it's so limited that few users will be
able to benefit from the reduced attack surface and faster startup time.
I think it's worth investigating slimming down Q35 further first.

In terms of startup time the first step would be profiling Q35 kernel
startup to find out what's taking so long (firmware initialization, PCI
probing, etc)?
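To see where a profiling effort on Q35 might start, the reported averages can be split into per-phase durations and compared against microvm. This is a minimal sketch that uses only the numbers posted earlier in this thread. The phase labels follow the three sections named in the conclusion, and the helper names `phase_durations` and `excess_over` are made up for illustration.

```python
# Cumulative timestamps in ms (qemu_init_end, linux_start_kernel,
# linux_start_user), copied from the averages posted in this thread.
BOOT_TIMES_MS = {
    "q35-hpet": (77.637936, 117.082526, 364.629972),
    "q35-pit": (77.467852, 116.688472, 363.033365),
    "microvm": (64.043264, 65.481782, 114.938353),
}

PHASES = ("qemu_init", "firmware", "kernel_to_user")


def phase_durations(timestamps):
    """Convert the three cumulative timestamps into per-phase durations."""
    init_end, start_kernel, start_user = timestamps
    durations = (init_end, start_kernel - init_end, start_user - start_kernel)
    return dict(zip(PHASES, durations))


def excess_over(machine, baseline="microvm"):
    """Per-phase time that `machine` spends beyond `baseline`, in ms."""
    a = phase_durations(BOOT_TIMES_MS[machine])
    b = phase_durations(BOOT_TIMES_MS[baseline])
    return {phase: a[phase] - b[phase] for phase in PHASES}


if __name__ == "__main__":
    for machine in ("q35-hpet", "q35-pit"):
        gap = excess_over(machine)
        total = sum(gap.values())
        for phase in PHASES:
            share = gap[phase] / total
            print(f"{machine:9s} {phase:15s} +{gap[phase]:8.3f} ms ({share:5.1%})")
```

With the posted numbers, most of the gap for the PIT variant (about 197 ms out of 248 ms) falls between linux_start_kernel and linux_start_user, while the firmware phase accounts for about 38 ms. That is consistent with looking at kernel startup first, though these averages alone do not say which part of it (PCI probing or something else) is responsible.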
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
From: Sergio Lopez @ 2019-07-19 13:48 UTC
To: Stefan Hajnoczi
Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth

Stefan Hajnoczi <stefanha@gmail.com> writes:

> On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote:
>>
>> Stefan Hajnoczi <stefanha@gmail.com> writes:
>>
>> > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote:
>> >> Microvm is a machine type inspired by both NEMU and Firecracker, and
>> >> constructed after the machine model implemented by the latter.
>> >>
>> >> It's main purpose is providing users a KVM-only machine type with fast
>> >> boot times, minimal attack surface (measured as the number of IO ports
>> >> and MMIO regions exposed to the Guest) and small footprint (specially
>> >> when combined with the ongoing QEMU modularization effort).
>> >>
>> >> Normally, other than the device support provided by KVM itself,
>> >> microvm only supports virtio-mmio devices. Microvm also includes a
>> >> legacy mode, which adds an ISA bus with a 16550A serial port, useful
>> >> for being able to see the early boot kernel messages.
>> >>
>> >> Microvm only supports booting PVH-enabled Linux ELF images. Booting
>> >> other PVH-enabled kernels may be possible, but due to the lack of ACPI
>> >> and firmware, we're relying on the command line for specifying the
>> >> location of the virtio-mmio transports. If there's an interest on
>> >> using this machine type with other kernels, we'll try to find some
>> >> kind of middle ground solution.
>> >>
>> >> This is the list of the exposed IO ports and MMIO regions when running
>> >> in non-legacy mode:
>> >>
>> >> address-space: memory
>> >> 00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000800-00000000d00009ff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000a00-00000000d0000bff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000c00-00000000d0000dff (prio 0, i/o): virtio-mmio
>> >> 00000000d0000e00-00000000d0000fff (prio 0, i/o): virtio-mmio
>> >> 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi
>> >>
>> >> address-space: I/O
>> >> 0000000000000000-000000000000ffff (prio 0, i/o): io
>> >> 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic
>> >> 0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit
>> >> 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic
>> >> 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic
>> >> 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr
>> >> 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr
>> >>
>> >> A QEMU instance with the microvm machine type can be invoked this way:
>> >>
>> >> - Normal mode:
>> >>
>> >> qemu-system-x86_64 -M microvm -m 512m -smp 2 \
>> >>     -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
>> >>     -nodefaults -no-user-config \
>> >>     -chardev pty,id=virtiocon0,server \
>> >>     -device virtio-serial-device \
>> >>     -device virtconsole,chardev=virtiocon0 \
>> >>     -drive id=test,file=test.img,format=raw,if=none \
>> >>     -device virtio-blk-device,drive=test \
>> >>     -netdev tap,id=tap0,script=no,downscript=no \
>> >>     -device virtio-net-device,netdev=tap0
>> >>
>> >> - Legacy mode:
>> >>
>> >> qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \
>> >>     -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \
>> >>     -nodefaults -no-user-config \
>> >> -drive id=test,file=test.img,format=raw,if=none \
>> >> -device virtio-blk-device,drive=test \
>> >> -netdev tap,id=tap0,script=no,downscript=no \
>> >> -device virtio-net-device,netdev=tap0 \
>> >> -serial stdio
>> >
>> > Please post metrics that compare this against a minimal Q35.
>> >
>> > With qboot it was later found that SeaBIOS can achieve comparable boot
>> > times, so it wasn't worth maintaining qboot.
>> >
>> > Data is needed to show that microvm is really a significant improvement
>> > over a minimal Q35.
>>
>> I've just run some numbers using Stefano Garzarella's qemu-boot-time
>> scripts [1] on a server with 2x Intel Xeon Silver 4114 2.20GHz, using the
>> upstream QEMU (474f3938d79ab36b9231c9ad3b5a9314c2aeacde) built with
>> minimal features [2]. The VM boots a minimal kernel [3] without initrd,
>> using a Kata Containers image as root via virtio-blk (though this isn't
>> really relevant, as we're just taking measurements until the kernel is
>> about to exec init).
>>
>> To try to make the comparison as fair as possible, I've used a minimal
>> q35 machine with as few devices as possible.
Disabling HPET and PIT at >> the same time caused the kernel to get stuck on boot, so I ran two >> iterations, one without HPET and the other without PIT: >> >> >> ----------------- >> | Q35 with HPET | >> ----------------- >> >> Command line: >> >> ./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm -M q35,smbus=off,nvdimm=off,pit=off,vmport=off,sata=off,usb=off,graphics=off -kernel /root/src/images/vmlinux-5.2 -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" -smp 1 -nodefaults -no-user-config -chardev pty,id=virtiocon0,server -device virtio-serial -device virtconsole,chardev=virtiocon0 -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none -device virtio-blk,drive=test >> >> Average boot times after 10 consecutive runs: >> >> qemu_init_end: 77.637936 >> linux_start_kernel: 117.082526 (+39.44459) >> linux_start_user: 364.629972 (+247.547446) >> >> Memory tree: >> >> address-space: memory >> 0000000000000000-ffffffffffffffff (prio 0, i/o): system >> 0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @pc.ram 0000000000000000-000000001fffffff >> 0000000000000000-ffffffffffffffff (prio -1, i/o): pci >> 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom >> 00000000000e0000-00000000000fffff (prio 1, i/o): alias isa-bios @pc.bios 0000000000020000-000000000003ffff >> 00000000febf4000-00000000febf7fff (prio 1, i/o): virtio-pci >> 00000000febf4000-00000000febf4fff (prio 0, i/o): virtio-pci-common >> 00000000febf5000-00000000febf5fff (prio 0, i/o): virtio-pci-isr >> 00000000febf6000-00000000febf6fff (prio 0, i/o): virtio-pci-device >> 00000000febf7000-00000000febf7fff (prio 0, i/o): virtio-pci-notify >> 00000000febf8000-00000000febfbfff (prio 1, i/o): virtio-pci >> 00000000febf8000-00000000febf8fff (prio 0, i/o): virtio-pci-common >> 00000000febf9000-00000000febf9fff (prio 0, i/o): virtio-pci-isr >> 00000000febfa000-00000000febfafff (prio 0, i/o): virtio-pci-device >> 00000000febfb000-00000000febfbfff (prio 0, i/o): 
virtio-pci-notify >> 00000000febfe000-00000000febfefff (prio 1, i/o): virtio-serial-pci-msix >> 00000000febfe000-00000000febfe01f (prio 0, i/o): msix-table >> 00000000febfe800-00000000febfe807 (prio 0, i/o): msix-pba >> 00000000febff000-00000000febfffff (prio 1, i/o): virtio-blk-pci-msix >> 00000000febff000-00000000febff01f (prio 0, i/o): msix-table >> 00000000febff800-00000000febff807 (prio 0, i/o): msix-pba >> 00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios >> 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff >> 00000000000c0000-00000000000c2fff (prio 1000, i/o): alias kvmvapic-rom @pc.ram 00000000000c0000-00000000000c2fff >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c8000-00000000000cbfff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c8000-00000000000cbfff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff >> 
00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pci 00000000000c8000-00000000000cbfff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000cc000-00000000000cffff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000cc000-00000000000cffff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pci 00000000000cc000-00000000000cffff [disabled] >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d0000-00000000000d3fff >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pci 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d4000-00000000000d7fff >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pci 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d8000-00000000000dbfff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d8000-00000000000dbfff >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pci 00000000000d8000-00000000000dbfff [disabled] >> 00000000000dc000-00000000000dffff 
(prio 1, i/o): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff [disabled] >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000dc000-00000000000dffff [disabled] >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000dc000-00000000000dffff >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pci 00000000000dc000-00000000000dffff [disabled] >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff [disabled] >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e0000-00000000000e3fff [disabled] >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e0000-00000000000e3fff >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pci 00000000000e0000-00000000000e3fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e4000-00000000000e7fff >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pci 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e8000-00000000000ebfff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e8000-00000000000ebfff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pci 00000000000e8000-00000000000ebfff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-ram @pc.ram 00000000000ec000-00000000000effff >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pc.ram 
00000000000ec000-00000000000effff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-rom @pc.ram 00000000000ec000-00000000000effff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pci 00000000000ec000-00000000000effff [disabled] >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000f0000-00000000000fffff [disabled] >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000f0000-00000000000fffff [disabled] >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pci 00000000000f0000-00000000000fffff [disabled] >> 0000000020000000-0000000020000000 (prio 1, i/o): tseg-blackhole [disabled] >> 00000000b0000000-00000000bfffffff (prio 0, i/o): pcie-mmcfg-mmio >> 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic >> 00000000fed00000-00000000fed003ff (prio 0, i/o): hpet >> 00000000fed1c000-00000000fed1ffff (prio 1, i/o): lpc-rcrb-mmio >> 00000000feda0000-00000000fedbffff (prio 1, i/o): alias smram-open-high @pc.ram 00000000000a0000-00000000000bffff [disabled] >> 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi >> >> address-space: I/O >> 0000000000000000-000000000000ffff (prio 0, i/o): io >> 0000000000000000-0000000000000007 (prio 0, i/o): dma-chan >> 0000000000000008-000000000000000f (prio 0, i/o): dma-cont >> 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic >> 0000000000000060-0000000000000060 (prio 0, i/o): i8042-data >> 0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd >> 0000000000000070-0000000000000071 (prio 0, i/o): rtc >> 0000000000000070-0000000000000070 (prio 0, i/o): rtc-index >> 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic >> 0000000000000080-0000000000000080 (prio 0, i/o): ioport80 >> 0000000000000081-0000000000000083 (prio 0, i/o): dma-page >> 0000000000000087-0000000000000087 (prio 
0, i/o): dma-page >> 0000000000000089-000000000000008b (prio 0, i/o): dma-page >> 000000000000008f-000000000000008f (prio 0, i/o): dma-page >> 0000000000000092-0000000000000092 (prio 0, i/o): port92 >> 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic >> 00000000000000b2-00000000000000b3 (prio 0, i/o): apm-io >> 00000000000000c0-00000000000000cf (prio 0, i/o): dma-chan >> 00000000000000d0-00000000000000df (prio 0, i/o): dma-cont >> 00000000000000f0-00000000000000f0 (prio 0, i/o): ioportF0 >> 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr >> 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr >> 0000000000000510-0000000000000511 (prio 0, i/o): fwcfg >> 0000000000000514-000000000000051b (prio 0, i/o): fwcfg.dma >> 0000000000000600-000000000000067f (prio 0, i/o): ich9-pm >> 0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt >> 0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt >> 0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr >> 0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0 >> 0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi >> 0000000000000660-000000000000067f (prio 0, i/o): sm-tco >> 0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-mem-hotplug >> 0000000000000cf8-0000000000000cfb (prio 0, i/o): pci-conf-idx >> 0000000000000cf9-0000000000000cf9 (prio 1, i/o): lpc-reset-control >> 0000000000000cfc-0000000000000cff (prio 0, i/o): pci-conf-data >> 000000000000c000-000000000000c07f (prio 1, i/o): virtio-pci >> 000000000000c080-000000000000c0bf (prio 1, i/o): virtio-pci >> >> >> ---------------- >> | Q35 with PIT | >> ---------------- >> >> Command line: >> >> ./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm -M q35,smbus=off,nvdimm=off,pit=on,vmport=off,sata=off,usb=off,graphics=off -no-hpet -kernel /root/src/images/vmlinux-5.2 -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" -smp 1 -nodefaults -no-user-config -chardev pty,id=virtiocon0,server -device virtio-serial -device 
virtconsole,chardev=virtiocon0 -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none -device virtio-blk,drive=test >> >> Average boot times after 10 consecutive runs: >> >> qemu_init_end: 77.467852 >> linux_start_kernel: 116.688472 (+39.22062) >> linux_start_user: 363.033365 (+246.344893) >> >> Memory tree: >> >> address-space: memory >> 0000000000000000-ffffffffffffffff (prio 0, i/o): system >> 0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @pc.ram 0000000000000000-000000001fffffff >> 0000000000000000-ffffffffffffffff (prio -1, i/o): pci >> 00000000000c0000-00000000000dffff (prio 1, rom): pc.rom >> 00000000000e0000-00000000000fffff (prio 1, i/o): alias isa-bios @pc.bios 0000000000020000-000000000003ffff >> 00000000febf4000-00000000febf7fff (prio 1, i/o): virtio-pci >> 00000000febf4000-00000000febf4fff (prio 0, i/o): virtio-pci-common >> 00000000febf5000-00000000febf5fff (prio 0, i/o): virtio-pci-isr >> 00000000febf6000-00000000febf6fff (prio 0, i/o): virtio-pci-device >> 00000000febf7000-00000000febf7fff (prio 0, i/o): virtio-pci-notify >> 00000000febf8000-00000000febfbfff (prio 1, i/o): virtio-pci >> 00000000febf8000-00000000febf8fff (prio 0, i/o): virtio-pci-common >> 00000000febf9000-00000000febf9fff (prio 0, i/o): virtio-pci-isr >> 00000000febfa000-00000000febfafff (prio 0, i/o): virtio-pci-device >> 00000000febfb000-00000000febfbfff (prio 0, i/o): virtio-pci-notify >> 00000000febfe000-00000000febfefff (prio 1, i/o): virtio-serial-pci-msix >> 00000000febfe000-00000000febfe01f (prio 0, i/o): msix-table >> 00000000febfe800-00000000febfe807 (prio 0, i/o): msix-pba >> 00000000febff000-00000000febfffff (prio 1, i/o): virtio-blk-pci-msix >> 00000000febff000-00000000febff01f (prio 0, i/o): msix-table >> 00000000febff800-00000000febff807 (prio 0, i/o): msix-pba >> 00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios >> 00000000000a0000-00000000000bffff (prio 1, i/o): alias smram-region @pci 00000000000a0000-00000000000bffff 
>> 00000000000c0000-00000000000c2fff (prio 1000, i/o): alias kvmvapic-rom @pc.ram 00000000000c0000-00000000000c2fff >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c0000-00000000000c3fff >> 00000000000c0000-00000000000c3fff (prio 1, i/o): alias pam-pci @pci 00000000000c0000-00000000000c3fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c4000-00000000000c7fff >> 00000000000c4000-00000000000c7fff (prio 1, i/o): alias pam-pci @pci 00000000000c4000-00000000000c7fff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000c8000-00000000000cbfff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000c8000-00000000000cbfff [disabled] >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000c8000-00000000000cbfff >> 00000000000c8000-00000000000cbfff (prio 1, i/o): alias pam-pci @pci 00000000000c8000-00000000000cbfff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000cc000-00000000000cffff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000cc000-00000000000cffff [disabled] >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000cc000-00000000000cffff >> 00000000000cc000-00000000000cffff (prio 1, i/o): alias pam-pci @pci 00000000000cc000-00000000000cffff [disabled] >> 
00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d0000-00000000000d3fff >> 00000000000d0000-00000000000d3fff (prio 1, i/o): alias pam-pci @pci 00000000000d0000-00000000000d3fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d4000-00000000000d7fff >> 00000000000d4000-00000000000d7fff (prio 1, i/o): alias pam-pci @pci 00000000000d4000-00000000000d7fff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000d8000-00000000000dbfff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000d8000-00000000000dbfff [disabled] >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000d8000-00000000000dbfff >> 00000000000d8000-00000000000dbfff (prio 1, i/o): alias pam-pci @pci 00000000000d8000-00000000000dbfff [disabled] >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000dc000-00000000000dffff [disabled] >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000dc000-00000000000dffff [disabled] >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000dc000-00000000000dffff >> 00000000000dc000-00000000000dffff (prio 1, i/o): alias pam-pci @pci 00000000000dc000-00000000000dffff [disabled] >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e0000-00000000000e3fff [disabled] >> 
00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e0000-00000000000e3fff [disabled] >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e0000-00000000000e3fff >> 00000000000e0000-00000000000e3fff (prio 1, i/o): alias pam-pci @pci 00000000000e0000-00000000000e3fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e4000-00000000000e7fff >> 00000000000e4000-00000000000e7fff (prio 1, i/o): alias pam-pci @pci 00000000000e4000-00000000000e7fff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-ram @pc.ram 00000000000e8000-00000000000ebfff >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pc.ram 00000000000e8000-00000000000ebfff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-rom @pc.ram 00000000000e8000-00000000000ebfff [disabled] >> 00000000000e8000-00000000000ebfff (prio 1, i/o): alias pam-pci @pci 00000000000e8000-00000000000ebfff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-ram @pc.ram 00000000000ec000-00000000000effff >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pc.ram 00000000000ec000-00000000000effff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-rom @pc.ram 00000000000ec000-00000000000effff [disabled] >> 00000000000ec000-00000000000effff (prio 1, i/o): alias pam-pci @pci 00000000000ec000-00000000000effff [disabled] >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-ram @pc.ram 00000000000f0000-00000000000fffff [disabled] >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pc.ram 00000000000f0000-00000000000fffff [disabled] >> 
00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-rom @pc.ram 00000000000f0000-00000000000fffff >> 00000000000f0000-00000000000fffff (prio 1, i/o): alias pam-pci @pci 00000000000f0000-00000000000fffff [disabled] >> 0000000020000000-0000000020000000 (prio 1, i/o): tseg-blackhole [disabled] >> 00000000b0000000-00000000bfffffff (prio 0, i/o): pcie-mmcfg-mmio >> 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic >> 00000000fed1c000-00000000fed1ffff (prio 1, i/o): lpc-rcrb-mmio >> 00000000feda0000-00000000fedbffff (prio 1, i/o): alias smram-open-high @pc.ram 00000000000a0000-00000000000bffff [disabled] >> 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi >> >> address-space: I/O >> 0000000000000000-000000000000ffff (prio 0, i/o): io >> 0000000000000000-0000000000000007 (prio 0, i/o): dma-chan >> 0000000000000008-000000000000000f (prio 0, i/o): dma-cont >> 0000000000000020-0000000000000021 (prio 0, i/o): kvm-pic >> 0000000000000040-0000000000000043 (prio 0, i/o): kvm-pit >> 0000000000000060-0000000000000060 (prio 0, i/o): i8042-data >> 0000000000000061-0000000000000061 (prio 0, i/o): pcspk >> 0000000000000064-0000000000000064 (prio 0, i/o): i8042-cmd >> 0000000000000070-0000000000000071 (prio 0, i/o): rtc >> 0000000000000070-0000000000000070 (prio 0, i/o): rtc-index >> 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic >> 0000000000000080-0000000000000080 (prio 0, i/o): ioport80 >> 0000000000000081-0000000000000083 (prio 0, i/o): dma-page >> 0000000000000087-0000000000000087 (prio 0, i/o): dma-page >> 0000000000000089-000000000000008b (prio 0, i/o): dma-page >> 000000000000008f-000000000000008f (prio 0, i/o): dma-page >> 0000000000000092-0000000000000092 (prio 0, i/o): port92 >> 00000000000000a0-00000000000000a1 (prio 0, i/o): kvm-pic >> 00000000000000b2-00000000000000b3 (prio 0, i/o): apm-io >> 00000000000000c0-00000000000000cf (prio 0, i/o): dma-chan >> 00000000000000d0-00000000000000df (prio 0, i/o): dma-cont >> 
00000000000000f0-00000000000000f0 (prio 0, i/o): ioportF0 >> 00000000000004d0-00000000000004d0 (prio 0, i/o): kvm-elcr >> 00000000000004d1-00000000000004d1 (prio 0, i/o): kvm-elcr >> 0000000000000510-0000000000000511 (prio 0, i/o): fwcfg >> 0000000000000514-000000000000051b (prio 0, i/o): fwcfg.dma >> 0000000000000600-000000000000067f (prio 0, i/o): ich9-pm >> 0000000000000600-0000000000000603 (prio 0, i/o): acpi-evt >> 0000000000000604-0000000000000605 (prio 0, i/o): acpi-cnt >> 0000000000000608-000000000000060b (prio 0, i/o): acpi-tmr >> 0000000000000620-000000000000062f (prio 0, i/o): acpi-gpe0 >> 0000000000000630-0000000000000637 (prio 0, i/o): acpi-smi >> 0000000000000660-000000000000067f (prio 0, i/o): sm-tco >> 0000000000000cd8-0000000000000ce3 (prio 0, i/o): acpi-mem-hotplug >> 0000000000000cf8-0000000000000cfb (prio 0, i/o): pci-conf-idx >> 0000000000000cf9-0000000000000cf9 (prio 1, i/o): lpc-reset-control >> 0000000000000cfc-0000000000000cff (prio 0, i/o): pci-conf-data >> 000000000000c000-000000000000c07f (prio 1, i/o): virtio-pci >> 000000000000c080-000000000000c0bf (prio 1, i/o): virtio-pci >> >> >> ----------- >> | microvm | >> ----------- >> >> Command line: >> >> ./x86_64-softmmu/qemu-system-x86_64 -m 512m -enable-kvm -M microvm -kernel /root/src/images/vmlinux-5.2 -append "console=hvc0 reboot=k panic=1 root=/dev/vda quiet" -smp 1 -nodefaults -no-user-config -chardev pty,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 -drive id=test,file=/root/src/images/hello-rootfs.ext4,format=raw,if=none -device virtio-blk-device,drive=test >> >> Average boot times after 10 consecutive runs: >> >> qemu_init_end: 64.043264 >> linux_start_kernel: 65.481782 (+1.438518) >> linux_start_user: 114.938353 (+49.456571) >> >> Memory tree: >> >> address-space: memory >> 0000000000000000-ffffffffffffffff (prio 0, i/o): system >> 0000000000000000-000000001fffffff (prio 0, i/o): alias ram-below-4g @microvm.ram 
0000000000000000-000000001fffffff >> 00000000d0000000-00000000d00001ff (prio 0, i/o): virtio-mmio >> 00000000d0000200-00000000d00003ff (prio 0, i/o): virtio-mmio >> 00000000d0000400-00000000d00005ff (prio 0, i/o): virtio-mmio >> 00000000d0000600-00000000d00007ff (prio 0, i/o): virtio-mmio >> 00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic >> 00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi >> >> address-space: I/O >> 0000000000000000-000000000000ffff (prio 0, i/o): io >> 000000000000007e-000000000000007f (prio 0, i/o): kvmvapic >> >> >> -------------- >> | Conclusion | >> -------------- >> >> The average boot time of microvm is a third of Q35's (115ms vs. 363ms), >> and is smaller on all sections (QEMU initialization, firmware overhead >> and kernel start-to-user). >> >> Microvm's memory tree is also visibly simpler, significantly reducing >> the exposed surface to the guest. >> >> While we can certainly work on making Q35 smaller, I definitely think >> it's better (and way safer!) having a specialized machine type for a >> specific use case, than a minimal Q35 whose behavior significantly >> diverges from a conventional Q35. > > Interesting, so not a 10x difference! This might be amenable to > optimization. > > My concern with microvm is that it's so limited that few users will be > able to benefit from the reduced attack surface and faster startup time. > I think it's worth investigating slimming down Q35 further first. > > In terms of startup time the first step would be profiling Q35 kernel > startup to find out what's taking so long (firmware initialization, PCI > probing, etc)? Some findings: 1. Exposing the TSC_DEADLINE CPU flag (i.e. 
using "-cpu host") saves a
   whopping 120ms by avoiding the APIC timer calibration at
   arch/x86/kernel/apic/apic.c:calibrate_APIC_clock

   Average boot time with "-cpu host"
    qemu_init_end: 76.408950
    linux_start_kernel: 116.166142 (+39.757192)
    linux_start_user: 242.954347 (+126.788205)

   Average boot time with default "cpu"
    qemu_init_end: 77.467852
    linux_start_kernel: 116.688472 (+39.22062)
    linux_start_user: 363.033365 (+246.344893)

2. The other 130ms are a direct result of PCI and ACPI presence (tested
   with a kernel without support for those elements). I'll publish some
   detailed numbers next week.

Sergio.
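For anyone who wants to redo the arithmetic on these figures, here is a minimal Python sketch. It only re-derives the per-phase deltas and the savings from the averages reported in this thread (Q35 with PIT, Q35 with "-cpu host", and microvm). The phase names follow the "Conclusion" section; the dictionary keys and the function names `phases` and `saving` are made up for illustration and are not part of the qemu-boot-time scripts.

```python
# Average boot-time figures (ms) copied from the measurements in this thread.
# Each tuple is (qemu_init_end, linux_start_kernel, linux_start_user),
# i.e. cumulative timestamps, not per-phase durations.
RUNS = {
    "q35, default cpu": (77.467852, 116.688472, 363.033365),
    "q35, -cpu host": (76.408950, 116.166142, 242.954347),
    "microvm": (64.043264, 65.481782, 114.938353),
}


def phases(run):
    """Split cumulative timestamps into per-phase durations (ms)."""
    qemu_init_end, start_kernel, start_user = run
    return {
        "qemu_init": qemu_init_end,
        "firmware": start_kernel - qemu_init_end,
        "kernel_to_user": start_user - start_kernel,
    }


def saving(baseline, candidate):
    """Total boot time saved (ms) when going from baseline to candidate."""
    return RUNS[baseline][2] - RUNS[candidate][2]


if __name__ == "__main__":
    for name, run in RUNS.items():
        parts = ", ".join(f"{k}={v:.1f}" for k, v in phases(run).items())
        print(f"{name:18s} total={run[2]:.1f}  {parts}")
    print(f"-cpu host saves {saving('q35, default cpu', 'q35, -cpu host'):.1f} ms")
    print(f"remaining gap to microvm: {saving('q35, -cpu host', 'microvm'):.1f} ms")
```

The two savings come out at roughly 120 ms and 128 ms, which lines up with the "120ms" and "other 130ms" mentioned in the findings.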
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-19 13:48 ` Sergio Lopez @ 2019-07-19 15:09 ` Stefan Hajnoczi 2019-07-19 15:42 ` Montes, Julio 0 siblings, 1 reply; 68+ messages in thread From: Stefan Hajnoczi @ 2019-07-19 15:09 UTC (permalink / raw) To: Sergio Lopez Cc: Eduardo Habkost, Maran Wilson, Michael S. Tsirkin, qemu-devel, Gerd Hoffmann, Paolo Bonzini, Stefano Garzarella, Richard Henderson On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: > Stefan Hajnoczi <stefanha@gmail.com> writes: > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: > >> > >> Stefan Hajnoczi <stefanha@gmail.com> writes: > >> > >> > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > >> -------------- > >> | Conclusion | > >> -------------- > >> > >> The average boot time of microvm is a third of Q35's (115ms vs. 363ms), > >> and is smaller on all sections (QEMU initialization, firmware overhead > >> and kernel start-to-user). > >> > >> Microvm's memory tree is also visibly simpler, significantly reducing > >> the exposed surface to the guest. > >> > >> While we can certainly work on making Q35 smaller, I definitely think > >> it's better (and way safer!) having a specialized machine type for a > >> specific use case, than a minimal Q35 whose behavior significantly > >> diverges from a conventional Q35. > > > > Interesting, so not a 10x difference! This might be amenable to > > optimization. > > > > My concern with microvm is that it's so limited that few users will be > > able to benefit from the reduced attack surface and faster startup time. > > I think it's worth investigating slimming down Q35 further first. > > > > In terms of startup time the first step would be profiling Q35 kernel > > startup to find out what's taking so long (firmware initialization, PCI > > probing, etc)? > > Some findings: > > 1. Exposing the TSC_DEADLINE CPU flag (i.e. 
using "-cpu host") saves a
>    whopping 120ms by avoiding the APIC timer calibration at
>    arch/x86/kernel/apic/apic.c:calibrate_APIC_clock
>
>    Average boot time with "-cpu host"
>     qemu_init_end: 76.408950
>     linux_start_kernel: 116.166142 (+39.757192)
>     linux_start_user: 242.954347 (+126.788205)
>
>    Average boot time with default "cpu"
>     qemu_init_end: 77.467852
>     linux_start_kernel: 116.688472 (+39.22062)
>     linux_start_user: 363.033365 (+246.344893)

\o/

> 2. The other 130ms are a direct result of PCI and ACPI presence (tested
>    with a kernel without support for those elements). I'll publish some
>    detailed numbers next week.

Here are the Kata Containers kernel parameters:

var kernelParams = []Param{
        {"tsc", "reliable"},
        {"no_timer_check", ""},
        {"rcupdate.rcu_expedited", "1"},
        {"i8042.direct", "1"},
        {"i8042.dumbkbd", "1"},
        {"i8042.nopnp", "1"},
        {"i8042.noaux", "1"},
        {"noreplace-smp", ""},
        {"reboot", "k"},
        {"console", "hvc0"},
        {"console", "hvc1"},
        {"iommu", "off"},
        {"cryptomgr.notests", ""},
        {"net.ifnames", "0"},
        {"pci", "lastbus=0"},
}

pci lastbus=0 looks interesting and so do some of the others :).

Stefan
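To try these parameters with the command lines used earlier in the thread, the key/value pairs have to be flattened into a single string for "-append". The Python sketch below shows one way to do that. It assumes that an empty value means a bare flag (e.g. "no_timer_check") and that everything else is rendered as key=value; `KERNEL_PARAMS` and `to_cmdline` are hypothetical names for this example, not Kata Containers code.

```python
# Parameters copied from the Kata Containers list quoted above.
# A list of pairs is used instead of a dict because "console" appears twice.
KERNEL_PARAMS = [
    ("tsc", "reliable"),
    ("no_timer_check", ""),
    ("rcupdate.rcu_expedited", "1"),
    ("i8042.direct", "1"),
    ("i8042.dumbkbd", "1"),
    ("i8042.nopnp", "1"),
    ("i8042.noaux", "1"),
    ("noreplace-smp", ""),
    ("reboot", "k"),
    ("console", "hvc0"),
    ("console", "hvc1"),
    ("iommu", "off"),
    ("cryptomgr.notests", ""),
    ("net.ifnames", "0"),
    ("pci", "lastbus=0"),
]


def to_cmdline(params):
    """Render (key, value) pairs as a kernel command line string.

    Assumption: an empty value is emitted as a bare flag, anything else
    as key=value, with entries kept in their original order.
    """
    return " ".join(f"{key}={value}" if value else key for key, value in params)


if __name__ == "__main__":
    # The resulting string is what would go inside -append "...".
    print(to_cmdline(KERNEL_PARAMS))
```

Note that the last entry renders as pci=lastbus=0, since the value itself already contains an equals sign.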
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-19 15:09 ` Stefan Hajnoczi @ 2019-07-19 15:42 ` Montes, Julio 2019-07-23 8:43 ` Sergio Lopez 0 siblings, 1 reply; 68+ messages in thread From: Montes, Julio @ 2019-07-19 15:42 UTC (permalink / raw) To: stefanha, slp Cc: ehabkost, mst, maran.wilson, qemu-devel, kraxel, pbonzini, rth, sgarzare On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote: > On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: > > Stefan Hajnoczi <stefanha@gmail.com> writes: > > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: > > > > Stefan Hajnoczi <stefanha@gmail.com> writes: > > > > > > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > > > > -------------- > > > > | Conclusion | > > > > -------------- > > > > > > > > The average boot time of microvm is a third of Q35's (115ms vs. > > > > 363ms), > > > > and is smaller on all sections (QEMU initialization, firmware > > > > overhead > > > > and kernel start-to-user). > > > > > > > > Microvm's memory tree is also visibly simpler, significantly > > > > reducing > > > > the exposed surface to the guest. > > > > > > > > While we can certainly work on making Q35 smaller, I definitely > > > > think > > > > it's better (and way safer!) having a specialized machine type > > > > for a > > > > specific use case, than a minimal Q35 whose behavior > > > > significantly > > > > diverges from a conventional Q35. > > > > > > Interesting, so not a 10x difference! This might be amenable to > > > optimization. > > > > > > My concern with microvm is that it's so limited that few users > > > will be > > > able to benefit from the reduced attack surface and faster > > > startup time. > > > I think it's worth investigating slimming down Q35 further first. 
> > > > > > In terms of startup time the first step would be profiling Q35 > > > kernel > > > startup to find out what's taking so long (firmware > > > initialization, PCI > > > probing, etc)? > > > > Some findings: > > > > 1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") > > saves a > > whooping 120ms by avoiding the APIC timer calibration at > > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock > > > > Average boot time with "-cpu host" > > qemu_init_end: 76.408950 > > linux_start_kernel: 116.166142 (+39.757192) > > linux_start_user: 242.954347 (+126.788205) > > > > Average boot time with default "cpu" > > qemu_init_end: 77.467852 > > linux_start_kernel: 116.688472 (+39.22062) > > linux_start_user: 363.033365 (+246.344893) > > \o/ > > > 2. The other 130ms are a direct result of PCI and ACPI presence > > (tested > > with a kernel without support for those elements). I'll publish > > some > > detailed numbers next week. > > Here are the Kata Containers kernel parameters: > > var kernelParams = []Param{ > {"tsc", "reliable"}, > {"no_timer_check", ""}, > {"rcupdate.rcu_expedited", "1"}, > {"i8042.direct", "1"}, > {"i8042.dumbkbd", "1"}, > {"i8042.nopnp", "1"}, > {"i8042.noaux", "1"}, > {"noreplace-smp", ""}, > {"reboot", "k"}, > {"console", "hvc0"}, > {"console", "hvc1"}, > {"iommu", "off"}, > {"cryptomgr.notests", ""}, > {"net.ifnames", "0"}, > {"pci", "lastbus=0"}, > } > > pci lastbus=0 looks interesting and so do some of the others :). > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35, kernel won't scan the 255.. buses :) > Stefan > ^ permalink raw reply [flat|nested] 68+ messages in thread
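The key/value list quoted above ends up as a guest kernel command line, which is why `{"pci", "lastbus=0"}` in the list and `pci=lastbus=0` in the reply refer to the same option. A minimal Python sketch of that mapping follows. The helper name `serialize_params` and the rule that an empty value yields a bare key are assumptions made for illustration; this is not Kata's actual code.

```python
# Illustrative only: the (key, value) pairs are copied from the list quoted in
# the thread; serialize_params() is a hypothetical helper, not a Kata API.
KERNEL_PARAMS = [
    ("tsc", "reliable"),
    ("no_timer_check", ""),
    ("rcupdate.rcu_expedited", "1"),
    ("i8042.direct", "1"),
    ("i8042.dumbkbd", "1"),
    ("i8042.nopnp", "1"),
    ("i8042.noaux", "1"),
    ("noreplace-smp", ""),
    ("reboot", "k"),
    ("console", "hvc0"),
    ("console", "hvc1"),
    ("iommu", "off"),
    ("cryptomgr.notests", ""),
    ("net.ifnames", "0"),
    ("pci", "lastbus=0"),
]


def serialize_params(params):
    """Join (key, value) pairs into a space-separated kernel command line.

    Assumption: a pair with an empty value is emitted as a bare key.
    """
    parts = []
    for key, value in params:
        parts.append(f"{key}={value}" if value else key)
    return " ".join(parts)


if __name__ == "__main__":
    print(serialize_params(KERNEL_PARAMS))
```

Under these assumptions the last entry serializes to `pci=lastbus=0`, the form that would be passed through QEMU's `-append`.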
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-19 15:42 ` Montes, Julio @ 2019-07-23 8:43 ` Sergio Lopez 2019-07-23 9:47 ` Stefan Hajnoczi 0 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-07-23 8:43 UTC (permalink / raw) To: Montes, Julio Cc: ehabkost, mst, stefanha, maran.wilson, qemu-devel, kraxel, pbonzini, rth, sgarzare [-- Attachment #1: Type: text/plain, Size: 4151 bytes --] Montes, Julio <julio.montes@intel.com> writes: > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote: >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: >> > Stefan Hajnoczi <stefanha@gmail.com> writes: >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: >> > > > Stefan Hajnoczi <stefanha@gmail.com> writes: >> > > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: >> > > > -------------- >> > > > | Conclusion | >> > > > -------------- >> > > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. >> > > > 363ms), >> > > > and is smaller on all sections (QEMU initialization, firmware >> > > > overhead >> > > > and kernel start-to-user). >> > > > >> > > > Microvm's memory tree is also visibly simpler, significantly >> > > > reducing >> > > > the exposed surface to the guest. >> > > > >> > > > While we can certainly work on making Q35 smaller, I definitely >> > > > think >> > > > it's better (and way safer!) having a specialized machine type >> > > > for a >> > > > specific use case, than a minimal Q35 whose behavior >> > > > significantly >> > > > diverges from a conventional Q35. >> > > >> > > Interesting, so not a 10x difference! This might be amenable to >> > > optimization. >> > > >> > > My concern with microvm is that it's so limited that few users >> > > will be >> > > able to benefit from the reduced attack surface and faster >> > > startup time. >> > > I think it's worth investigating slimming down Q35 further first. 
>> > > >> > > In terms of startup time the first step would be profiling Q35 >> > > kernel >> > > startup to find out what's taking so long (firmware >> > > initialization, PCI >> > > probing, etc)? >> > >> > Some findings: >> > >> > 1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") >> > saves a >> > whooping 120ms by avoiding the APIC timer calibration at >> > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock >> > >> > Average boot time with "-cpu host" >> > qemu_init_end: 76.408950 >> > linux_start_kernel: 116.166142 (+39.757192) >> > linux_start_user: 242.954347 (+126.788205) >> > >> > Average boot time with default "cpu" >> > qemu_init_end: 77.467852 >> > linux_start_kernel: 116.688472 (+39.22062) >> > linux_start_user: 363.033365 (+246.344893) >> >> \o/ >> >> > 2. The other 130ms are a direct result of PCI and ACPI presence >> > (tested >> > with a kernel without support for those elements). I'll publish >> > some >> > detailed numbers next week. >> >> Here are the Kata Containers kernel parameters: >> >> var kernelParams = []Param{ >> {"tsc", "reliable"}, >> {"no_timer_check", ""}, >> {"rcupdate.rcu_expedited", "1"}, >> {"i8042.direct", "1"}, >> {"i8042.dumbkbd", "1"}, >> {"i8042.nopnp", "1"}, >> {"i8042.noaux", "1"}, >> {"noreplace-smp", ""}, >> {"reboot", "k"}, >> {"console", "hvc0"}, >> {"console", "hvc1"}, >> {"iommu", "off"}, >> {"cryptomgr.notests", ""}, >> {"net.ifnames", "0"}, >> {"pci", "lastbus=0"}, >> } >> >> pci lastbus=0 looks interesting and so do some of the others :). >> > > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35, > kernel won't scan the 255.. buses :) I can confirm that adding pci=lastbus=0 makes a significant improvement. In fact, is the only option from Kata's kernel parameter list that has an impact, probably because the kernel is already quite minimalistic. 
Average boot time with "-cpu host" and "pci=lastbus=0"
 qemu_init_end: 73.711569
 linux_start_kernel: 113.414311 (+39.702742)
 linux_start_user: 190.949939 (+77.535628)

That's still ~40% slower than microvm, and the breach quickly widens
when adding more PCI devices (each one adds 10-15ms), but it's certainly
an improvement over the original numbers.

On the other hand, there isn't much we can do here from QEMU's
perspective, as this is basically Guest OS tuning.

Sergio.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-23 8:43 ` Sergio Lopez @ 2019-07-23 9:47 ` Stefan Hajnoczi 2019-07-23 10:01 ` Paolo Bonzini 2019-07-23 11:30 ` Stefano Garzarella 0 siblings, 2 replies; 68+ messages in thread From: Stefan Hajnoczi @ 2019-07-23 9:47 UTC (permalink / raw) To: Sergio Lopez Cc: ehabkost, mst, Montes, Julio, maran.wilson, qemu-devel, kraxel, pbonzini, rth, sgarzare On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez <slp@redhat.com> wrote: > Montes, Julio <julio.montes@intel.com> writes: > > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote: > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: > >> > Stefan Hajnoczi <stefanha@gmail.com> writes: > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: > >> > > > Stefan Hajnoczi <stefanha@gmail.com> writes: > >> > > > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > >> > > > -------------- > >> > > > | Conclusion | > >> > > > -------------- > >> > > > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. > >> > > > 363ms), > >> > > > and is smaller on all sections (QEMU initialization, firmware > >> > > > overhead > >> > > > and kernel start-to-user). > >> > > > > >> > > > Microvm's memory tree is also visibly simpler, significantly > >> > > > reducing > >> > > > the exposed surface to the guest. > >> > > > > >> > > > While we can certainly work on making Q35 smaller, I definitely > >> > > > think > >> > > > it's better (and way safer!) having a specialized machine type > >> > > > for a > >> > > > specific use case, than a minimal Q35 whose behavior > >> > > > significantly > >> > > > diverges from a conventional Q35. > >> > > > >> > > Interesting, so not a 10x difference! This might be amenable to > >> > > optimization. 
> >> > > > >> > > My concern with microvm is that it's so limited that few users > >> > > will be > >> > > able to benefit from the reduced attack surface and faster > >> > > startup time. > >> > > I think it's worth investigating slimming down Q35 further first. > >> > > > >> > > In terms of startup time the first step would be profiling Q35 > >> > > kernel > >> > > startup to find out what's taking so long (firmware > >> > > initialization, PCI > >> > > probing, etc)? > >> > > >> > Some findings: > >> > > >> > 1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") > >> > saves a > >> > whooping 120ms by avoiding the APIC timer calibration at > >> > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock > >> > > >> > Average boot time with "-cpu host" > >> > qemu_init_end: 76.408950 > >> > linux_start_kernel: 116.166142 (+39.757192) > >> > linux_start_user: 242.954347 (+126.788205) > >> > > >> > Average boot time with default "cpu" > >> > qemu_init_end: 77.467852 > >> > linux_start_kernel: 116.688472 (+39.22062) > >> > linux_start_user: 363.033365 (+246.344893) > >> > >> \o/ > >> > >> > 2. The other 130ms are a direct result of PCI and ACPI presence > >> > (tested > >> > with a kernel without support for those elements). I'll publish > >> > some > >> > detailed numbers next week. > >> > >> Here are the Kata Containers kernel parameters: > >> > >> var kernelParams = []Param{ > >> {"tsc", "reliable"}, > >> {"no_timer_check", ""}, > >> {"rcupdate.rcu_expedited", "1"}, > >> {"i8042.direct", "1"}, > >> {"i8042.dumbkbd", "1"}, > >> {"i8042.nopnp", "1"}, > >> {"i8042.noaux", "1"}, > >> {"noreplace-smp", ""}, > >> {"reboot", "k"}, > >> {"console", "hvc0"}, > >> {"console", "hvc1"}, > >> {"iommu", "off"}, > >> {"cryptomgr.notests", ""}, > >> {"net.ifnames", "0"}, > >> {"pci", "lastbus=0"}, > >> } > >> > >> pci lastbus=0 looks interesting and so do some of the others :). 
> >>
> >
> > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35,
> > kernel won't scan the 255.. buses :)
>
> I can confirm that adding pci=lastbus=0 makes a significant
> improvement. In fact, is the only option from Kata's kernel parameter
> list that has an impact, probably because the kernel is already quite
> minimalistic.
>
> Average boot time with "-cpu host" and "pci=lastbus=0"
>  qemu_init_end: 73.711569
>  linux_start_kernel: 113.414311 (+39.702742)
>  linux_start_user: 190.949939 (+77.535628)
>
> That's still ~40% slower than microvm, and the breach quickly widens
> when adding more PCI devices (each one adds 10-15ms), but it's certainly
> an improvement over the original numbers.
>
> On the other hand, there isn't much we can do here from QEMU's
> perspective, as this is basically Guest OS tuning.

fw_cfg could expose this information so guest kernels know when to
stop enumerating the PCI bus. This would make all PCI guests with new
kernels boot ~50 ms faster, regardless of machine type.

The difference between microvm and tuned Q35 is 76 ms now.

microvm:
 qemu_init_end: 64.043264
 linux_start_kernel: 65.481782 (+1.438518)
 linux_start_user: 114.938353 (+49.456571)

Q35 with -cpu host and pci=lastbus=0:
 qemu_init_end: 73.711569
 linux_start_kernel: 113.414311 (+39.702742)
 linux_start_user: 190.949939 (+77.535628)

There is a ~39 ms difference before linux_start_kernel. SeaBIOS is
loading the PVH Option ROM.

Stefano: any recommendations for profiling or tuning SeaBIOS?

Stefan

^ permalink raw reply [flat|nested] 68+ messages in thread
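To see where the 76 ms between microvm and the tuned Q35 goes, the two sets of timestamps in the message above can be split into the three phases used earlier in the thread (QEMU initialization, firmware overhead, kernel start-to-user). The following is a small Python sketch of that arithmetic; the numbers are copied from the message, while the function and phase names are only illustrative.

```python
# Timestamps in milliseconds, copied from the message above.
MICROVM = {
    "qemu_init_end": 64.043264,
    "linux_start_kernel": 65.481782,
    "linux_start_user": 114.938353,
}
Q35_TUNED = {  # Q35 with -cpu host and pci=lastbus=0
    "qemu_init_end": 73.711569,
    "linux_start_kernel": 113.414311,
    "linux_start_user": 190.949939,
}


def phases(t):
    """Turn absolute timestamps into per-phase durations (ms)."""
    return {
        "qemu_init": t["qemu_init_end"],
        "firmware": t["linux_start_kernel"] - t["qemu_init_end"],
        "kernel_to_user": t["linux_start_user"] - t["linux_start_kernel"],
    }


def phase_gap(slow, fast):
    """Per-phase difference between two runs (slow minus fast, in ms)."""
    ps, pf = phases(slow), phases(fast)
    return {name: ps[name] - pf[name] for name in ps}


if __name__ == "__main__":
    gap = phase_gap(Q35_TUNED, MICROVM)
    for name, ms in gap.items():
        print(f"{name:>15}: {ms:+.3f} ms")
    print(f"{'total':>15}: {sum(gap.values()):+.3f} ms")
```

The per-phase gaps sum to the end-to-end difference in linux_start_user, so the breakdown shows how much of the remaining gap sits in the firmware phase and how much in the kernel itself.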
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
  2019-07-23 9:47 ` Stefan Hajnoczi
@ 2019-07-23 10:01 ` Paolo Bonzini
  2019-07-24 11:14 ` Paolo Bonzini
  2019-07-23 11:30 ` Stefano Garzarella
  1 sibling, 1 reply; 68+ messages in thread
From: Paolo Bonzini @ 2019-07-23 10:01 UTC (permalink / raw)
  To: Stefan Hajnoczi, Sergio Lopez
  Cc: ehabkost, mst, Montes, Julio, maran.wilson, qemu-devel, kraxel, rth, sgarzare

On 23/07/19 11:47, Stefan Hajnoczi wrote:
> fw_cfg could expose this information so guest kernels know when to
> stop enumerating the PCI bus. This would make all PCI guests with new
> kernels boot ~50 ms faster, regardless of machine type.

The number of buses is determined by the firmware, not by QEMU, so
fw_cfg would not be the right interface. In fact (as I have just
learnt) lastbus is an x86-specific option that overrides the last bus
returned by SeaBIOS's handle_1ab101.

So the next step could be to figure out what is the lastbus returned by
handle_1ab101 and possibly why it isn't zero.

Paolo

> The difference between microvm and tuned Q35 is 76 ms now.
>
> microvm:
>  qemu_init_end: 64.043264
>  linux_start_kernel: 65.481782 (+1.438518)
>  linux_start_user: 114.938353 (+49.456571)
>
> Q35 with -cpu host and pci=lastbus=0:
>  qemu_init_end: 73.711569
>  linux_start_kernel: 113.414311 (+39.702742)
>  linux_start_user: 190.949939 (+77.535628)
>
> There is a ~39 ms difference before linux_start_kernel. SeaBIOS is
> loading the PVH Option ROM.
>
> Stefano: any recommendations for profiling or tuning SeaBIOS?

^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type
  2019-07-23 10:01 ` Paolo Bonzini
@ 2019-07-24 11:14 ` Paolo Bonzini
  2019-07-25 9:35 ` Sergio Lopez
  ` (2 more replies)
  0 siblings, 3 replies; 68+ messages in thread
From: Paolo Bonzini @ 2019-07-24 11:14 UTC (permalink / raw)
  To: Stefan Hajnoczi, Sergio Lopez
  Cc: ehabkost, mst, Montes, Julio, maran.wilson, qemu-devel, kraxel, rth, sgarzare

On 23/07/19 12:01, Paolo Bonzini wrote:
> The number of buses is determined by the firmware, not by QEMU, so
> fw_cfg would not be the right interface. In fact (as I have just
> learnt) lastbus is an x86-specific option that overrides the last bus
> returned by SeaBIOS's handle_1ab101.
>
> So the next step could be to figure out what is the lastbus returned by
> handle_1ab101 and possibly why it isn't zero.

Some update:

- for 64-bit, PCIBIOS (and thus handle_1ab101) is not called. PCIBIOS is
only used by 32-bit kernels. As a side effect, PCI expander bridges do not
work on 32-bit kernels with ACPI disabled, because they are located beyond
pcibios_last_bus (with ACPI enabled, the DSDT exposes them).

- for -M pc, pcibios_last_bus in Linux remains -1 and no "legacy scanning" is done.

- for -M q35, pcibios_last_bus in Linux is set based on the size of the
MMCONFIG aperture and Linux ends up scanning all 32*255 (bus,dev) pairs
for buses above 0.

Here is a patch that only scans devfn==0, which should mostly remove the need
for pci=lastbus=0. (Testing is welcome).

Actually, KVM could probably avoid the scanning altogether. The only "hidden" root
buses we expect are from PCI expander bridges and if you found an MMCONFIG area
through the ACPI MCFG table, you can also use the DSDT to find PCI expander bridges.
However, I am being conservative.

A possible alternative could be a mechanism whereby the vmlinuz real mode entry
point, or the 32-bit PVH entry point, fetch lastbus and they pass it to the
kernel via the vmlinuz or PVH boot information structs. However, I don't think
that's very useful, and there is some risk of breaking real hardware too.

Paolo

diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h
index 73bb404f4d2a..17012aa60d22 100644
--- a/arch/x86/include/asm/pci_x86.h
+++ b/arch/x86/include/asm/pci_x86.h
@@ -61,6 +61,7 @@ enum pci_bf_sort_state {
 extern struct pci_ops pci_root_ops;

 void pcibios_scan_specific_bus(int busn);
+void pcibios_scan_bus_by_device(int busn);

 /* pci-irq.c */

@@ -216,8 +217,10 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val)
 # endif
 # define x86_default_pci_init_irq pcibios_irq_init
 # define x86_default_pci_fixup_irqs pcibios_fixup_irqs
+# define x86_default_pci_scan_bus pcibios_scan_bus_by_device
 #else
 # define x86_default_pci_init NULL
 # define x86_default_pci_init_irq NULL
 # define x86_default_pci_fixup_irqs NULL
+# define x86_default_pci_scan_bus NULL
 #endif
diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index b85a7c54c6a1..4c3a0a17a600 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -251,6 +251,7 @@ struct x86_hyper_runtime {
  * @save_sched_clock_state: save state for sched_clock() on suspend
  * @restore_sched_clock_state: restore state for sched_clock() on resume
  * @apic_post_init: adjust apic if needed
+ * @pci_scan_bus: scan a PCI bus
  * @legacy: legacy features
  * @set_legacy_features: override legacy features. Use of this callback
  * is highly discouraged. You should only need
@@ -273,6 +274,7 @@ struct x86_platform_ops {
 	void (*save_sched_clock_state)(void);
 	void (*restore_sched_clock_state)(void);
 	void (*apic_post_init)(void);
+	void (*pci_scan_bus)(int busn);
 	struct x86_legacy_features legacy;
 	void (*set_legacy_features)(void);
 	struct x86_hyper_runtime hyper;
diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c
index 6857b4577f17..b248d7036dd3 100644
--- a/arch/x86/kernel/jailhouse.c
+++ b/arch/x86/kernel/jailhouse.c
@@ -11,12 +11,14 @@
 #include <linux/acpi_pmtmr.h>
 #include <linux/kernel.h>
 #include <linux/reboot.h>
+#include <linux/pci.h>
 #include <asm/apic.h>
 #include <asm/cpu.h>
 #include <asm/hypervisor.h>
 #include <asm/i8259.h>
 #include <asm/irqdomain.h>
 #include <asm/pci_x86.h>
+#include <asm/pci.h>
 #include <asm/reboot.h>
 #include <asm/setup.h>
 #include <asm/jailhouse_para.h>
@@ -136,6 +138,22 @@ static int __init jailhouse_pci_arch_init(void)
 	return 0;
 }

+static void jailhouse_pci_scan_bus_by_function(int busn)
+{
+	int devfn;
+	u32 l;
+
+	for (devfn = 0; devfn < 256; devfn++) {
+		if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
+		    l != 0x0000 && l != 0xffff) {
+			DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
+			pr_info("PCI: Discovered peer bus %02x\n", busn);
+			pcibios_scan_root(busn);
+			return;
+		}
+	}
+}
+
 static void __init jailhouse_init_platform(void)
 {
 	u64 pa_data = boot_params.hdr.setup_data;
@@ -153,6 +171,7 @@ static void __init jailhouse_init_platform(void)
 	x86_platform.legacy.rtc = 0;
 	x86_platform.legacy.warm_reset = 0;
 	x86_platform.legacy.i8042 = X86_LEGACY_I8042_PLATFORM_ABSENT;
+	x86_platform.pci_scan_bus = jailhouse_pci_scan_bus_by_function;

 	legacy_pic = &null_legacy_pic;

diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c
index 82caf01b63dd..59f7204ed8f3 100644
--- a/arch/x86/kernel/kvm.c
+++ b/arch/x86/kernel/kvm.c
@@ -24,6 +24,7 @@
 #include <linux/debugfs.h>
 #include <linux/nmi.h>
 #include <linux/swait.h>
+#include <linux/pci.h>
 #include <asm/timer.h>
 #include <asm/cpu.h>
 #include <asm/traps.h>
@@ -33,6 +34,7 @@
 #include <asm/apicdef.h>
 #include <asm/hypervisor.h>
 #include <asm/tlb.h>
+#include <asm/pci.h>

 static int kvmapf = 1;

@@ -621,10 +623,31 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask,
 	native_flush_tlb_others(flushmask, info);
 }

+#ifdef CONFIG_PCI
+static void kvm_pci_scan_bus(int busn)
+{
+	u32 l;
+
+	/*
+	 * Assume that there are no "hidden" buses, i.e. all PCI root buses
+	 * have a host bridge at device 0, function 0.
+	 */
+	if (!raw_pci_read(0, busn, 0, PCI_VENDOR_ID, 2, &l) &&
+	    l != 0x0000 && l != 0xffff) {
+		pr_info("PCI: Discovered peer bus %02x\n", busn);
+		pcibios_scan_root(busn);
+	}
+}
+#endif
+
 static void __init kvm_guest_init(void)
 {
 	int i;

+#ifdef CONFIG_PCI
+	x86_platform.pci_scan_bus = kvm_pci_scan_bus;
+#endif
+
 	if (!kvm_para_available())
 		return;

diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index 50a2b492fdd6..19e1cc2cb6e0 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -118,6 +118,7 @@ struct x86_platform_ops x86_platform __ro_after_init = {
 	.get_nmi_reason = default_get_nmi_reason,
 	.save_sched_clock_state = tsc_save_sched_clock_state,
 	.restore_sched_clock_state = tsc_restore_sched_clock_state,
+	.pci_scan_bus = x86_default_pci_scan_bus,
 	.hyper.pin_vcpu = x86_op_int_noop,
 };

diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c
index 467311b1eeea..6214dbce26d3 100644
--- a/arch/x86/pci/legacy.c
+++ b/arch/x86/pci/legacy.c
@@ -36,14 +36,19 @@ int __init pci_legacy_init(void)

 void pcibios_scan_specific_bus(int busn)
 {
-	int stride = jailhouse_paravirt() ? 1 : 8;
-	int devfn;
-	u32 l;
-
 	if (pci_find_bus(0, busn))
 		return;

-	for (devfn = 0; devfn < 256; devfn += stride) {
+	x86_platform.pci_scan_bus(busn);
+}
+EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus);
+
+void pcibios_scan_bus_by_device(int busn)
+{
+	int devfn;
+	u32 l;
+
+	for (devfn = 0; devfn < 256; devfn += 8) {
 		if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) &&
 		    l != 0x0000 && l != 0xffff) {
 			DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l);
@@ -53,7 +58,6 @@ void pcibios_scan_specific_bus(int busn)
 		}
 	}
 }
-EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus);

 static int __init pci_subsys_init(void)
 {

^ permalink raw reply related [flat|nested] 68+ messages in thread
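To make the effect of the patch above concrete, here is a rough Python model of how many config-space reads each scan strategy issues in the worst case, that is, when every probed bus is empty and the loop never returns early. It is only a counting sketch: the strategy labels are hypothetical names mirroring the three functions in the patch, and the default of 255 buses is taken from the "32*255" figure in the message.

```python
NUM_DEVFN = 256  # 32 devices x 8 functions per bus


def probed_devfns(strategy):
    """Return the devfn values a strategy would probe on one bus."""
    if strategy == "by_device":
        # Default scan: function 0 of every device (stride of 8).
        return list(range(0, NUM_DEVFN, 8))
    if strategy == "by_function":
        # Jailhouse scan: every function of every device (stride of 1).
        return list(range(NUM_DEVFN))
    if strategy == "kvm":
        # KVM scan in the patch: only device 0, function 0.
        return [0]
    raise ValueError(f"unknown strategy: {strategy}")


def worst_case_reads(strategy, buses=255):
    """Vendor-ID reads needed when none of the probed buses has a device."""
    return len(probed_devfns(strategy)) * buses


if __name__ == "__main__":
    for name in ("by_device", "by_function", "kvm"):
        print(f"{name:>12}: {worst_case_reads(name)} reads")
```

In this model the default scan corresponds to the 32*255 (bus,dev) pairs mentioned in the message, while the KVM variant needs one read per bus.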
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-24 11:14 ` Paolo Bonzini @ 2019-07-25 9:35 ` Sergio Lopez 2019-07-25 10:03 ` Michael S. Tsirkin 2019-07-25 14:46 ` Michael S. Tsirkin 2 siblings, 0 replies; 68+ messages in thread From: Sergio Lopez @ 2019-07-25 9:35 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, mst, Montes, Julio, Stefan Hajnoczi, maran.wilson, qemu-devel, kraxel, rth, sgarzare [-- Attachment #1: Type: text/plain, Size: 9149 bytes --] Paolo Bonzini <pbonzini@redhat.com> writes: > On 23/07/19 12:01, Paolo Bonzini wrote: >> The number of buses is determined by the firmware, not by QEMU, so >> fw_cfg would not be the right interface. In fact (as I have just >> learnt) lastbus is an x86-specific option that overrides the last bus >> returned by SeaBIOS's handle_1ab101. >> >> So the next step could be to figure out what is the lastbus returned by >> handle_1ab101 and possibly why it isn't zero. > > Some update: > > - for 64-bit, PCIBIOS (and thus handle_1ab101) is not called. PCIBIOS is > only used by 32-bit kernels. As a side effect, PCI expander bridges do not > work on 32-bit kernels with ACPI disabled, because they are located beyond > pcibios_last_bus (with ACPI enabled, the DSDT exposes them). > > - for -M pc, pcibios_last_bus in Linux remains -1 and no "legacy scanning" is done. > > - for -M q35, pcibios_last_bus in Linux is set based on the size of the > MMCONFIG aperture and Linux ends up scanning all 32*255 (bus,dev) pairs > for buses above 0. > > Here is a patch that only scans devfn==0, which should mostly remove the need > for pci=lastbus=0. (Testing is welcome). I just gave it a try. 
These are the results (avg on 10 consecutive runs):

 - Unpatched kernel:

Avg
 qemu_init_end: 75.207386
 linux_start_kernel: 115.056767 (+39.849381)
 linux_start_user: 241.020113 (+125.963346)

 - Unpatched kernel with pci=lastbus=0:

Avg
 qemu_init_end: 75.468282
 linux_start_kernel: 115.189322 (+39.72104)
 linux_start_user: 192.404823 (+77.215501)

 - Patched kernel (without pci=lastbus=0):

Avg
 qemu_init_end: 75.605627
 linux_start_kernel: 115.656557 (+40.05093)
 linux_start_user: 192.857655 (+77.201098)

Looks fine to me. There must be an extra cost in the patched kernel
vs. using pci=lastbus=0, but it's so low that it's hard to catch on the
average numbers.

> Actually, KVM could probably avoid the scanning altogether. The only "hidden" root
> buses we expect are from PCI expander bridges and if you found an MMCONFIG area
> through the ACPI MCFG table, you can also use the DSDT to find PCI expander bridges.
> However, I am being conservative.
>
> A possible alternative could be a mechanism whereby the vmlinuz real mode entry
> point, or the 32-bit PVH entry point, fetch lastbus and they pass it to the
> kernel via the vmlinuz or PVH boot information structs. However, I don't think
> that's very useful, and there is some risk of breaking real hardware too.
> > Paolo > > diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h > index 73bb404f4d2a..17012aa60d22 100644 > --- a/arch/x86/include/asm/pci_x86.h > +++ b/arch/x86/include/asm/pci_x86.h > @@ -61,6 +61,7 @@ enum pci_bf_sort_state { > extern struct pci_ops pci_root_ops; > > void pcibios_scan_specific_bus(int busn); > +void pcibios_scan_bus_by_device(int busn); > > /* pci-irq.c */ > > @@ -216,8 +217,10 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val) > # endif > # define x86_default_pci_init_irq pcibios_irq_init > # define x86_default_pci_fixup_irqs pcibios_fixup_irqs > +# define x86_default_pci_scan_bus pcibios_scan_bus_by_device > #else > # define x86_default_pci_init NULL > # define x86_default_pci_init_irq NULL > # define x86_default_pci_fixup_irqs NULL > +# define x86_default_pci_scan_bus NULL > #endif > diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h > index b85a7c54c6a1..4c3a0a17a600 100644 > --- a/arch/x86/include/asm/x86_init.h > +++ b/arch/x86/include/asm/x86_init.h > @@ -251,6 +251,7 @@ struct x86_hyper_runtime { > * @save_sched_clock_state: save state for sched_clock() on suspend > * @restore_sched_clock_state: restore state for sched_clock() on resume > * @apic_post_init: adjust apic if needed > + * @pci_scan_bus: scan a PCI bus > * @legacy: legacy features > * @set_legacy_features: override legacy features. Use of this callback > * is highly discouraged. 
You should only need > @@ -273,6 +274,7 @@ struct x86_platform_ops { > void (*save_sched_clock_state)(void); > void (*restore_sched_clock_state)(void); > void (*apic_post_init)(void); > + void (*pci_scan_bus)(int busn); > struct x86_legacy_features legacy; > void (*set_legacy_features)(void); > struct x86_hyper_runtime hyper; > diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c > index 6857b4577f17..b248d7036dd3 100644 > --- a/arch/x86/kernel/jailhouse.c > +++ b/arch/x86/kernel/jailhouse.c > @@ -11,12 +11,14 @@ > #include <linux/acpi_pmtmr.h> > #include <linux/kernel.h> > #include <linux/reboot.h> > +#include <linux/pci.h> > #include <asm/apic.h> > #include <asm/cpu.h> > #include <asm/hypervisor.h> > #include <asm/i8259.h> > #include <asm/irqdomain.h> > #include <asm/pci_x86.h> > +#include <asm/pci.h> > #include <asm/reboot.h> > #include <asm/setup.h> > #include <asm/jailhouse_para.h> > @@ -136,6 +138,22 @@ static int __init jailhouse_pci_arch_init(void) > return 0; > } > > +static void jailhouse_pci_scan_bus_by_function(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn++) { > + if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + return; > + } > + } > +} > + > static void __init jailhouse_init_platform(void) > { > u64 pa_data = boot_params.hdr.setup_data; > @@ -153,6 +171,7 @@ static void __init jailhouse_init_platform(void) > x86_platform.legacy.rtc = 0; > x86_platform.legacy.warm_reset = 0; > x86_platform.legacy.i8042 = X86_LEGACY_I8042_PLATFORM_ABSENT; > + x86_platform.pci_scan_bus = jailhouse_pci_scan_bus_by_function; > > legacy_pic = &null_legacy_pic; > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index 82caf01b63dd..59f7204ed8f3 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ 
-24,6 +24,7 @@ > #include <linux/debugfs.h> > #include <linux/nmi.h> > #include <linux/swait.h> > +#include <linux/pci.h> > #include <asm/timer.h> > #include <asm/cpu.h> > #include <asm/traps.h> > @@ -33,6 +34,7 @@ > #include <asm/apicdef.h> > #include <asm/hypervisor.h> > #include <asm/tlb.h> > +#include <asm/pci.h> > > static int kvmapf = 1; > > @@ -621,10 +623,31 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask, > native_flush_tlb_others(flushmask, info); > } > > +#ifdef CONFIG_PCI > +static void kvm_pci_scan_bus(int busn) > +{ > + u32 l; > + > + /* > + * Assume that there are no "hidden" buses, i.e. all PCI root buses > + * have a host bridge at device 0, function 0. > + */ > + if (!raw_pci_read(0, busn, 0, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + } > +} > +#endif > + > static void __init kvm_guest_init(void) > { > int i; > > +#ifdef CONFIG_PCI > + x86_platform.pci_scan_bus = kvm_pci_scan_bus; > +#endif > + > if (!kvm_para_available()) > return; > > diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c > index 50a2b492fdd6..19e1cc2cb6e0 100644 > --- a/arch/x86/kernel/x86_init.c > +++ b/arch/x86/kernel/x86_init.c > @@ -118,6 +118,7 @@ struct x86_platform_ops x86_platform __ro_after_init = { > .get_nmi_reason = default_get_nmi_reason, > .save_sched_clock_state = tsc_save_sched_clock_state, > .restore_sched_clock_state = tsc_restore_sched_clock_state, > + .pci_scan_bus = x86_default_pci_scan_bus, > .hyper.pin_vcpu = x86_op_int_noop, > }; > > diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c > index 467311b1eeea..6214dbce26d3 100644 > --- a/arch/x86/pci/legacy.c > +++ b/arch/x86/pci/legacy.c > @@ -36,14 +36,19 @@ int __init pci_legacy_init(void) > > void pcibios_scan_specific_bus(int busn) > { > - int stride = jailhouse_paravirt() ? 
1 : 8; > - int devfn; > - u32 l; > - > if (pci_find_bus(0, busn)) > return; > > - for (devfn = 0; devfn < 256; devfn += stride) { > + x86_platform.pci_scan_bus(busn); > +} > +EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > + > +void pcibios_scan_bus_by_device(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn += 8) { > if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > l != 0x0000 && l != 0xffff) { > DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > @@ -53,7 +58,6 @@ void pcibios_scan_specific_bus(int busn) > } > } > } > -EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > > static int __init pci_subsys_init(void) > { [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-24 11:14 ` Paolo Bonzini 2019-07-25 9:35 ` Sergio Lopez @ 2019-07-25 10:03 ` Michael S. Tsirkin 2019-07-25 10:55 ` Paolo Bonzini 2019-07-25 14:46 ` Michael S. Tsirkin 2 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 10:03 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On Wed, Jul 24, 2019 at 01:14:35PM +0200, Paolo Bonzini wrote: > On 23/07/19 12:01, Paolo Bonzini wrote: > > The number of buses is determined by the firmware, not by QEMU, so > > fw_cfg would not be the right interface. In fact (as I have just > > learnt) lastbus is an x86-specific option that overrides the last bus > > returned by SeaBIOS's handle_1ab101. > > > > So the next step could be to figure out what is the lastbus returned by > > handle_1ab101 and possibly why it isn't zero. > > Some update: > > - for 64-bit, PCIBIOS (and thus handle_1ab101) is not called. PCIBIOS is > only used by 32-bit kernels. As a side effect, PCI expander bridges do not > work on 32-bit kernels with ACPI disabled, because they are located beyond > pcibios_last_bus (with ACPI enabled, the DSDT exposes them). > > - for -M pc, pcibios_last_bus in Linux remains -1 and no "legacy scanning" is done. > > - for -M q35, pcibios_last_bus in Linux is set based on the size of the > MMCONFIG aperture and Linux ends up scanning all 32*255 (bus,dev) pairs > for buses above 0. > > Here is a patch that only scans devfn==0, which should mostly remove the need > for pci=lastbus=0. (Testing is welcome). > > Actually, KVM could probably avoid the scanning altogether. The only "hidden" root > buses we expect are from PCI expander bridges and if you found an MMCONFIG area > through the ACPI MCFG table, you can also use the DSDT to find PCI expander bridges. > However, I am being conservative. 
> > A possible alternative could be a mechanism whereby the vmlinuz real mode entry > point, or the 32-bit PVH entry point, fetch lastbus and they pass it to the > kernel via the vmlinuz or PVH boot information structs. However, I don't think > that's very useful, and there is some risk of breaking real hardware too. > > Paolo > > diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h > index 73bb404f4d2a..17012aa60d22 100644 > --- a/arch/x86/include/asm/pci_x86.h > +++ b/arch/x86/include/asm/pci_x86.h > @@ -61,6 +61,7 @@ enum pci_bf_sort_state { > extern struct pci_ops pci_root_ops; > > void pcibios_scan_specific_bus(int busn); > +void pcibios_scan_bus_by_device(int busn); > > /* pci-irq.c */ > > @@ -216,8 +217,10 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val) > # endif > # define x86_default_pci_init_irq pcibios_irq_init > # define x86_default_pci_fixup_irqs pcibios_fixup_irqs > +# define x86_default_pci_scan_bus pcibios_scan_bus_by_device > #else > # define x86_default_pci_init NULL > # define x86_default_pci_init_irq NULL > # define x86_default_pci_fixup_irqs NULL > +# define x86_default_pci_scan_bus NULL > #endif > diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h > index b85a7c54c6a1..4c3a0a17a600 100644 > --- a/arch/x86/include/asm/x86_init.h > +++ b/arch/x86/include/asm/x86_init.h > @@ -251,6 +251,7 @@ struct x86_hyper_runtime { > * @save_sched_clock_state: save state for sched_clock() on suspend > * @restore_sched_clock_state: restore state for sched_clock() on resume > * @apic_post_init: adjust apic if needed > + * @pci_scan_bus: scan a PCI bus > * @legacy: legacy features > * @set_legacy_features: override legacy features. Use of this callback > * is highly discouraged. 
You should only need > @@ -273,6 +274,7 @@ struct x86_platform_ops { > void (*save_sched_clock_state)(void); > void (*restore_sched_clock_state)(void); > void (*apic_post_init)(void); > + void (*pci_scan_bus)(int busn); > struct x86_legacy_features legacy; > void (*set_legacy_features)(void); > struct x86_hyper_runtime hyper; > diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c > index 6857b4577f17..b248d7036dd3 100644 > --- a/arch/x86/kernel/jailhouse.c > +++ b/arch/x86/kernel/jailhouse.c > @@ -11,12 +11,14 @@ > #include <linux/acpi_pmtmr.h> > #include <linux/kernel.h> > #include <linux/reboot.h> > +#include <linux/pci.h> > #include <asm/apic.h> > #include <asm/cpu.h> > #include <asm/hypervisor.h> > #include <asm/i8259.h> > #include <asm/irqdomain.h> > #include <asm/pci_x86.h> > +#include <asm/pci.h> > #include <asm/reboot.h> > #include <asm/setup.h> > #include <asm/jailhouse_para.h> > @@ -136,6 +138,22 @@ static int __init jailhouse_pci_arch_init(void) > return 0; > } > > +static void jailhouse_pci_scan_bus_by_function(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn++) { > + if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + return; > + } > + } > +} > + > static void __init jailhouse_init_platform(void) > { > u64 pa_data = boot_params.hdr.setup_data; > @@ -153,6 +171,7 @@ static void __init jailhouse_init_platform(void) > x86_platform.legacy.rtc = 0; > x86_platform.legacy.warm_reset = 0; > x86_platform.legacy.i8042 = X86_LEGACY_I8042_PLATFORM_ABSENT; > + x86_platform.pci_scan_bus = jailhouse_pci_scan_bus_by_function; > > legacy_pic = &null_legacy_pic; > > diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index 82caf01b63dd..59f7204ed8f3 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ 
-24,6 +24,7 @@ > #include <linux/debugfs.h> > #include <linux/nmi.h> > #include <linux/swait.h> > +#include <linux/pci.h> > #include <asm/timer.h> > #include <asm/cpu.h> > #include <asm/traps.h> > @@ -33,6 +34,7 @@ > #include <asm/apicdef.h> > #include <asm/hypervisor.h> > #include <asm/tlb.h> > +#include <asm/pci.h> > > static int kvmapf = 1; > > @@ -621,10 +623,31 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask, > native_flush_tlb_others(flushmask, info); > } > > +#ifdef CONFIG_PCI > +static void kvm_pci_scan_bus(int busn) > +{ > + u32 l; > + > + /* > + * Assume that there are no "hidden" buses, i.e. all PCI root buses > + * have a host bridge at device 0, function 0. > + */ > + if (!raw_pci_read(0, busn, 0, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + } > +} > +#endif > + > static void __init kvm_guest_init(void) > { > int i; > > +#ifdef CONFIG_PCI > + x86_platform.pci_scan_bus = kvm_pci_scan_bus; > +#endif > + > if (!kvm_para_available()) > return; > Shouldn't this happen after kvm_para_available? In fact, let's add a CPU ID flag for this, so it's easy to tell guest whether to scan extra buses. What do you say? 
> diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c > index 50a2b492fdd6..19e1cc2cb6e0 100644 > --- a/arch/x86/kernel/x86_init.c > +++ b/arch/x86/kernel/x86_init.c > @@ -118,6 +118,7 @@ struct x86_platform_ops x86_platform __ro_after_init = { > .get_nmi_reason = default_get_nmi_reason, > .save_sched_clock_state = tsc_save_sched_clock_state, > .restore_sched_clock_state = tsc_restore_sched_clock_state, > + .pci_scan_bus = x86_default_pci_scan_bus, > .hyper.pin_vcpu = x86_op_int_noop, > }; > > diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c > index 467311b1eeea..6214dbce26d3 100644 > --- a/arch/x86/pci/legacy.c > +++ b/arch/x86/pci/legacy.c > @@ -36,14 +36,19 @@ int __init pci_legacy_init(void) > > void pcibios_scan_specific_bus(int busn) > { > - int stride = jailhouse_paravirt() ? 1 : 8; > - int devfn; > - u32 l; > - > if (pci_find_bus(0, busn)) > return; > > - for (devfn = 0; devfn < 256; devfn += stride) { > + x86_platform.pci_scan_bus(busn); > +} > +EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > + > +void pcibios_scan_bus_by_device(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn += 8) { > if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > l != 0x0000 && l != 0xffff) { > DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > @@ -53,7 +58,6 @@ void pcibios_scan_specific_bus(int busn) > } > } > } > -EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > > static int __init pci_subsys_init(void) > { ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 10:03 ` Michael S. Tsirkin @ 2019-07-25 10:55 ` Paolo Bonzini 0 siblings, 0 replies; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 10:55 UTC (permalink / raw) To: Michael S. Tsirkin Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, sgarzare, rth On 25/07/19 12:03, Michael S. Tsirkin wrote: >> +#ifdef CONFIG_PCI >> + x86_platform.pci_scan_bus = kvm_pci_scan_bus; >> +#endif >> + >> if (!kvm_para_available()) >> return; >> > Shouldn't this happen after kvm_para_available? Actually kvm_para_available() is not needed anymore, since this only runs after kvm_detect() has returned true. > In fact, let's add a CPU ID flag for this, so it's > easy to tell guest whether to scan extra buses. > What do you say? I think it would make it much harder to deploy this, since it relies on having new userspace and new machine types. This patch is basically a reflection of the status quo, which is that there are generally no "hidden" buses on commonly-used KVM userspaces, and even in the weird configurations that have them there is always something at devfn=0. (On real hardware, the only such hidden bus is e.g. 0x7f/0xff, which have a bunch of QPI and MCH-related devices. This is not something you'd have in a virtual machine). Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
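For reference, the three probing strategies in the patch differ mainly in how many config-space reads they issue per candidate bus. What follows is a userspace model, not kernel code: fake_bus, reads_issued, fake_vendor_read() and reset_empty_bus() are made-up names that stand in for raw_pci_read(); only the loop shapes (stride 8, stride 1, devfn 0 only) and the vendor-ID check are taken from the patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for one bus of config space: vendor ID per devfn. */
static uint16_t fake_bus[256];
static unsigned int reads_issued;

/* Hypothetical replacement for raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l). */
static uint16_t fake_vendor_read(int devfn)
{
    reads_issued++;
    return fake_bus[devfn];
}

/* Same validity test as the patch: 0x0000 and 0xffff mean "nothing here". */
static int vendor_present(uint16_t l)
{
    return l != 0x0000 && l != 0xffff;
}

/*
 * Loop shape shared by pcibios_scan_bus_by_device() (stride 8) and
 * jailhouse_pci_scan_bus_by_function() (stride 1): stop at the first hit.
 */
static int scan_with_stride(int stride)
{
    int devfn;

    for (devfn = 0; devfn < 256; devfn += stride)
        if (vendor_present(fake_vendor_read(devfn)))
            return 1;
    return 0;
}

/* Shape of kvm_pci_scan_bus(): only device 0, function 0 is probed. */
static int scan_devfn0_only(void)
{
    return vendor_present(fake_vendor_read(0));
}

/* Reset the model: an empty bus reads all-ones everywhere. */
static void reset_empty_bus(void)
{
    memset(fake_bus, 0xff, sizeof(fake_bus));
    reads_issued = 0;
}
```

In this model an empty bus costs 32, 256 and 1 reads respectively. The price of the cheapest variant is that a bus whose only device sits at a devfn other than 0 is not discovered, which is the "hidden bus" case discussed above.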
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-24 11:14 ` Paolo Bonzini 2019-07-25 9:35 ` Sergio Lopez 2019-07-25 10:03 ` Michael S. Tsirkin @ 2019-07-25 14:46 ` Michael S. Tsirkin 2019-07-25 15:35 ` Paolo Bonzini 2 siblings, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 14:46 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On Wed, Jul 24, 2019 at 01:14:35PM +0200, Paolo Bonzini wrote: > On 23/07/19 12:01, Paolo Bonzini wrote: > > The number of buses is determined by the firmware, not by QEMU, so > > fw_cfg would not be the right interface. In fact (as I have just > > learnt) lastbus is an x86-specific option that overrides the last bus > > returned by SeaBIOS's handle_1ab101. > > > > So the next step could be to figure out what is the lastbus returned by > > handle_1ab101 and possibly why it isn't zero. > > Some update: > > - for 64-bit, PCIBIOS (and thus handle_1ab101) is not called. PCIBIOS is > only used by 32-bit kernels. As a side effect, PCI expander bridges do not > work on 32-bit kernels with ACPI disabled, because they are located beyond > pcibios_last_bus (with ACPI enabled, the DSDT exposes them). > > - for -M pc, pcibios_last_bus in Linux remains -1 and no "legacy scanning" is done. > > - for -M q35, pcibios_last_bus in Linux is set based on the size of the > MMCONFIG aperture and Linux ends up scanning all 32*255 (bus,dev) pairs > for buses above 0. > > Here is a patch that only scans devfn==0, which should mostly remove the need > for pci=lastbus=0. (Testing is welcome). Actually, I think I have a better idea. At the moment we just get an exit on these reads and return all-ones. Yes, in theory there could be a UR bit set in a bunch of registers but in practice no one cares about these, and I don't think we implement them. So how about mapping a single page, read-only, and filling it with all-ones? 
We'll still run the code within linux but it will be free. What do you think? > Actually, KVM could probably avoid the scanning altogether. The only "hidden" root > buses we expect are from PCI expander bridges and if you found an MMCONFIG area > through the ACPI MCFG table, you can also use the DSDT to find PCI expander bridges. > However, I am being conservative. > > A possible alternative could be a mechanism whereby the vmlinuz real mode entry > point, or the 32-bit PVH entry point, fetch lastbus and they pass it to the > kernel via the vmlinuz or PVH boot information structs. However, I don't think > that's very useful, and there is some risk of breaking real hardware too. > > Paolo > > diff --git a/arch/x86/include/asm/pci_x86.h b/arch/x86/include/asm/pci_x86.h > index 73bb404f4d2a..17012aa60d22 100644 > --- a/arch/x86/include/asm/pci_x86.h > +++ b/arch/x86/include/asm/pci_x86.h > @@ -61,6 +61,7 @@ enum pci_bf_sort_state { > extern struct pci_ops pci_root_ops; > > void pcibios_scan_specific_bus(int busn); > +void pcibios_scan_bus_by_device(int busn); > > /* pci-irq.c */ > > @@ -216,8 +217,10 @@ static inline void mmio_config_writel(void __iomem *pos, u32 val) > # endif > # define x86_default_pci_init_irq pcibios_irq_init > # define x86_default_pci_fixup_irqs pcibios_fixup_irqs > +# define x86_default_pci_scan_bus pcibios_scan_bus_by_device > #else > # define x86_default_pci_init NULL > # define x86_default_pci_init_irq NULL > # define x86_default_pci_fixup_irqs NULL > +# define x86_default_pci_scan_bus NULL > #endif > diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h > index b85a7c54c6a1..4c3a0a17a600 100644 > --- a/arch/x86/include/asm/x86_init.h > +++ b/arch/x86/include/asm/x86_init.h > @@ -251,6 +251,7 @@ struct x86_hyper_runtime { > * @save_sched_clock_state: save state for sched_clock() on suspend > * @restore_sched_clock_state: restore state for sched_clock() on resume > * @apic_post_init: adjust apic if needed > + * 
@pci_scan_bus: scan a PCI bus > * @legacy: legacy features > * @set_legacy_features: override legacy features. Use of this callback > * is highly discouraged. You should only need > @@ -273,6 +274,7 @@ struct x86_platform_ops { > void (*save_sched_clock_state)(void); > void (*restore_sched_clock_state)(void); > void (*apic_post_init)(void); > + void (*pci_scan_bus)(int busn); > struct x86_legacy_features legacy; > void (*set_legacy_features)(void); > struct x86_hyper_runtime hyper; > diff --git a/arch/x86/kernel/jailhouse.c b/arch/x86/kernel/jailhouse.c > index 6857b4577f17..b248d7036dd3 100644 > --- a/arch/x86/kernel/jailhouse.c > +++ b/arch/x86/kernel/jailhouse.c > @@ -11,12 +11,14 @@ > #include <linux/acpi_pmtmr.h> > #include <linux/kernel.h> > #include <linux/reboot.h> > +#include <linux/pci.h> > #include <asm/apic.h> > #include <asm/cpu.h> > #include <asm/hypervisor.h> > #include <asm/i8259.h> > #include <asm/irqdomain.h> > #include <asm/pci_x86.h> > +#include <asm/pci.h> > #include <asm/reboot.h> > #include <asm/setup.h> > #include <asm/jailhouse_para.h> > @@ -136,6 +138,22 @@ static int __init jailhouse_pci_arch_init(void) > return 0; > } > > +static void jailhouse_pci_scan_bus_by_function(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn++) { > + if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + return; > + } > + } > +} > + > static void __init jailhouse_init_platform(void) > { > u64 pa_data = boot_params.hdr.setup_data; > @@ -153,6 +171,7 @@ static void __init jailhouse_init_platform(void) > x86_platform.legacy.rtc = 0; > x86_platform.legacy.warm_reset = 0; > x86_platform.legacy.i8042 = X86_LEGACY_I8042_PLATFORM_ABSENT; > + x86_platform.pci_scan_bus = jailhouse_pci_scan_bus_by_function; > > legacy_pic = &null_legacy_pic; > > diff 
--git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c > index 82caf01b63dd..59f7204ed8f3 100644 > --- a/arch/x86/kernel/kvm.c > +++ b/arch/x86/kernel/kvm.c > @@ -24,6 +24,7 @@ > #include <linux/debugfs.h> > #include <linux/nmi.h> > #include <linux/swait.h> > +#include <linux/pci.h> > #include <asm/timer.h> > #include <asm/cpu.h> > #include <asm/traps.h> > @@ -33,6 +34,7 @@ > #include <asm/apicdef.h> > #include <asm/hypervisor.h> > #include <asm/tlb.h> > +#include <asm/pci.h> > > static int kvmapf = 1; > > @@ -621,10 +623,31 @@ static void kvm_flush_tlb_others(const struct cpumask *cpumask, > native_flush_tlb_others(flushmask, info); > } > > +#ifdef CONFIG_PCI > +static void kvm_pci_scan_bus(int busn) > +{ > + u32 l; > + > + /* > + * Assume that there are no "hidden" buses, i.e. all PCI root buses > + * have a host bridge at device 0, function 0. > + */ > + if (!raw_pci_read(0, busn, 0, PCI_VENDOR_ID, 2, &l) && > + l != 0x0000 && l != 0xffff) { > + pr_info("PCI: Discovered peer bus %02x\n", busn); > + pcibios_scan_root(busn); > + } > +} > +#endif > + > static void __init kvm_guest_init(void) > { > int i; > > +#ifdef CONFIG_PCI > + x86_platform.pci_scan_bus = kvm_pci_scan_bus; > +#endif > + > if (!kvm_para_available()) > return; > > diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c > index 50a2b492fdd6..19e1cc2cb6e0 100644 > --- a/arch/x86/kernel/x86_init.c > +++ b/arch/x86/kernel/x86_init.c > @@ -118,6 +118,7 @@ struct x86_platform_ops x86_platform __ro_after_init = { > .get_nmi_reason = default_get_nmi_reason, > .save_sched_clock_state = tsc_save_sched_clock_state, > .restore_sched_clock_state = tsc_restore_sched_clock_state, > + .pci_scan_bus = x86_default_pci_scan_bus, > .hyper.pin_vcpu = x86_op_int_noop, > }; > > diff --git a/arch/x86/pci/legacy.c b/arch/x86/pci/legacy.c > index 467311b1eeea..6214dbce26d3 100644 > --- a/arch/x86/pci/legacy.c > +++ b/arch/x86/pci/legacy.c > @@ -36,14 +36,19 @@ int __init pci_legacy_init(void) > > void 
pcibios_scan_specific_bus(int busn) > { > - int stride = jailhouse_paravirt() ? 1 : 8; > - int devfn; > - u32 l; > - > if (pci_find_bus(0, busn)) > return; > > - for (devfn = 0; devfn < 256; devfn += stride) { > + x86_platform.pci_scan_bus(busn); > +} > +EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > + > +void pcibios_scan_bus_by_device(int busn) > +{ > + int devfn; > + u32 l; > + > + for (devfn = 0; devfn < 256; devfn += 8) { > if (!raw_pci_read(0, busn, devfn, PCI_VENDOR_ID, 2, &l) && > l != 0x0000 && l != 0xffff) { > DBG("Found device at %02x:%02x [%04x]\n", busn, devfn, l); > @@ -53,7 +58,6 @@ void pcibios_scan_specific_bus(int busn) > } > } > } > -EXPORT_SYMBOL_GPL(pcibios_scan_specific_bus); > > static int __init pci_subsys_init(void) > { ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 14:46 ` Michael S. Tsirkin @ 2019-07-25 15:35 ` Paolo Bonzini 2019-07-25 17:33 ` Michael S. Tsirkin 2019-07-25 20:30 ` Michael S. Tsirkin 0 siblings, 2 replies; 68+ messages in thread From: Paolo Bonzini @ 2019-07-25 15:35 UTC (permalink / raw) To: Michael S. Tsirkin Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On 25/07/19 16:46, Michael S. Tsirkin wrote: > Actually, I think I have a better idea. > At the moment we just get an exit on these reads and return all-ones. > Yes, in theory there could be a UR bit set in a bunch of > registers but in practice no one cares about these, > and I don't think we implement them. > So how about mapping a single page, read-only, and filling it > with all-ones? Yes, that's nice indeed. :) But it does have some cost, in terms of either number of VMAs or QEMU RSS since the MMCONFIG area is large. What breaks if we return all zeroes? Zero is not a valid vendor ID. Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 15:35 ` Paolo Bonzini @ 2019-07-25 17:33 ` Michael S. Tsirkin 2019-07-25 20:30 ` Michael S. Tsirkin 1 sibling, 0 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 17:33 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote: > On 25/07/19 16:46, Michael S. Tsirkin wrote: > > Actually, I think I have a better idea. > > At the moment we just get an exit on these reads and return all-ones. > > Yes, in theory there could be a UR bit set in a bunch of > > registers but in practice no one cares about these, > > and I don't think we implement them. > > So how about mapping a single page, read-only, and filling it > > with all-ones? > > Yes, that's nice indeed. :) But it does have some cost, in terms of > either number of VMAs or QEMU RSS since the MMCONFIG area is large. > > What breaks if we return all zeroes? Zero is not a valid vendor ID. > > Paolo It isn't but that's not what baremetal does. So there's some risk there ... Why is all zeroes better? We still need to map it, right? -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 15:35 ` Paolo Bonzini 2019-07-25 17:33 ` Michael S. Tsirkin @ 2019-07-25 20:30 ` Michael S. Tsirkin 2019-07-26 7:57 ` Paolo Bonzini 1 sibling, 1 reply; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-25 20:30 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote: > On 25/07/19 16:46, Michael S. Tsirkin wrote: > > Actually, I think I have a better idea. > > At the moment we just get an exit on these reads and return all-ones. > > Yes, in theory there could be a UR bit set in a bunch of > > registers but in practice no one cares about these, > > and I don't think we implement them. > > So how about mapping a single page, read-only, and filling it > > with all-ones? > > Yes, that's nice indeed. :) But it does have some cost, in terms of > either number of VMAs or QEMU RSS since the MMCONFIG area is large. > > What breaks if we return all zeroes? Zero is not a valid vendor ID. > > Paolo I think I know what you are thinking of doing: map /dev/zero so we get a single VMA but all mapped to a single zero pte? We could start with that, at least as an experiment. Further: - we can limit the amount of fragmentation and simply unmap everything if we exceed a specific limit: with more than X devices it's no longer a lightweight VM anyway :) - we can implement /dev/ones. in fact, we can implement /dev/byteXX for each possible value, the cost will be only 1M on a 4k page system. it might come in handy for e.g. free page hinting: at the moment if guest memory is poisoned we can not unmap it, with this trick we can map it to /dev/byteXX. Note that the kvm memory array is still fragmented. Again, we can fallback on disabling the optimization if there are too many devices. -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
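The sizes behind this proposal can be checked with a little arithmetic. The sketch below assumes the standard MMCONFIG (ECAM) layout of 4 KiB per function, 8 functions per device and 32 devices per bus; the function and macro names are invented for illustration and do not come from QEMU or Linux.

```c
#include <stdint.h>

#define PAGE_SIZE_4K      4096ULL
#define ECAM_FN_SIZE      4096ULL  /* one 4 KiB config space per function */
#define ECAM_FNS_PER_DEV  8ULL
#define ECAM_DEVS_PER_BUS 32ULL

/* Bytes of MMCONFIG space decoded for nr_buses buses. */
static uint64_t ecam_bytes(uint64_t nr_buses)
{
    return nr_buses * ECAM_DEVS_PER_BUS * ECAM_FNS_PER_DEV * ECAM_FN_SIZE;
}

/* 4 KiB pages needed if every function page had its own backing page. */
static uint64_t ecam_pages(uint64_t nr_buses)
{
    return ecam_bytes(nr_buses) / PAGE_SIZE_4K;
}

/* Backing memory for one constant-filled page per byte value 0x00..0xff. */
static uint64_t byte_pages_bytes(void)
{
    return 256ULL * PAGE_SIZE_4K;
}
```

With 256 buses this gives a 256 MiB aperture (65536 pages), while one shared page per possible byte value costs 256 * 4 KiB = 1 MiB, matching the "only 1M on a 4k page system" estimate.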
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-25 20:30 ` Michael S. Tsirkin @ 2019-07-26 7:57 ` Paolo Bonzini 2019-07-26 11:10 ` Michael S. Tsirkin 0 siblings, 1 reply; 68+ messages in thread From: Paolo Bonzini @ 2019-07-26 7:57 UTC (permalink / raw) To: Michael S. Tsirkin Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On 25/07/19 22:30, Michael S. Tsirkin wrote: > On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote: >> On 25/07/19 16:46, Michael S. Tsirkin wrote: >>> Actually, I think I have a better idea. >>> At the moment we just get an exit on these reads and return all-ones. >>> Yes, in theory there could be a UR bit set in a bunch of >>> registers but in practice no one cares about these, >>> and I don't think we implement them. >>> So how about mapping a single page, read-only, and filling it >>> with all-ones? >> >> Yes, that's nice indeed. :) But it does have some cost, in terms of >> either number of VMAs or QEMU RSS since the MMCONFIG area is large. >> >> What breaks if we return all zeroes? Zero is not a valid vendor ID. >> >> Paolo > > I think I know what you are thinking of doing: > map /dev/zero so we get a single VMA but all mapped to > a single zero pte? Yes, exactly. You absolutely need to share the page because the guest could easily touch 32*256 pages just to scan function 0 on every bus and device, even if the VM has just 4 or 5 devices and all of them on the root complex. And that causes fragmentation so you have to map bigger areas. > - we can implement /dev/ones. in fact, we can implement > /dev/byteXX for each possible value, the cost will > be only 1M on a 4k page system. > it might come in handy for e.g. free page hinting: > at the moment if guest memory is poisoned > we can not unmap it, with this trick we can > map it to /dev/byteXX. I also thought of /dev/ones, not sure how it would be accepted. 
:) Also you cannot map lazily on page fault, otherwise you get a vmexit and it's slow again. So /dev/ones needs to be written to use a huge page, possibly. Paolo ^ permalink raw reply [flat|nested] 68+ messages in thread
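A minimal userspace sketch of the /dev/zero idea, under stated assumptions: the aperture is taken to be 256 MiB (256 buses), the helper names are hypothetical, and nothing here touches KVM memory slots. It only shows that a single read-only private mapping of /dev/zero covers the whole range with one mmap() call and that every vendor-ID read returns 0x0000, which the scan loop in the patch treats as "no device". On Linux, read faults on such a mapping are normally satisfied from the shared zero page, but the code does not verify that.

```c
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define ECAM_SIZE (256UL << 20)  /* assumed aperture: 256 buses x 1 MiB */

/* Map the whole aperture read-only with one mmap() call. */
static void *map_zero_aperture(void)
{
    void *p;
    int fd = open("/dev/zero", O_RDONLY);

    if (fd < 0)
        return NULL;
    p = mmap(NULL, ECAM_SIZE, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);
    return p == MAP_FAILED ? NULL : p;
}

/* ECAM offset of the config space for (bus, devfn): 4 KiB per function. */
static size_t ecam_offset(unsigned int bus, unsigned int devfn)
{
    return ((size_t)bus << 20) | ((size_t)devfn << 12);
}

/* Read the 16-bit vendor ID at offset 0 of a function's config space. */
static uint16_t read_vendor(const void *base, unsigned int bus,
                            unsigned int devfn)
{
    uint16_t v;

    memcpy(&v, (const char *)base + ecam_offset(bus, devfn), sizeof(v));
    return v;
}

/*
 * Probe function 0 of every device on every bus. Returns the number of
 * distinct pages touched and stores the count of non-zero vendor IDs.
 */
static unsigned int scan_all_fn0(const void *base, unsigned int *nonzero)
{
    unsigned int bus, dev, touched = 0;

    *nonzero = 0;
    for (bus = 0; bus < 256; bus++) {
        for (dev = 0; dev < 32; dev++) {
            if (read_vendor(base, bus, dev << 3) != 0)
                (*nonzero)++;
            touched++;
        }
    }
    return touched;
}
```

The full function-0 scan touches 32 * 256 = 8192 different pages, which is why the backing page has to be shared rather than allocated per access.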
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-26 7:57 ` Paolo Bonzini @ 2019-07-26 11:10 ` Michael S. Tsirkin 0 siblings, 0 replies; 68+ messages in thread From: Michael S. Tsirkin @ 2019-07-26 11:10 UTC (permalink / raw) To: Paolo Bonzini Cc: ehabkost, Sergio Lopez, maran.wilson, Montes, Julio, Stefan Hajnoczi, qemu-devel, kraxel, rth, sgarzare On Fri, Jul 26, 2019 at 09:57:51AM +0200, Paolo Bonzini wrote: > On 25/07/19 22:30, Michael S. Tsirkin wrote: > > On Thu, Jul 25, 2019 at 05:35:01PM +0200, Paolo Bonzini wrote: > >> On 25/07/19 16:46, Michael S. Tsirkin wrote: > >>> Actually, I think I have a better idea. > >>> At the moment we just get an exit on these reads and return all-ones. > >>> Yes, in theory there could be a UR bit set in a bunch of > >>> registers but in practice no one cares about these, > >>> and I don't think we implement them. > >>> So how about mapping a single page, read-only, and filling it > >>> with all-ones? > >> > >> Yes, that's nice indeed. :) But it does have some cost, in terms of > >> either number of VMAs or QEMU RSS since the MMCONFIG area is large. > >> > >> What breaks if we return all zeroes? Zero is not a valid vendor ID. > >> > >> Paolo > > > > I think I know what you are thinking of doing: > > map /dev/zero so we get a single VMA but all mapped to > > a single zero pte? > > Yes, exactly. You absolutely need to share the page because the guest > could easily touch 32*256 pages just to scan function 0 on every bus and > device, even if the VM has just 4 or 5 devices and all of them on the > root complex. And that causes fragmentation so you have to map bigger > areas. > > > - we can implement /dev/ones. in fact, we can implement > > /dev/byteXX for each possible value, the cost will > > be only 1M on a 4k page system. > > it might come in handy for e.g. free page hinting: > > at the moment if guest memory is poisoned > > we can not unmap it, with this trick we can > > map it to /dev/byteXX. 
> > I also thought of /dev/ones, not sure how it would be accepted. :) Also > you cannot map lazily on page fault, otherwise you get a vmexit and it's > slow again. So /dev/ones needs to be written to use a huge page, possibly. > > Paolo It's not easy to do that - each device gets 4K within MCFG. So what we need then is a kvm option to create an address range - or maybe even a group of address ranges and aggressively map all pages in a group to the same guest page on a fault of one page in the group. -- MST ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-23 9:47 ` Stefan Hajnoczi 2019-07-23 10:01 ` Paolo Bonzini @ 2019-07-23 11:30 ` Stefano Garzarella 2019-07-24 15:23 ` Stefano Garzarella 1 sibling, 1 reply; 68+ messages in thread From: Stefano Garzarella @ 2019-07-23 11:30 UTC (permalink / raw) To: Stefan Hajnoczi Cc: ehabkost, Sergio Lopez, mst, Montes, Julio, maran.wilson, qemu-devel, kraxel, pbonzini, rth On Tue, Jul 23, 2019 at 10:47:39AM +0100, Stefan Hajnoczi wrote: > On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez <slp@redhat.com> wrote: > > Montes, Julio <julio.montes@intel.com> writes: > > > > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote: > > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: > > >> > Stefan Hajnoczi <stefanha@gmail.com> writes: > > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: > > >> > > > Stefan Hajnoczi <stefanha@gmail.com> writes: > > >> > > > > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > > >> > > > -------------- > > >> > > > | Conclusion | > > >> > > > -------------- > > >> > > > > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. > > >> > > > 363ms), > > >> > > > and is smaller on all sections (QEMU initialization, firmware > > >> > > > overhead > > >> > > > and kernel start-to-user). > > >> > > > > > >> > > > Microvm's memory tree is also visibly simpler, significantly > > >> > > > reducing > > >> > > > the exposed surface to the guest. > > >> > > > > > >> > > > While we can certainly work on making Q35 smaller, I definitely > > >> > > > think > > >> > > > it's better (and way safer!) having a specialized machine type > > >> > > > for a > > >> > > > specific use case, than a minimal Q35 whose behavior > > >> > > > significantly > > >> > > > diverges from a conventional Q35. > > >> > > > > >> > > Interesting, so not a 10x difference! This might be amenable to > > >> > > optimization. 
> > >> > > > > >> > > My concern with microvm is that it's so limited that few users > > >> > > will be > > >> > > able to benefit from the reduced attack surface and faster > > >> > > startup time. > > >> > > I think it's worth investigating slimming down Q35 further first. > > >> > > > > >> > > In terms of startup time the first step would be profiling Q35 > > >> > > kernel > > >> > > startup to find out what's taking so long (firmware > > >> > > initialization, PCI > > >> > > probing, etc)? > > >> > > > >> > Some findings: > > >> > > > >> > 1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") > > >> > saves a > > >> > whopping 120ms by avoiding the APIC timer calibration at > > >> > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock > > >> > > > >> > Average boot time with "-cpu host" > > >> > qemu_init_end: 76.408950 > > >> > linux_start_kernel: 116.166142 (+39.757192) > > >> > linux_start_user: 242.954347 (+126.788205) > > >> > > > >> > Average boot time with default "cpu" > > >> > qemu_init_end: 77.467852 > > >> > linux_start_kernel: 116.688472 (+39.22062) > > >> > linux_start_user: 363.033365 (+246.344893) > > >> > > >> \o/ > > >> > > >> > 2. The other 130ms are a direct result of PCI and ACPI presence > > >> > (tested > > >> > with a kernel without support for those elements). I'll publish > > >> > some > > >> > detailed numbers next week. 
> > >> > > >> Here are the Kata Containers kernel parameters: > > >> > > >> var kernelParams = []Param{ > > >> {"tsc", "reliable"}, > > >> {"no_timer_check", ""}, > > >> {"rcupdate.rcu_expedited", "1"}, > > >> {"i8042.direct", "1"}, > > >> {"i8042.dumbkbd", "1"}, > > >> {"i8042.nopnp", "1"}, > > >> {"i8042.noaux", "1"}, > > >> {"noreplace-smp", ""}, > > >> {"reboot", "k"}, > > >> {"console", "hvc0"}, > > >> {"console", "hvc1"}, > > >> {"iommu", "off"}, > > >> {"cryptomgr.notests", ""}, > > >> {"net.ifnames", "0"}, > > >> {"pci", "lastbus=0"}, > > >> } > > >> > > >> pci lastbus=0 looks interesting and so do some of the others :). > > >> > > > > > > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35, > > > kernel won't scan the 255.. buses :) > > > > I can confirm that adding pci=lastbus=0 makes a significant > > improvement. In fact, it is the only option from Kata's kernel parameter > > list that has an impact, probably because the kernel is already quite > > minimalistic. > > > > Average boot time with "-cpu host" and "pci=lastbus=0" > > qemu_init_end: 73.711569 > > linux_start_kernel: 113.414311 (+39.702742) > > linux_start_user: 190.949939 (+77.535628) > > > > That's still ~40% slower than microvm, and the gap quickly widens > > when adding more PCI devices (each one adds 10-15ms), but it's certainly > > an improvement over the original numbers. > > > > On the other hand, there isn't much we can do here from QEMU's > > perspective, as this is basically Guest OS tuning. > > fw_cfg could expose this information so guest kernels know when to > stop enumerating the PCI bus. This would make all PCI guests with new > kernels boot ~50 ms faster, regardless of machine type. > > The difference between microvm and tuned Q35 is 76 ms now. 
> > microvm: > qemu_init_end: 64.043264 > linux_start_kernel: 65.481782 (+1.438518) > linux_start_user: 114.938353 (+49.456571) > > Q35 with -cpu host and pci=lastbus=0: > qemu_init_end: 73.711569 > linux_start_kernel: 113.414311 (+39.702742) > linux_start_user: 190.949939 (+77.535628) > > There is a ~39 ms difference before linux_start_kernel. SeaBIOS is > loading the PVH Option ROM. > > Stefano: any recommendations for profiling or tuning SeaBIOS? As I said on IRC, the SeaBIOS image in QEMU is version 1.12.1, which doesn't include this patch (available in upstream SeaBIOS) that saves ~10ms: commit 75b42835134553c96f113e5014072c0caf99d092 Author: Stefano Garzarella <sgarzare@redhat.com> Date: Sun Dec 2 14:10:13 2018 +0100 qemu: avoid debug prints if debugcon is not enabled In order to speed up the boot phase, we can check the QEMU debugcon device, and disable the writes if it is not recognized. This patch allow us to save around 10 msec (time measured between SeaBIOS entry point and "linuxboot" entry point) when CONFIG_DEBUG_LEVEL=1 and debugcon is not enabled. Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Signed-off-by: Kevin O'Connor <kevin@koconnor.net> As you said, we should update SeaBIOS for the next QEMU release. For profiling, I have some patches that I used to put trace points in the SeaBIOS code. I'll put them in this repository ASAP: https://github.com/stefano-garzarella/qemu-boot-time ^ permalink raw reply [flat|nested] 68+ messages in thread
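The per-stage gaps between the two configurations can be recomputed from the averages quoted in this message. The sketch below copies those numbers verbatim (milliseconds); the struct and function names are invented for illustration and are not part of any measurement tooling mentioned in the thread.

```c
/* Average timestamps in milliseconds, copied from the tables quoted above. */
struct boot_times {
    double qemu_init_end;
    double linux_start_kernel;
    double linux_start_user;
};

static const struct boot_times microvm   = { 64.043264,  65.481782, 114.938353 };
static const struct boot_times q35_tuned = { 73.711569, 113.414311, 190.949939 };

/* Time between the end of QEMU initialization and the kernel entry point. */
static double firmware_ms(const struct boot_times *t)
{
    return t->linux_start_kernel - t->qemu_init_end;
}

/* Time between the kernel entry point and the start of userspace. */
static double kernel_ms(const struct boot_times *t)
{
    return t->linux_start_user - t->linux_start_kernel;
}

/* How much later b reaches userspace than a. */
static double user_gap_ms(const struct boot_times *a, const struct boot_times *b)
{
    return b->linux_start_user - a->linux_start_user;
}
```

With these inputs the total gap is about 76.0 ms, split into roughly 9.7 ms of QEMU initialization, 38.3 ms between qemu_init_end and linux_start_kernel, and 28.1 ms of kernel start-to-user time.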
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-23 11:30 ` Stefano Garzarella @ 2019-07-24 15:23 ` Stefano Garzarella 0 siblings, 0 replies; 68+ messages in thread From: Stefano Garzarella @ 2019-07-24 15:23 UTC (permalink / raw) To: Stefan Hajnoczi, Sergio Lopez Cc: ehabkost, mst, Montes, Julio, maran.wilson, qemu-devel, kraxel, pbonzini, rth On Tue, Jul 23, 2019 at 1:30 PM Stefano Garzarella <sgarzare@redhat.com> wrote: > > On Tue, Jul 23, 2019 at 10:47:39AM +0100, Stefan Hajnoczi wrote: > > On Tue, Jul 23, 2019 at 9:43 AM Sergio Lopez <slp@redhat.com> wrote: > > > Montes, Julio <julio.montes@intel.com> writes: > > > > > > > On Fri, 2019-07-19 at 16:09 +0100, Stefan Hajnoczi wrote: > > > >> On Fri, Jul 19, 2019 at 2:48 PM Sergio Lopez <slp@redhat.com> wrote: > > > >> > Stefan Hajnoczi <stefanha@gmail.com> writes: > > > >> > > On Thu, Jul 18, 2019 at 05:21:46PM +0200, Sergio Lopez wrote: > > > >> > > > Stefan Hajnoczi <stefanha@gmail.com> writes: > > > >> > > > > > > >> > > > > On Tue, Jul 02, 2019 at 02:11:02PM +0200, Sergio Lopez wrote: > > > >> > > > -------------- > > > >> > > > | Conclusion | > > > >> > > > -------------- > > > >> > > > > > > >> > > > The average boot time of microvm is a third of Q35's (115ms vs. > > > >> > > > 363ms), > > > >> > > > and is smaller on all sections (QEMU initialization, firmware > > > >> > > > overhead > > > >> > > > and kernel start-to-user). > > > >> > > > > > > >> > > > Microvm's memory tree is also visibly simpler, significantly > > > >> > > > reducing > > > >> > > > the exposed surface to the guest. > > > >> > > > > > > >> > > > While we can certainly work on making Q35 smaller, I definitely > > > >> > > > think > > > >> > > > it's better (and way safer!) having a specialized machine type > > > >> > > > for a > > > >> > > > specific use case, than a minimal Q35 whose behavior > > > >> > > > significantly > > > >> > > > diverges from a conventional Q35. 
> > > >> > > > > > >> > > Interesting, so not a 10x difference! This might be amenable to > > > >> > > optimization. > > > >> > > > > > >> > > My concern with microvm is that it's so limited that few users > > > >> > > will be > > > >> > > able to benefit from the reduced attack surface and faster > > > >> > > startup time. > > > >> > > I think it's worth investigating slimming down Q35 further first. > > > >> > > > > > >> > > In terms of startup time the first step would be profiling Q35 > > > >> > > kernel > > > >> > > startup to find out what's taking so long (firmware > > > >> > > initialization, PCI > > > >> > > probing, etc)? > > > >> > > > > >> > Some findings: > > > >> > > > > >> > 1. Exposing the TSC_DEADLINE CPU flag (i.e. using "-cpu host") > > > >> > saves a > > > >> > whooping 120ms by avoiding the APIC timer calibration at > > > >> > arch/x86/kernel/apic/apic.c:calibrate_APIC_clock > > > >> > > > > >> > Average boot time with "-cpu host" > > > >> > qemu_init_end: 76.408950 > > > >> > linux_start_kernel: 116.166142 (+39.757192) > > > >> > linux_start_user: 242.954347 (+126.788205) > > > >> > > > > >> > Average boot time with default "cpu" > > > >> > qemu_init_end: 77.467852 > > > >> > linux_start_kernel: 116.688472 (+39.22062) > > > >> > linux_start_user: 363.033365 (+246.344893) > > > >> > > > >> \o/ > > > >> > > > >> > 2. The other 130ms are a direct result of PCI and ACPI presence > > > >> > (tested > > > >> > with a kernel without support for those elements). I'll publish > > > >> > some > > > >> > detailed numbers next week. 
> > > >> > > > >> Here are the Kata Containers kernel parameters: > > > >> > > > >> var kernelParams = []Param{ > > > >> {"tsc", "reliable"}, > > > >> {"no_timer_check", ""}, > > > >> {"rcupdate.rcu_expedited", "1"}, > > > >> {"i8042.direct", "1"}, > > > >> {"i8042.dumbkbd", "1"}, > > > >> {"i8042.nopnp", "1"}, > > > >> {"i8042.noaux", "1"}, > > > >> {"noreplace-smp", ""}, > > > >> {"reboot", "k"}, > > > >> {"console", "hvc0"}, > > > >> {"console", "hvc1"}, > > > >> {"iommu", "off"}, > > > >> {"cryptomgr.notests", ""}, > > > >> {"net.ifnames", "0"}, > > > >> {"pci", "lastbus=0"}, > > > >> } > > > >> > > > >> pci lastbus=0 looks interesting and so do some of the others :). > > > >> > > > > > > > > yeah, pci=lastbus=0 is very helpful to reduce the boot time in q35, > > > > kernel won't scan the 255.. buses :) > > > > > > I can confirm that adding pci=lastbus=0 makes a significant > > > improvement. In fact, is the only option from Kata's kernel parameter > > > list that has an impact, probably because the kernel is already quite > > > minimalistic. > > > > > > Average boot time with "-cpu host" and "pci=lastbus=0" > > > qemu_init_end: 73.711569 > > > linux_start_kernel: 113.414311 (+39.702742) > > > linux_start_user: 190.949939 (+77.535628) > > > > > > That's still ~40% slower than microvm, and the breach quickly widens > > > when adding more PCI devices (each one adds 10-15ms), but it's certainly > > > an improvement over the original numbers. > > > > > > On the other hand, there isn't much we can do here from QEMU's > > > perspective, as this is basically Guest OS tuning. > > > > fw_cfg could expose this information so guest kernels know when to > > stop enumerating the PCI bus. This would make all PCI guests with new > > kernels boot ~50 ms faster, regardless of machine type. > > > > The difference between microvm and tuned Q35 is 76 ms now. 
> > > > microvm: > > qemu_init_end: 64.043264 > > linux_start_kernel: 65.481782 (+1.438518) > > linux_start_user: 114.938353 (+49.456571) > > > > Q35 with -cpu host and pci=lasbus=0: > > qemu_init_end: 73.711569 > > linux_start_kernel: 113.414311 (+39.702742) > > linux_start_user: 190.949939 (+77.535628) > > > > There is a ~39 ms difference before linux_start_kernel. SeaBIOS is > > loading the PVH Option ROM. > > > > Stefano: any recommendations for profiling or tuning SeaBIOS? > > As I said on IRC, the SeaBIOS image in QEMU is the 1.12.1 and it doesn't > include this patch (available in the upstream SeaBIOS) that saves ~10ms: > > commit 75b42835134553c96f113e5014072c0caf99d092 > Author: Stefano Garzarella <sgarzare@redhat.com> > Date: Sun Dec 2 14:10:13 2018 +0100 > > qemu: avoid debug prints if debugcon is not enabled > > In order to speed up the boot phase, we can check the QEMU > debugcon device, and disable the writes if it is not recognized. > > This patch allow us to save around 10 msec (time measured > between SeaBIOS entry point and "linuxboot" entry point) > when CONFIG_DEBUG_LEVEL=1 and debugcon is not enabled. > > Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> > Signed-off-by: Kevin O'Connor <kevin@koconnor.net> > > As you said, we should update SeaBIOS for the next QEMU release. > > For profiling, I have some patches that I used to put trace points in > the SeaBIOS code. I'll put them in this repository ASAP: > https://github.com/stefano-garzarella/qemu-boot-time I pushed QEMU (optionrom) and SeaBIOS patches in: https://github.com/stefano-garzarella/qemu-boot-time They can be useful for profiling. Cheers, Stefano ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-07-02 12:11 [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type Sergio Lopez ` (7 preceding siblings ...) 2019-07-03 9:58 ` Stefan Hajnoczi @ 2019-08-29 9:02 ` Jing Liu 2019-08-29 15:46 ` Sergio Lopez 8 siblings, 1 reply; 68+ messages in thread From: Jing Liu @ 2019-08-29 9:02 UTC (permalink / raw) To: Sergio Lopez, mst, marcel.apfelbaum, pbonzini, rth, ehabkost, maran.wilson, sgarzare, kraxel Cc: qemu-devel Hi Sergio, The idea is interesting and I tried to launch a guest by your guide but seems failed to me. I tried both legacy and normal modes, but the vncviewer connected and told me that: The vm has no graphic display device. All the screen in vnc is just black. kernel config: CONFIG_KVM_MMIO=y CONFIG_VIRTIO_MMIO=y I don't know if any specified kernel version/patch/config is needed or anything I missed. Could you kindly give some tips? Thanks very much. Jing > A QEMU instance with the microvm machine type can be invoked this way: > > - Normal mode: > > qemu-system-x86_64 -M microvm -m 512m -smp 2 \ > -kernel vmlinux -append "console=hvc0 root=/dev/vda" \ > -nodefaults -no-user-config \ > -chardev pty,id=virtiocon0,server \ > -device virtio-serial-device \ > -device virtconsole,chardev=virtiocon0 \ > -drive id=test,file=test.img,format=raw,if=none \ > -device virtio-blk-device,drive=test \ > -netdev tap,id=tap0,script=no,downscript=no \ > -device virtio-net-device,netdev=tap0 > > - Legacy mode: > > qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \ > -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \ > -nodefaults -no-user-config \ > -drive id=test,file=test.img,format=raw,if=none \ > -device virtio-blk-device,drive=test \ > -netdev tap,id=tap0,script=no,downscript=no \ > -device virtio-net-device,netdev=tap0 \ > -serial stdio > ^ permalink raw reply [flat|nested] 68+ messages in thread
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-08-29 9:02 ` Jing Liu @ 2019-08-29 15:46 ` Sergio Lopez 2019-08-30 4:53 ` Jing Liu 0 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-08-29 15:46 UTC (permalink / raw) To: Jing Liu Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth [-- Attachment #1: Type: text/plain, Size: 3127 bytes --] Jing Liu <jing2.liu@linux.intel.com> writes: > Hi Sergio, > > The idea is interesting and I tried to launch a guest by your > guide but seems failed to me. I tried both legacy and normal modes, > but the vncviewer connected and told me that: > The vm has no graphic display device. > All the screen in vnc is just black. The microvm machine type doesn't support any graphics device, so you need to rely on the serial console. > kernel config: > CONFIG_KVM_MMIO=y > CONFIG_VIRTIO_MMIO=y > > I don't know if any specified kernel version/patch/config > is needed or anything I missed. > Could you kindly give some tips? I'm testing it with upstream vanilla Linux. In addition to MMIO, you need to add support for PVH (the next version of this patchset, v4, will support booting from FW, so it'll be possible to use non-PVH ELF kernels and bzImages too). 
I've just uploaded a working kernel config here:

https://gist.github.com/slp/1060ba3aaf708584572ad4109f28c8f9

As for the QEMU command line, something like this should do the trick:

./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm \
    -M microvm,legacy -kernel vmlinux \
    -append "earlyprintk=ttyS0 console=ttyS0 reboot=k panic=1" \
    -nodefaults -no-user-config -nographic -serial stdio

If this works, you can move to non-legacy mode with a virtio-console:

./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm \
    -M microvm -kernel vmlinux \
    -append "console=hvc0 reboot=k panic=1" \
    -nodefaults -no-user-config -nographic -serial pty \
    -chardev stdio,id=virtiocon0,server \
    -device virtio-serial-device \
    -device virtconsole,chardev=virtiocon0

If it is still working, you can try adding some devices too:

./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm \
    -M microvm -kernel vmlinux \
    -append "console=hvc0 reboot=k panic=1 root=/dev/vda" \
    -nodefaults -no-user-config -nographic -serial pty \
    -chardev stdio,id=virtiocon0,server \
    -device virtio-serial-device \
    -device virtconsole,chardev=virtiocon0 \
    -netdev user,id=testnet \
    -device virtio-net-device,netdev=testnet \
    -drive id=test,file=alpine-rootfs-x86_64.raw,format=raw,if=none \
    -device virtio-blk-device,drive=test

Sergio.

> Thanks very much.
> Jing > > > >> A QEMU instance with the microvm machine type can be invoked this way: >> >> - Normal mode: >> >> qemu-system-x86_64 -M microvm -m 512m -smp 2 \ >> -kernel vmlinux -append "console=hvc0 root=/dev/vda" \ >> -nodefaults -no-user-config \ >> -chardev pty,id=virtiocon0,server \ >> -device virtio-serial-device \ >> -device virtconsole,chardev=virtiocon0 \ >> -drive id=test,file=test.img,format=raw,if=none \ >> -device virtio-blk-device,drive=test \ >> -netdev tap,id=tap0,script=no,downscript=no \ >> -device virtio-net-device,netdev=tap0 >> >> - Legacy mode: >> >> qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \ >> -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \ >> -nodefaults -no-user-config \ >> -drive id=test,file=test.img,format=raw,if=none \ >> -device virtio-blk-device,drive=test \ >> -netdev tap,id=tap0,script=no,downscript=no \ >> -device virtio-net-device,netdev=tap0 \ >> -serial stdio >> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
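A quick way to check a kernel .config for the options discussed in this
exchange is sketched below. check_config is a hypothetical helper, not
part of QEMU or the kernel tree, and the option names are supplied by the
caller. Mapping "support for PVH" to CONFIG_PVH=y is an assumption here;
the config in the gist linked above remains the reference.

```shell
#!/bin/sh
# Sketch only: check_config is a hypothetical helper.
# Usage: check_config <path-to-.config> <CONFIG_OPTION>...
# Prints "missing: <option>" for every option that is not set to "y"
# and returns non-zero if at least one option is missing.
check_config() {
    config=$1
    shift
    missing=0
    for opt in "$@"; do
        if ! grep -q "^${opt}=y\$" "$config"; then
            echo "missing: $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Example (assumes a .config in the current directory; CONFIG_PVH is an
# assumed name for the PVH support mentioned in the thread):
#   check_config .config CONFIG_VIRTIO_MMIO CONFIG_PVH
```

Options built as modules (=m) are reported as missing on purpose, since
there is no initrd in these command lines to load them from.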
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-08-29 15:46 ` Sergio Lopez @ 2019-08-30 4:53 ` Jing Liu 2019-08-30 14:27 ` Sergio Lopez 0 siblings, 1 reply; 68+ messages in thread From: Jing Liu @ 2019-08-30 4:53 UTC (permalink / raw) To: Sergio Lopez Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth Hi Sergio, On 8/29/2019 11:46 PM, Sergio Lopez wrote: > > Jing Liu <jing2.liu@linux.intel.com> writes: > >> Hi Sergio, >> >> The idea is interesting and I tried to launch a guest by your >> guide but seems failed to me. I tried both legacy and normal modes, >> but the vncviewer connected and told me that: >> The vm has no graphic display device. >> All the screen in vnc is just black. > > The microvm machine type doesn't support any graphics device, so you > need to rely on the serial console. Got it. > >> kernel config: >> CONFIG_KVM_MMIO=y >> CONFIG_VIRTIO_MMIO=y >> >> I don't know if any specified kernel version/patch/config >> is needed or anything I missed. >> Could you kindly give some tips? > > I'm testing it with upstream vanilla Linux. In addition to MMIO, you > need to add support for PVH (the next version of this patchset, v4, will > support booting from FW, so it'll be possible to use non-PVH ELF kernels > and bzImages too). > > I've just uploaded a working kernel config here: > > https://gist.github.com/slp/1060ba3aaf708584572ad4109f28c8f9 > Thanks very much and this config is helpful to me. 
> As for the QEMU command line, something like this should do the trick: > > ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm,legacy -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial stdio > > If this works, you can move to non-legacy mode with a virtio-console: > > ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 > I tried the above two ways and it works now. Thanks! > If is still working, you can try adding some devices too: > > ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1 root=/dev/vda" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 -netdev user,id=testnet -device virtio-net-device,netdev=testnet -drive id=test,file=alpine-rootfs-x86_64.raw,format=raw,if=none -device virtio-blk-device,drive=test > But I'm wondering why the image I used can not be found. root=/dev/vda3 and the same image worked well on normal qemu/guest- config bootup, but didn't work here. The details are, -append "console=hvc0 reboot=k panic=1 root=/dev/vda3 rw rootfstype=ext4" \ [ 0.022784] Key type encrypted registered [ 0.022988] VFS: Cannot open root device "vda3" or unknown-block(254,3): error -6 [ 0.023041] Please append a correct "root=" boot option; here are the available partitions: [ 0.023089] fe00 8946688 vda [ 0.023090] driver: virtio_blk [ 0.023143] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,3) [ 0.023201] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc3 #23 BTW, root=/dev/vda is also tried and didn't work. 
The dmesg is a little different:

[ 0.028050] Key type encrypted registered
[ 0.028484] List of all partitions:
[ 0.028529] fe00 8946688 vda
[ 0.028529] driver: virtio_blk
[ 0.028615] No filesystem could mount root, tried:
[ 0.028616] ext4
[ 0.028670]
[ 0.028712] Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(254,0)

I tried another ext4 image, but it still doesn't work.
Is there any limitation on the block image? Could I get a copy of your
image for a simple test?

Thanks in advance,
Jing

> Sergio.
>
>> Thanks very much.
>> Jing
>>
>>
>>
>>> A QEMU instance with the microvm machine type can be invoked this way:
>>>
>>> - Normal mode:
>>>
>>> qemu-system-x86_64 -M microvm -m 512m -smp 2 \
>>> -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
>>> -nodefaults -no-user-config \
>>> -chardev pty,id=virtiocon0,server \
>>> -device virtio-serial-device \
>>> -device virtconsole,chardev=virtiocon0 \
>>> -drive id=test,file=test.img,format=raw,if=none \
>>> -device virtio-blk-device,drive=test \
>>> -netdev tap,id=tap0,script=no,downscript=no \
>>> -device virtio-net-device,netdev=tap0
>>>
>>> - Legacy mode:
>>>
>>> qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \
>>> -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \
>>> -nodefaults -no-user-config \
>>> -drive id=test,file=test.img,format=raw,if=none \
>>> -device virtio-blk-device,drive=test \
>>> -netdev tap,id=tap0,script=no,downscript=no \
>>> -device virtio-net-device,netdev=tap0 \
>>> -serial stdio
>>>
>
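When reading these panics it helps to map the device number in the
partition list to the (major,minor) pair in the error. The sketch below
assumes the four-hex-digit form seen in the log (two digits of major, two
of minor); decode_devt is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch only: decode_devt is a hypothetical helper. It assumes a
# four-hex-digit device number as printed in the partition list above:
# the first two digits are the major, the last two the minor.
decode_devt() {
    major_hex=${1%??}   # drop the last two characters
    minor_hex=${1#??}   # drop the first two characters
    printf '%d,%d\n' "$((0x$major_hex))" "$((0x$minor_hex))"
}

decode_devt fe00
```

So fe00 is the whole disk vda, i.e. unknown-block(254,0), while vda3 would
be (254,3). The partition list shows only vda and no vdaN entries, and
error -6 is ENXIO ("No such device or address"), which fits a kernel that
sees the disk but none of its partitions.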
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-08-30 4:53 ` Jing Liu @ 2019-08-30 14:27 ` Sergio Lopez 2019-09-02 5:43 ` Jing Liu 0 siblings, 1 reply; 68+ messages in thread From: Sergio Lopez @ 2019-08-30 14:27 UTC (permalink / raw) To: Jing Liu Cc: ehabkost, maran.wilson, mst, qemu-devel, kraxel, pbonzini, sgarzare, rth [-- Attachment #1: Type: text/plain, Size: 5688 bytes --] Jing Liu <jing2.liu@linux.intel.com> writes: > Hi Sergio, > > On 8/29/2019 11:46 PM, Sergio Lopez wrote: >> >> Jing Liu <jing2.liu@linux.intel.com> writes: >> >>> Hi Sergio, >>> >>> The idea is interesting and I tried to launch a guest by your >>> guide but seems failed to me. I tried both legacy and normal modes, >>> but the vncviewer connected and told me that: >>> The vm has no graphic display device. >>> All the screen in vnc is just black. >> >> The microvm machine type doesn't support any graphics device, so you >> need to rely on the serial console. > Got it. > >> >>> kernel config: >>> CONFIG_KVM_MMIO=y >>> CONFIG_VIRTIO_MMIO=y >>> >>> I don't know if any specified kernel version/patch/config >>> is needed or anything I missed. >>> Could you kindly give some tips? >> >> I'm testing it with upstream vanilla Linux. In addition to MMIO, you >> need to add support for PVH (the next version of this patchset, v4, will >> support booting from FW, so it'll be possible to use non-PVH ELF kernels >> and bzImages too). >> >> I've just uploaded a working kernel config here: >> >> https://gist.github.com/slp/1060ba3aaf708584572ad4109f28c8f9 >> > Thanks very much and this config is helpful to me. 
> >> As for the QEMU command line, something like this should do the trick: >> >> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm,legacy -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial stdio >> >> If this works, you can move to non-legacy mode with a virtio-console: >> >> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 >> > I tried the above two ways and it works now. Thanks! > >> If is still working, you can try adding some devices too: >> >> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1 root=/dev/vda" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 -netdev user,id=testnet -device virtio-net-device,netdev=testnet -drive id=test,file=alpine-rootfs-x86_64.raw,format=raw,if=none -device virtio-blk-device,drive=test >> > But I'm wondering why the image I used can not be found. > root=/dev/vda3 and the same image worked well on normal qemu/guest- > config bootup, but didn't work here. 
> The details are,
>
> -append "console=hvc0 reboot=k panic=1 root=/dev/vda3 rw rootfstype=ext4" \
>
> [ 0.022784] Key type encrypted registered
> [ 0.022988] VFS: Cannot open root device "vda3" or
> unknown-block(254,3): error -6
> [ 0.023041] Please append a correct "root=" boot option; here are
> the available partitions:
> [ 0.023089] fe00 8946688 vda
> [ 0.023090] driver: virtio_blk
> [ 0.023143] Kernel panic - not syncing: VFS: Unable to mount root
> fs on unknown-block(254,3)
> [ 0.023201] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc3 #23
>
>
> BTW, root=/dev/vda is also tried and didn't work. The dmesg is a
> little different:
>
> [ 0.028050] Key type encrypted registered
> [ 0.028484] List of all partitions:
> [ 0.028529] fe00 8946688 vda
> [ 0.028529] driver: virtio_blk
> [ 0.028615] No filesystem could mount root, tried:
> [ 0.028616] ext4
> [ 0.028670]
> [ 0.028712] Kernel panic - not syncing: VFS: Unable to mount root
> fs on unknown-block(254,0)
>
> I tried another ext4 img but still doesn't work.
> Is there any limitation of blk image? Could I copy your image for simple
> test?

The kernel config I posted lacks support for DOS partitions. Adding
CONFIG_MSDOS_PARTITION=y should allow you to boot from /dev/vda3.

Anyway, in case you also want to try booting from /dev/vda (without
partitions), this is the recipe I use to quickly create a minimal rootfs
image:

# wget http://dl-cdn.alpinelinux.org/alpine/v3.10/releases/x86_64/alpine-minirootfs-3.10.2-x86_64.tar.gz
# qemu-img create -f raw alpine-rootfs-x86_64.raw 1G
# sudo losetup /dev/loop0 alpine-rootfs-x86_64.raw
# sudo mkfs.ext4 /dev/loop0
# sudo mount /dev/loop0 /mnt
# sudo tar xpf alpine-minirootfs-3.10.2-x86_64.tar.gz -C /mnt
# sudo umount /mnt
# sudo losetup -d /dev/loop0

The rootfs will be missing openrc, so you'll need to add "init=/bin/sh"
to the command line.

Sergio.

> Thanks in advance,
> Jing
>
>> Sergio.
>>
>>> Thanks very much.
>>> Jing >>> >>> >>> >>>> A QEMU instance with the microvm machine type can be invoked this way: >>>> >>>> - Normal mode: >>>> >>>> qemu-system-x86_64 -M microvm -m 512m -smp 2 \ >>>> -kernel vmlinux -append "console=hvc0 root=/dev/vda" \ >>>> -nodefaults -no-user-config \ >>>> -chardev pty,id=virtiocon0,server \ >>>> -device virtio-serial-device \ >>>> -device virtconsole,chardev=virtiocon0 \ >>>> -drive id=test,file=test.img,format=raw,if=none \ >>>> -device virtio-blk-device,drive=test \ >>>> -netdev tap,id=tap0,script=no,downscript=no \ >>>> -device virtio-net-device,netdev=tap0 >>>> >>>> - Legacy mode: >>>> >>>> qemu-system-x86_64 -M microvm,legacy -m 512m -smp 2 \ >>>> -kernel vmlinux -append "console=ttyS0 root=/dev/vda" \ >>>> -nodefaults -no-user-config \ >>>> -drive id=test,file=test.img,format=raw,if=none \ >>>> -device virtio-blk-device,drive=test \ >>>> -netdev tap,id=tap0,script=no,downscript=no \ >>>> -device virtio-net-device,netdev=tap0 \ >>>> -serial stdio >>>> >> [-- Attachment #2: signature.asc --] [-- Type: application/pgp-signature, Size: 832 bytes --] ^ permalink raw reply [flat|nested] 68+ messages in thread
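The rootfs recipe in this message hard-codes the file names, the loop
device and the mount point. A parametrized variant might look like the
sketch below. make_rootfs_plan is a hypothetical helper name; it only
prints the commands so they can be reviewed before anything runs as root,
and the defaults (/dev/loop0, /mnt, a 1G image) are the ones from the
recipe. It assumes the tarball has already been downloaded.

```shell
#!/bin/sh
# Sketch only: make_rootfs_plan is a hypothetical helper that prints the
# recipe's commands instead of running them.
# Usage: make_rootfs_plan <rootfs-tarball> <image-file> [loop-device] [mount-point]
make_rootfs_plan() {
    tarball=$1
    image=$2
    loopdev=${3:-/dev/loop0}   # default taken from the recipe
    mnt=${4:-/mnt}             # default taken from the recipe
    cat <<EOF
qemu-img create -f raw $image 1G
sudo losetup $loopdev $image
sudo mkfs.ext4 $loopdev
sudo mount $loopdev $mnt
sudo tar xpf $tarball -C $mnt
sudo umount $mnt
sudo losetup -d $loopdev
EOF
}

make_rootfs_plan alpine-minirootfs-3.10.2-x86_64.tar.gz alpine-rootfs-x86_64.raw
```

Once the printed plan looks right it can be piped to "sh -e" so that the
sequence stops at the first failing step. Paths containing spaces are not
handled by this sketch.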
* Re: [Qemu-devel] [PATCH v3 0/4] Introduce the microvm machine type 2019-08-30 14:27 ` Sergio Lopez @ 2019-09-02 5:43 ` Jing Liu 0 siblings, 0 replies; 68+ messages in thread From: Jing Liu @ 2019-09-02 5:43 UTC (permalink / raw) To: Sergio Lopez Cc: ehabkost, mst, maran.wilson, qemu-devel, kraxel, pbonzini, rth, sgarzare On 8/30/2019 10:27 PM, Sergio Lopez wrote: > > Jing Liu <jing2.liu@linux.intel.com> writes: > >> Hi Sergio, >> >> On 8/29/2019 11:46 PM, Sergio Lopez wrote: >>> >>> Jing Liu <jing2.liu@linux.intel.com> writes: >>> >>>> Hi Sergio, >>>> >>>> The idea is interesting and I tried to launch a guest by your >>>> guide but seems failed to me. I tried both legacy and normal modes, >>>> but the vncviewer connected and told me that: >>>> The vm has no graphic display device. >>>> All the screen in vnc is just black. >>> >>> The microvm machine type doesn't support any graphics device, so you >>> need to rely on the serial console. >> Got it. >> >>> >>>> kernel config: >>>> CONFIG_KVM_MMIO=y >>>> CONFIG_VIRTIO_MMIO=y >>>> >>>> I don't know if any specified kernel version/patch/config >>>> is needed or anything I missed. >>>> Could you kindly give some tips? >>> >>> I'm testing it with upstream vanilla Linux. In addition to MMIO, you >>> need to add support for PVH (the next version of this patchset, v4, will >>> support booting from FW, so it'll be possible to use non-PVH ELF kernels >>> and bzImages too). >>> >>> I've just uploaded a working kernel config here: >>> >>> https://gist.github.com/slp/1060ba3aaf708584572ad4109f28c8f9 >>> >> Thanks very much and this config is helpful to me. 
>> >>> As for the QEMU command line, something like this should do the trick: >>> >>> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm,legacy -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial stdio >>> >>> If this works, you can move to non-legacy mode with a virtio-console: >>> >>> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 >>> >> I tried the above two ways and it works now. Thanks! >> >>> If is still working, you can try adding some devices too: >>> >>> ./x86_64-softmmu/qemu-system-x86_64 -smp 1 -m 1g -enable-kvm -M microvm -kernel vmlinux -append "console=hvc0 reboot=k panic=1 root=/dev/vda" -nodefaults -no-user-config -nographic -serial pty -chardev stdio,id=virtiocon0,server -device virtio-serial-device -device virtconsole,chardev=virtiocon0 -netdev user,id=testnet -device virtio-net-device,netdev=testnet -drive id=test,file=alpine-rootfs-x86_64.raw,format=raw,if=none -device virtio-blk-device,drive=test >>> >> But I'm wondering why the image I used can not be found. >> root=/dev/vda3 and the same image worked well on normal qemu/guest- >> config bootup, but didn't work here. 
The details are, >> >> -append "console=hvc0 reboot=k panic=1 root=/dev/vda3 rw rootfstype=ext4" \ >> >> [ 0.022784] Key type encrypted registered >> [ 0.022988] VFS: Cannot open root device "vda3" or >> unknown-block(254,3): error -6 >> [ 0.023041] Please append a correct "root=" boot option; here are >> the available partitions: >> [ 0.023089] fe00 8946688 vda >> [ 0.023090] driver: virtio_blk >> [ 0.023143] Kernel panic - not syncing: VFS: Unable to mount root >> fs on unknown-block(254,3) >> [ 0.023201] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.3.0-rc3 #23 >> >> >> BTW, root=/dev/vda is also tried and didn't work. The dmesg is a >> little different: >> >> [ 0.028050] Key type encrypted registered >> [ 0.028484] List of all partitions: >> [ 0.028529] fe00 8946688 vda >> [ 0.028529] driver: virtio_blk >> [ 0.028615] No filesystem could mount root, tried: >> [ 0.028616] ext4 >> [ 0.028670] >> [ 0.028712] Kernel panic - not syncing: VFS: Unable to mount root >> fs on unknown-block(254,0) >> >> I tried another ext4 img but still doesn't work. >> Is there any limitation of blk image? Could I copy your image for simple >> test? > > The kernel config I posted lacks support for DOS partitions. Adding > CONFIG_MSDOS_PARTITION=y should allow you to boot from /dev/vda3. > > Anyway, in case you also want to try booting from /dev/vda (without > partitions), this is the recipe I use to quickly create a minimal rootfs > image: > > # wget http://dl-cdn.alpinelinux.org/alpine/v3.10/releases/x86_64/alpine-minirootfs-3.10.2-x86_64.tar.gz > # qemu-img create -f raw alpine-rootfs-x86_64.raw 1G > # sudo losetup /dev/loop0 alpine-rootfs-x86_64.raw > # sudo mkfs.ext4 /dev/loop0 > # sudo mount /dev/loop0 /mnt > # sudo tar xpf alpine-minirootfs-3.10.2-x86_64.tar.gz -C /mnt > # sudo umount /mnt > # sudo losetup -d /dev/loop0 > > The rootfs will be missing openrc, so you'll need to add "init=/bin/sh" > to the command line. > Thank you Sergio. I'll try that. Jing > Sergio. 
> >> Thanks in advance, >> Jing >> >>> Sergio. ^ permalink raw reply [flat|nested] 68+ messages in thread