* [PATCH v6 00/10] Introduce the microvm machine type
@ 2019-10-04  9:37 Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
                   ` (13 more replies)
  0 siblings, 14 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Microvm is a machine type inspired by Firecracker and constructed
after its machine model.

It's a minimalist machine type with neither PCI nor ACPI support, designed
for short-lived guests. Microvm also establishes a baseline for
benchmarking and optimizing both QEMU and guest operating systems,
since it is optimized for both boot time and footprint.

---

Changelog
v6:
 - Some style fixes (Philippe Mathieu-Daudé)
 - Fix a documentation bug stating that LAPIC was in userspace (Paolo
   Bonzini)
 - Update Xen HVM code after X86MachineState introduction (Philippe
   Mathieu-Daudé)
 - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
   (Philippe Mathieu-Daudé)

v5:
 - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
   functions" (Philippe Mathieu-Daudé)
 - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
   functions" (Stefano Garzarella)
 - Split X86MachineState introduction into smaller patches (Philippe
   Mathieu-Daudé)
 - Change option-roms to x-option-roms and kernel-cmdline to
   auto-kernel-cmdline (Paolo Bonzini)
 - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
 - Some fixes to the documentation (Paolo Bonzini)
 - Switch documentation format from txt to rst (Peter Maydell)
 - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
   Bonzini)

v4:
 - This is a complete rewrite of the whole patchset, with a focus on
   reusing as much existing code as possible to ease the maintenance burden
   and making the machine type as compatible as possible by default. As
   a result, the number of lines dedicated specifically to microvm is
   383 (code lines measured by "cloc") and, with the default
   configuration, it's now able to boot both PVH ELF images and
   bzImages with either SeaBIOS or qboot.

v3:
  - Add initrd support (thanks Stefano).

v2:
  - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
  - Simplify machine definition (thanks Eduardo).
  - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
  - Add a patch to factorize PVH-related functions.
  - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).

---
Sergio Lopez (10):
  hw/virtio: Factorize virtio-mmio headers
  hw/i386/pc: rename functions shared with non-PC machines
  hw/i386/pc: move shared x86 functions to x86.c and export them
  hw/i386: split PCMachineState deriving X86MachineState from it
  hw/i386: make x86.c independent from PCMachineState
  fw_cfg: add "modify" functions for all types
  hw/intc/apic: reject pic ints if isa_pic == NULL
  roms: add microvm-bios (qboot) as binary and git submodule
  docs/microvm.rst: document the new microvm machine type
  hw/i386: Introduce the microvm machine type

 docs/microvm.rst                 |  98 ++++
 default-configs/i386-softmmu.mak |   1 +
 include/hw/i386/microvm.h        |  83 ++++
 include/hw/i386/pc.h             |  28 +-
 include/hw/i386/x86.h            |  94 ++++
 include/hw/nvram/fw_cfg.h        |  42 ++
 include/hw/virtio/virtio-mmio.h  |  73 +++
 hw/acpi/cpu_hotplug.c            |  10 +-
 hw/i386/acpi-build.c             |  29 +-
 hw/i386/amd_iommu.c              |   3 +-
 hw/i386/intel_iommu.c            |   3 +-
 hw/i386/microvm.c                | 574 ++++++++++++++++++++++
 hw/i386/pc.c                     | 780 +++---------------------------
 hw/i386/pc_piix.c                |  46 +-
 hw/i386/pc_q35.c                 |  38 +-
 hw/i386/pc_sysfw.c               |  58 +--
 hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
 hw/i386/xen/xen-hvm.c            |  23 +-
 hw/intc/apic.c                   |   2 +-
 hw/intc/ioapic.c                 |   2 +-
 hw/nvram/fw_cfg.c                |  29 ++
 hw/virtio/virtio-mmio.c          |  48 +-
 .gitmodules                      |   3 +
 hw/i386/Kconfig                  |   4 +
 hw/i386/Makefile.objs            |   2 +
 pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
 roms/Makefile                    |   6 +
 roms/qboot                       |   1 +
 28 files changed, 1963 insertions(+), 907 deletions(-)
 create mode 100644 docs/microvm.rst
 create mode 100644 include/hw/i386/microvm.h
 create mode 100644 include/hw/i386/x86.h
 create mode 100644 include/hw/virtio/virtio-mmio.h
 create mode 100644 hw/i386/microvm.c
 create mode 100644 hw/i386/x86.c
 create mode 100755 pc-bios/bios-microvm.bin
 create mode 160000 roms/qboot

-- 
2.21.0




* [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:43   ` Philippe Mathieu-Daudé
  2019-10-04  9:37 ` [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines Sergio Lopez
                   ` (12 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Move the QOM macros and the main struct definitions to a separate header
file, so they can be accessed from other components.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
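Note for readers of the new header: VIRT_MAGIC and VIRT_VENDOR are ASCII
tags packed into 32-bit register values, and they spell 'virt' and 'QEMU'
only when the bytes are read least significant first. What follows is a
minimal standalone sketch of that decoding, not code from this patch;
decode_le_tag() is a hypothetical helper introduced only for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from include/hw/virtio/virtio-mmio.h in this patch. */
#define VIRT_MAGIC  0x74726976u /* 'virt' */
#define VIRT_VENDOR 0x554D4551u /* 'QEMU' */

/*
 * Hypothetical helper (not part of QEMU): unpack a 32-bit value into its
 * four bytes, least significant byte first, as a NUL-terminated string.
 */
static void decode_le_tag(uint32_t value, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((value >> (8 * i)) & 0xff);
    }
    out[4] = '\0';
}
```

With a 5-byte buffer, decode_le_tag(VIRT_MAGIC, buf) leaves "virt" in buf
and decode_le_tag(VIRT_VENDOR, buf) leaves "QEMU".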
 include/hw/virtio/virtio-mmio.h | 73 +++++++++++++++++++++++++++++++++
 hw/virtio/virtio-mmio.c         | 48 +---------------------
 2 files changed, 74 insertions(+), 47 deletions(-)
 create mode 100644 include/hw/virtio/virtio-mmio.h

diff --git a/include/hw/virtio/virtio-mmio.h b/include/hw/virtio/virtio-mmio.h
new file mode 100644
index 0000000000..7dbfd03dcf
--- /dev/null
+++ b/include/hw/virtio/virtio-mmio.h
@@ -0,0 +1,73 @@
+/*
+ * Virtio MMIO bindings
+ *
+ * Copyright (c) 2011 Linaro Limited
+ *
+ * Author:
+ *  Peter Maydell <peter.maydell@linaro.org>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License; either version 2
+ * of the License, or (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License along
+ * with this program; if not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_VIRTIO_MMIO_H
+#define HW_VIRTIO_MMIO_H
+
+#include "hw/virtio/virtio-bus.h"
+
+/* QOM macros */
+/* virtio-mmio-bus */
+#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
+#define VIRTIO_MMIO_BUS(obj) \
+        OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
+#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
+        OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
+#define VIRTIO_MMIO_BUS_CLASS(klass) \
+        OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
+
+/* virtio-mmio */
+#define TYPE_VIRTIO_MMIO "virtio-mmio"
+#define VIRTIO_MMIO(obj) \
+        OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
+
+#define VIRT_MAGIC 0x74726976 /* 'virt' */
+#define VIRT_VERSION 2
+#define VIRT_VERSION_LEGACY 1
+#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
+
+typedef struct VirtIOMMIOQueue {
+    uint16_t num;
+    bool enabled;
+    uint32_t desc[2];
+    uint32_t avail[2];
+    uint32_t used[2];
+} VirtIOMMIOQueue;
+
+typedef struct {
+    /* Generic */
+    SysBusDevice parent_obj;
+    MemoryRegion iomem;
+    qemu_irq irq;
+    bool legacy;
+    /* Guest accessible state needing migration and reset */
+    uint32_t host_features_sel;
+    uint32_t guest_features_sel;
+    uint32_t guest_page_shift;
+    /* virtio-bus */
+    VirtioBusState bus;
+    bool format_transport_address;
+    /* Fields only used for non-legacy (v2) devices */
+    uint32_t guest_features[2];
+    VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
+} VirtIOMMIOProxy;
+
+#endif
diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
index 3d5ca0f667..94d934c44b 100644
--- a/hw/virtio/virtio-mmio.c
+++ b/hw/virtio/virtio-mmio.c
@@ -29,57 +29,11 @@
 #include "qemu/host-utils.h"
 #include "qemu/module.h"
 #include "sysemu/kvm.h"
-#include "hw/virtio/virtio-bus.h"
+#include "hw/virtio/virtio-mmio.h"
 #include "qemu/error-report.h"
 #include "qemu/log.h"
 #include "trace.h"
 
-/* QOM macros */
-/* virtio-mmio-bus */
-#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
-#define VIRTIO_MMIO_BUS(obj) \
-        OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
-#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
-        OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
-#define VIRTIO_MMIO_BUS_CLASS(klass) \
-        OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
-
-/* virtio-mmio */
-#define TYPE_VIRTIO_MMIO "virtio-mmio"
-#define VIRTIO_MMIO(obj) \
-        OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
-
-#define VIRT_MAGIC 0x74726976 /* 'virt' */
-#define VIRT_VERSION 2
-#define VIRT_VERSION_LEGACY 1
-#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
-
-typedef struct VirtIOMMIOQueue {
-    uint16_t num;
-    bool enabled;
-    uint32_t desc[2];
-    uint32_t avail[2];
-    uint32_t used[2];
-} VirtIOMMIOQueue;
-
-typedef struct {
-    /* Generic */
-    SysBusDevice parent_obj;
-    MemoryRegion iomem;
-    qemu_irq irq;
-    bool legacy;
-    /* Guest accessible state needing migration and reset */
-    uint32_t host_features_sel;
-    uint32_t guest_features_sel;
-    uint32_t guest_page_shift;
-    /* virtio-bus */
-    VirtioBusState bus;
-    bool format_transport_address;
-    /* Fields only used for non-legacy (v2) devices */
-    uint32_t guest_features[2];
-    VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
-} VirtIOMMIOProxy;
-
 static bool virtio_mmio_ioeventfd_enabled(DeviceState *d)
 {
     return kvm_eventfds_enabled();
-- 
2.21.0




* [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:46   ` Philippe Mathieu-Daudé
  2019-10-04 11:24   ` Stefano Garzarella
  2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
                   ` (11 subsequent siblings)
  13 siblings, 2 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

The following functions are not specific to the PC machine but generic
to the x86 architecture, even though most of them are named *pc*.
Rename them:

  load_linux                 -> x86_load_linux
  pc_new_cpu                 -> x86_cpu_new
  pc_cpus_init               -> x86_cpus_init
  pc_cpu_index_to_props      -> x86_cpu_index_to_props
  pc_get_default_cpu_node_id -> x86_get_default_cpu_node_id
  pc_possible_cpu_arch_ids   -> x86_possible_cpu_arch_ids
  old_pc_system_rom_init     -> x86_bios_rom_init

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 include/hw/i386/pc.h |  2 +-
 hw/i386/pc.c         | 28 ++++++++++++++--------------
 hw/i386/pc_piix.c    |  2 +-
 hw/i386/pc_q35.c     |  2 +-
 hw/i386/pc_sysfw.c   |  6 +++---
 5 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 6df4f4b6fb..d12f42e9e5 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -195,7 +195,7 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
 void pc_register_ferr_irq(qemu_irq irq);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
-void pc_cpus_init(PCMachineState *pcms);
+void x86_cpus_init(PCMachineState *pcms);
 void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
 void pc_smp_parse(MachineState *ms, QemuOpts *opts);
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index bcda50efcc..fd08c6704b 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -1019,8 +1019,8 @@ static bool load_elfboot(const char *kernel_filename,
     return true;
 }
 
-static void load_linux(PCMachineState *pcms,
-                       FWCfgState *fw_cfg)
+static void x86_load_linux(PCMachineState *pcms,
+                           FWCfgState *fw_cfg)
 {
     uint16_t protocol;
     int setup_size, kernel_size, cmdline_size;
@@ -1374,7 +1374,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
     }
 }
 
-static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
+static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
 {
     Object *cpu = NULL;
     Error *local_err = NULL;
@@ -1490,14 +1490,14 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
         return;
     }
 
-    pc_new_cpu(PC_MACHINE(ms), apic_id, &local_err);
+    x86_cpu_new(PC_MACHINE(ms), apic_id, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
     }
 }
 
-void pc_cpus_init(PCMachineState *pcms)
+void x86_cpus_init(PCMachineState *pcms)
 {
     int i;
     const CPUArchIdList *possible_cpus;
@@ -1518,7 +1518,7 @@ void pc_cpus_init(PCMachineState *pcms)
                                                      ms->smp.max_cpus - 1) + 1;
     possible_cpus = mc->possible_cpu_arch_ids(ms);
     for (i = 0; i < ms->smp.cpus; i++) {
-        pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
+        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
     }
 }
 
@@ -1621,7 +1621,7 @@ void xen_load_linux(PCMachineState *pcms)
     fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
     rom_set_fw(fw_cfg);
 
-    load_linux(pcms, fw_cfg);
+    x86_load_linux(pcms, fw_cfg);
     for (i = 0; i < nb_option_roms; i++) {
         assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
                !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
@@ -1756,7 +1756,7 @@ void pc_memory_init(PCMachineState *pcms,
     }
 
     if (linux_boot) {
-        load_linux(pcms, fw_cfg);
+        x86_load_linux(pcms, fw_cfg);
     }
 
     for (i = 0; i < nb_option_roms; i++) {
@@ -2678,7 +2678,7 @@ static void pc_machine_wakeup(MachineState *machine)
 }
 
 static CpuInstanceProperties
-pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
+x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 {
     MachineClass *mc = MACHINE_GET_CLASS(ms);
     const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
@@ -2687,7 +2687,7 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
     return possible_cpus->cpus[cpu_index].props;
 }
 
-static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
+static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
    X86CPUTopoInfo topo;
    PCMachineState *pcms = PC_MACHINE(ms);
@@ -2699,7 +2699,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
    return topo.pkg_id % ms->numa_state->num_nodes;
 }
 
-static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
+static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
 {
     PCMachineState *pcms = PC_MACHINE(ms);
     int i;
@@ -2801,9 +2801,9 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     assert(!mc->get_hotplug_handler);
     mc->get_hotplug_handler = pc_get_hotplug_handler;
     mc->hotplug_allowed = pc_hotplug_allowed;
-    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
-    mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
-    mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
+    mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
+    mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
+    mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
     mc->auto_enable_numa_with_memhp = true;
     mc->has_hotpluggable_cpus = true;
     mc->default_boot_order = "cad";
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 6824b72124..de09e076cd 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -152,7 +152,7 @@ static void pc_init1(MachineState *machine,
         }
     }
 
-    pc_cpus_init(pcms);
+    x86_cpus_init(pcms);
 
     if (kvm_enabled() && pcmc->kvmclock_enabled) {
         kvmclock_create();
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8fad20f314..894989b64e 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -179,7 +179,7 @@ static void pc_q35_init(MachineState *machine)
         xen_hvm_init(pcms, &ram_memory);
     }
 
-    pc_cpus_init(pcms);
+    x86_cpus_init(pcms);
 
     kvmclock_create();
 
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index a9983f0bfb..28cb1f63c9 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -211,7 +211,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
     }
 }
 
-static void old_pc_system_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
+static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
 {
     char *filename;
     MemoryRegion *bios, *isa_bios;
@@ -272,7 +272,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
     BlockBackend *pflash_blk[ARRAY_SIZE(pcms->flash)];
 
     if (!pcmc->pci_enabled) {
-        old_pc_system_rom_init(rom_memory, true);
+        x86_bios_rom_init(rom_memory, true);
         return;
     }
 
@@ -293,7 +293,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
 
     if (!pflash_blk[0]) {
         /* Machine property pflash0 not set, use ROM mode */
-        old_pc_system_rom_init(rom_memory, false);
+        x86_bios_rom_init(rom_memory, false);
     } else {
         if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
             /*
-- 
2.21.0




* [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:46   ` Philippe Mathieu-Daudé
                     ` (2 more replies)
  2019-10-04  9:37 ` [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it Sergio Lopez
                   ` (10 subsequent siblings)
  13 siblings, 3 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Move x86 functions that will be shared between PC and non-PC machine
types to x86.c, along with their helpers.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
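Among the helpers moved to x86.c, x86_load_linux() picks the real-mode,
command-line and protected-mode load addresses from the boot protocol
version and from bit 0 of the header byte at offset 0x211
(header[0x211] & 0x01). The following is a standalone sketch of that
selection, assuming the same thresholds as the code being moved;
struct linux_load_addrs and pick_load_addrs() are hypothetical names used
only for illustration and are not part of QEMU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the three addresses chosen by the loader. */
struct linux_load_addrs {
    uint64_t real_addr;     /* real-mode setup code */
    uint64_t cmdline_addr;  /* kernel command line */
    uint64_t prot_addr;     /* protected-mode kernel */
};

/*
 * Hypothetical helper mirroring the if/else chain in x86_load_linux().
 * loaded_high stands for (header[0x211] & 0x01).
 */
static struct linux_load_addrs
pick_load_addrs(uint16_t protocol, bool loaded_high, uint64_t cmdline_size)
{
    struct linux_load_addrs a;

    /* The command line is placed just below 0x9a000 for old kernels. */
    assert(cmdline_size <= 0x9a000);

    if (protocol < 0x200 || !loaded_high) {
        /* Low kernel */
        a.real_addr    = 0x90000;
        a.cmdline_addr = 0x9a000 - cmdline_size;
        a.prot_addr    = 0x10000;
    } else if (protocol < 0x202) {
        /* High but ancient kernel */
        a.real_addr    = 0x90000;
        a.cmdline_addr = 0x9a000 - cmdline_size;
        a.prot_addr    = 0x100000;
    } else {
        /* High and recent kernel */
        a.real_addr    = 0x10000;
        a.cmdline_addr = 0x20000;
        a.prot_addr    = 0x100000;
    }
    return a;
}
```

Only the two older layouts depend on cmdline_size; for protocol 0x202 and
later the command line always goes to 0x20000.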
 include/hw/i386/pc.h  |   1 -
 include/hw/i386/x86.h |  35 +++
 hw/i386/pc.c          | 582 +----------------------------------
 hw/i386/pc_piix.c     |   1 +
 hw/i386/pc_q35.c      |   1 +
 hw/i386/pc_sysfw.c    |  54 +---
 hw/i386/x86.c         | 684 ++++++++++++++++++++++++++++++++++++++++++
 hw/i386/Makefile.objs |   1 +
 8 files changed, 724 insertions(+), 635 deletions(-)
 create mode 100644 include/hw/i386/x86.h
 create mode 100644 hw/i386/x86.c
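
The moved loader code also relies on two bit-mask idioms:
(strlen(kernel_cmdline)+16) & ~15 to size the command line buffer, and
(initrd_max - initrd_size) & ~4095 to place the initrd. The following is
a small standalone sketch of what they compute; cmdline_alloc_size() and
initrd_load_addr() are hypothetical helper names and are not part of QEMU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper: size reserved for the command line. Adding 16
 * before masking yields the next multiple of 16 strictly greater than
 * the string length, so there is always room for the terminating NUL.
 */
static size_t cmdline_alloc_size(const char *cmdline)
{
    assert(cmdline != NULL);
    return (strlen(cmdline) + 16) & ~(size_t)15;
}

/*
 * Hypothetical helper: highest 4 KiB-aligned address at which an initrd
 * of initrd_size bytes still ends at or below initrd_max. The loader
 * rejects initrd_size >= initrd_max before reaching this step.
 */
static uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    assert(initrd_size < initrd_max);
    return (initrd_max - initrd_size) & ~UINT32_C(4095);
}
```

For instance, a 13-character command line reserves 16 bytes and a
16-character one reserves 32; a 1 MiB initrd with initrd_max = 0x37ffffff
is placed at 0x37eff000.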

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index d12f42e9e5..73e2847e87 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -195,7 +195,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
 void pc_register_ferr_irq(qemu_irq irq);
 void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
 
-void x86_cpus_init(PCMachineState *pcms);
 void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
 void pc_smp_parse(MachineState *ms, QemuOpts *opts);
 
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
new file mode 100644
index 0000000000..71e2b6985d
--- /dev/null
+++ b/include/hw/i386/x86.h
@@ -0,0 +1,35 @@
+/*
+ * Copyright (c) 2019 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_I386_X86_H
+#define HW_I386_X86_H
+
+#include "hw/boards.h"
+
+uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
+                                    unsigned int cpu_index);
+void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
+void x86_cpus_init(PCMachineState *pcms);
+CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
+                                             unsigned cpu_index);
+int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
+const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
+
+void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
+
+void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
+
+#endif
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index fd08c6704b..094db79fb0 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -24,6 +24,7 @@
 
 #include "qemu/osdep.h"
 #include "qemu/units.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/char/serial.h"
 #include "hw/char/parallel.h"
@@ -102,9 +103,6 @@
 
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
-/* Physical Address of PVH entry point read from kernel ELF NOTE */
-static size_t pvh_start_addr;
-
 GlobalProperty pc_compat_4_1[] = {};
 const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
 
@@ -866,478 +864,6 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
     x86_cpu_set_a20(cpu, level);
 }
 
-/* Calculates initial APIC ID for a specific CPU index
- *
- * Currently we need to be able to calculate the APIC ID from the CPU index
- * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
- * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
- * all CPUs up to max_cpus.
- */
-static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
-                                           unsigned int cpu_index)
-{
-    MachineState *ms = MACHINE(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
-    uint32_t correct_id;
-    static bool warned;
-
-    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
-                                         ms->smp.threads, cpu_index);
-    if (pcmc->compat_apic_id_mode) {
-        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
-            error_report("APIC IDs set in compatibility mode, "
-                         "CPU topology won't match the configuration");
-            warned = true;
-        }
-        return cpu_index;
-    } else {
-        return correct_id;
-    }
-}
-
-static long get_file_size(FILE *f)
-{
-    long where, size;
-
-    /* XXX: on Unix systems, using fstat() probably makes more sense */
-
-    where = ftell(f);
-    fseek(f, 0, SEEK_END);
-    size = ftell(f);
-    fseek(f, where, SEEK_SET);
-
-    return size;
-}
-
-struct setup_data {
-    uint64_t next;
-    uint32_t type;
-    uint32_t len;
-    uint8_t data[0];
-} __attribute__((packed));
-
-
-/*
- * The entry point into the kernel for PVH boot is different from
- * the native entry point.  The PVH entry is defined by the x86/HVM
- * direct boot ABI and is available in an ELFNOTE in the kernel binary.
- *
- * This function is passed to load_elf() when it is called from
- * load_elfboot() which then additionally checks for an ELF Note of
- * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
- * parse the PVH entry address from the ELF Note.
- *
- * Due to trickery in elf_opts.h, load_elf() is actually available as
- * load_elf32() or load_elf64() and this routine needs to be able
- * to deal with being called as 32 or 64 bit.
- *
- * The address of the PVH entry point is saved to the 'pvh_start_addr'
- * global variable.  (although the entry point is 32-bit, the kernel
- * binary can be either 32-bit or 64-bit).
- */
-static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
-{
-    size_t *elf_note_data_addr;
-
-    /* Check if ELF Note header passed in is valid */
-    if (arg1 == NULL) {
-        return 0;
-    }
-
-    if (is64) {
-        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
-        uint64_t nhdr_size64 = sizeof(struct elf64_note);
-        uint64_t phdr_align = *(uint64_t *)arg2;
-        uint64_t nhdr_namesz = nhdr64->n_namesz;
-
-        elf_note_data_addr =
-            ((void *)nhdr64) + nhdr_size64 +
-            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
-    } else {
-        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
-        uint32_t nhdr_size32 = sizeof(struct elf32_note);
-        uint32_t phdr_align = *(uint32_t *)arg2;
-        uint32_t nhdr_namesz = nhdr32->n_namesz;
-
-        elf_note_data_addr =
-            ((void *)nhdr32) + nhdr_size32 +
-            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
-    }
-
-    pvh_start_addr = *elf_note_data_addr;
-
-    return pvh_start_addr;
-}
-
-static bool load_elfboot(const char *kernel_filename,
-                   int kernel_file_size,
-                   uint8_t *header,
-                   size_t pvh_xen_start_addr,
-                   FWCfgState *fw_cfg)
-{
-    uint32_t flags = 0;
-    uint32_t mh_load_addr = 0;
-    uint32_t elf_kernel_size = 0;
-    uint64_t elf_entry;
-    uint64_t elf_low, elf_high;
-    int kernel_size;
-
-    if (ldl_p(header) != 0x464c457f) {
-        return false; /* no elfboot */
-    }
-
-    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
-    flags = elf_is64 ?
-        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
-
-    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
-        error_report("elfboot unsupported flags = %x", flags);
-        exit(1);
-    }
-
-    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
-    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
-                           NULL, &elf_note_type, &elf_entry,
-                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
-                           0, 0);
-
-    if (kernel_size < 0) {
-        error_report("Error while loading elf kernel");
-        exit(1);
-    }
-    mh_load_addr = elf_low;
-    elf_kernel_size = elf_high - elf_low;
-
-    if (pvh_start_addr == 0) {
-        error_report("Error loading uncompressed kernel without PVH ELF Note");
-        exit(1);
-    }
-    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
-    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
-    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
-
-    return true;
-}
-
-static void x86_load_linux(PCMachineState *pcms,
-                           FWCfgState *fw_cfg)
-{
-    uint16_t protocol;
-    int setup_size, kernel_size, cmdline_size;
-    int dtb_size, setup_data_offset;
-    uint32_t initrd_max;
-    uint8_t header[8192], *setup, *kernel;
-    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
-    FILE *f;
-    char *vmode;
-    MachineState *machine = MACHINE(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
-    struct setup_data *setup_data;
-    const char *kernel_filename = machine->kernel_filename;
-    const char *initrd_filename = machine->initrd_filename;
-    const char *dtb_filename = machine->dtb;
-    const char *kernel_cmdline = machine->kernel_cmdline;
-
-    /* Align to 16 bytes as a paranoia measure */
-    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
-
-    /* load the kernel header */
-    f = fopen(kernel_filename, "rb");
-    if (!f || !(kernel_size = get_file_size(f)) ||
-        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
-        MIN(ARRAY_SIZE(header), kernel_size)) {
-        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
-                kernel_filename, strerror(errno));
-        exit(1);
-    }
-
-    /* kernel protocol version */
-#if 0
-    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
-#endif
-    if (ldl_p(header+0x202) == 0x53726448) {
-        protocol = lduw_p(header+0x206);
-    } else {
-        /*
-         * This could be a multiboot kernel. If it is, let's stop treating it
-         * like a Linux kernel.
-         * Note: some multiboot images could be in the ELF format (the same of
-         * PVH), so we try multiboot first since we check the multiboot magic
-         * header before to load it.
-         */
-        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
-                           kernel_cmdline, kernel_size, header)) {
-            return;
-        }
-        /*
-         * Check if the file is an uncompressed kernel file (ELF) and load it,
-         * saving the PVH entry point used by the x86/HVM direct boot ABI.
-         * If load_elfboot() is successful, populate the fw_cfg info.
-         */
-        if (pcmc->pvh_enabled &&
-            load_elfboot(kernel_filename, kernel_size,
-                         header, pvh_start_addr, fw_cfg)) {
-            fclose(f);
-
-            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
-                strlen(kernel_cmdline) + 1);
-            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
-
-            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
-            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
-                             header, sizeof(header));
-
-            /* load initrd */
-            if (initrd_filename) {
-                GMappedFile *mapped_file;
-                gsize initrd_size;
-                gchar *initrd_data;
-                GError *gerr = NULL;
-
-                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
-                if (!mapped_file) {
-                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
-                            initrd_filename, gerr->message);
-                    exit(1);
-                }
-                pcms->initrd_mapped_file = mapped_file;
-
-                initrd_data = g_mapped_file_get_contents(mapped_file);
-                initrd_size = g_mapped_file_get_length(mapped_file);
-                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
-                if (initrd_size >= initrd_max) {
-                    fprintf(stderr, "qemu: initrd is too large, cannot support."
-                            "(max: %"PRIu32", need %"PRId64")\n",
-                            initrd_max, (uint64_t)initrd_size);
-                    exit(1);
-                }
-
-                initrd_addr = (initrd_max - initrd_size) & ~4095;
-
-                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
-                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
-                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
-                                 initrd_size);
-            }
-
-            option_rom[nb_option_roms].bootindex = 0;
-            option_rom[nb_option_roms].name = "pvh.bin";
-            nb_option_roms++;
-
-            return;
-        }
-        protocol = 0;
-    }
-
-    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
-        /* Low kernel */
-        real_addr    = 0x90000;
-        cmdline_addr = 0x9a000 - cmdline_size;
-        prot_addr    = 0x10000;
-    } else if (protocol < 0x202) {
-        /* High but ancient kernel */
-        real_addr    = 0x90000;
-        cmdline_addr = 0x9a000 - cmdline_size;
-        prot_addr    = 0x100000;
-    } else {
-        /* High and recent kernel */
-        real_addr    = 0x10000;
-        cmdline_addr = 0x20000;
-        prot_addr    = 0x100000;
-    }
-
-#if 0
-    fprintf(stderr,
-            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
-            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
-            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
-            real_addr,
-            cmdline_addr,
-            prot_addr);
-#endif
-
-    /* highest address for loading the initrd */
-    if (protocol >= 0x20c &&
-        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
-        /*
-         * Linux has supported initrd up to 4 GB for a very long time (2007,
-         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
-         * though it only sets initrd_max to 2 GB to "work around bootloader
-         * bugs". Luckily, QEMU firmware(which does something like bootloader)
-         * has supported this.
-         *
-         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
-         * be loaded into any address.
-         *
-         * In addition, initrd_max is uint32_t simply because QEMU doesn't
-         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
-         * field).
-         *
-         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
-         */
-        initrd_max = UINT32_MAX;
-    } else if (protocol >= 0x203) {
-        initrd_max = ldl_p(header+0x22c);
-    } else {
-        initrd_max = 0x37ffffff;
-    }
-
-    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
-        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
-    }
-
-    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
-    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
-    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
-
-    if (protocol >= 0x202) {
-        stl_p(header+0x228, cmdline_addr);
-    } else {
-        stw_p(header+0x20, 0xA33F);
-        stw_p(header+0x22, cmdline_addr-real_addr);
-    }
-
-    /* handle vga= parameter */
-    vmode = strstr(kernel_cmdline, "vga=");
-    if (vmode) {
-        unsigned int video_mode;
-        /* skip "vga=" */
-        vmode += 4;
-        if (!strncmp(vmode, "normal", 6)) {
-            video_mode = 0xffff;
-        } else if (!strncmp(vmode, "ext", 3)) {
-            video_mode = 0xfffe;
-        } else if (!strncmp(vmode, "ask", 3)) {
-            video_mode = 0xfffd;
-        } else {
-            video_mode = strtol(vmode, NULL, 0);
-        }
-        stw_p(header+0x1fa, video_mode);
-    }
-
-    /* loader type */
-    /* High nybble = B reserved for QEMU; low nybble is revision number.
-       If this code is substantially changed, you may want to consider
-       incrementing the revision. */
-    if (protocol >= 0x200) {
-        header[0x210] = 0xB0;
-    }
-    /* heap */
-    if (protocol >= 0x201) {
-        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
-        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
-    }
-
-    /* load initrd */
-    if (initrd_filename) {
-        GMappedFile *mapped_file;
-        gsize initrd_size;
-        gchar *initrd_data;
-        GError *gerr = NULL;
-
-        if (protocol < 0x200) {
-            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
-            exit(1);
-        }
-
-        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
-        if (!mapped_file) {
-            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
-                    initrd_filename, gerr->message);
-            exit(1);
-        }
-        pcms->initrd_mapped_file = mapped_file;
-
-        initrd_data = g_mapped_file_get_contents(mapped_file);
-        initrd_size = g_mapped_file_get_length(mapped_file);
-        if (initrd_size >= initrd_max) {
-            fprintf(stderr, "qemu: initrd is too large, cannot support."
-                    "(max: %"PRIu32", need %"PRId64")\n",
-                    initrd_max, (uint64_t)initrd_size);
-            exit(1);
-        }
-
-        initrd_addr = (initrd_max-initrd_size) & ~4095;
-
-        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
-        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
-        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
-
-        stl_p(header+0x218, initrd_addr);
-        stl_p(header+0x21c, initrd_size);
-    }
-
-    /* load kernel and setup */
-    setup_size = header[0x1f1];
-    if (setup_size == 0) {
-        setup_size = 4;
-    }
-    setup_size = (setup_size+1)*512;
-    if (setup_size > kernel_size) {
-        fprintf(stderr, "qemu: invalid kernel header\n");
-        exit(1);
-    }
-    kernel_size -= setup_size;
-
-    setup  = g_malloc(setup_size);
-    kernel = g_malloc(kernel_size);
-    fseek(f, 0, SEEK_SET);
-    if (fread(setup, 1, setup_size, f) != setup_size) {
-        fprintf(stderr, "fread() failed\n");
-        exit(1);
-    }
-    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
-        fprintf(stderr, "fread() failed\n");
-        exit(1);
-    }
-    fclose(f);
-
-    /* append dtb to kernel */
-    if (dtb_filename) {
-        if (protocol < 0x209) {
-            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
-            exit(1);
-        }
-
-        dtb_size = get_image_size(dtb_filename);
-        if (dtb_size <= 0) {
-            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
-                    dtb_filename, strerror(errno));
-            exit(1);
-        }
-
-        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
-        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
-        kernel = g_realloc(kernel, kernel_size);
-
-        stq_p(header+0x250, prot_addr + setup_data_offset);
-
-        setup_data = (struct setup_data *)(kernel + setup_data_offset);
-        setup_data->next = 0;
-        setup_data->type = cpu_to_le32(SETUP_DTB);
-        setup_data->len = cpu_to_le32(dtb_size);
-
-        load_image_size(dtb_filename, setup_data->data, dtb_size);
-    }
-
-    memcpy(setup, header, MIN(sizeof(header), setup_size));
-
-    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
-    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
-    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
-
-    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
-    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
-    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
-
-    option_rom[nb_option_roms].bootindex = 0;
-    option_rom[nb_option_roms].name = "linuxboot.bin";
-    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
-        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
-    }
-    nb_option_roms++;
-}
-
 #define NE2000_NB_MAX 6
 
 static const int ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360,
@@ -1374,24 +900,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
     }
 }
 
-static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
-{
-    Object *cpu = NULL;
-    Error *local_err = NULL;
-    CPUX86State *env = NULL;
-
-    cpu = object_new(MACHINE(pcms)->cpu_type);
-
-    env = &X86_CPU(cpu)->env;
-    env->nr_dies = pcms->smp_dies;
-
-    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
-    object_property_set_bool(cpu, true, "realized", &local_err);
-
-    object_unref(cpu);
-    error_propagate(errp, local_err);
-}
-
 /*
  * This function is very similar to smp_parse()
  * in hw/core/machine.c but includes CPU die support.
@@ -1497,31 +1005,6 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
     }
 }
 
-void x86_cpus_init(PCMachineState *pcms)
-{
-    int i;
-    const CPUArchIdList *possible_cpus;
-    MachineState *ms = MACHINE(pcms);
-    MachineClass *mc = MACHINE_GET_CLASS(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
-
-    x86_cpu_set_default_version(pcmc->default_cpu_version);
-
-    /* Calculates the limit to CPU APIC ID values
-     *
-     * Limit for the APIC ID value, so that all
-     * CPU APIC IDs are < pcms->apic_id_limit.
-     *
-     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
-     */
-    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
-                                                     ms->smp.max_cpus - 1) + 1;
-    possible_cpus = mc->possible_cpu_arch_ids(ms);
-    for (i = 0; i < ms->smp.cpus; i++) {
-        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
-    }
-}
-
 static void rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count)
 {
     if (cpus_count > 0xff) {
@@ -2677,69 +2160,6 @@ static void pc_machine_wakeup(MachineState *machine)
     cpu_synchronize_all_post_reset();
 }
 
-static CpuInstanceProperties
-x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
-{
-    MachineClass *mc = MACHINE_GET_CLASS(ms);
-    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
-
-    assert(cpu_index < possible_cpus->len);
-    return possible_cpus->cpus[cpu_index].props;
-}
-
-static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
-{
-   X86CPUTopoInfo topo;
-   PCMachineState *pcms = PC_MACHINE(ms);
-
-   assert(idx < ms->possible_cpus->len);
-   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-                            pcms->smp_dies, ms->smp.cores,
-                            ms->smp.threads, &topo);
-   return topo.pkg_id % ms->numa_state->num_nodes;
-}
-
-static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
-{
-    PCMachineState *pcms = PC_MACHINE(ms);
-    int i;
-    unsigned int max_cpus = ms->smp.max_cpus;
-
-    if (ms->possible_cpus) {
-        /*
-         * make sure that max_cpus hasn't changed since the first use, i.e.
-         * -smp hasn't been parsed after it
-        */
-        assert(ms->possible_cpus->len == max_cpus);
-        return ms->possible_cpus;
-    }
-
-    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
-                                  sizeof(CPUArchId) * max_cpus);
-    ms->possible_cpus->len = max_cpus;
-    for (i = 0; i < ms->possible_cpus->len; i++) {
-        X86CPUTopoInfo topo;
-
-        ms->possible_cpus->cpus[i].type = ms->cpu_type;
-        ms->possible_cpus->cpus[i].vcpus_count = 1;
-        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
-        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
-                                 pcms->smp_dies, ms->smp.cores,
-                                 ms->smp.threads, &topo);
-        ms->possible_cpus->cpus[i].props.has_socket_id = true;
-        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
-        if (pcms->smp_dies > 1) {
-            ms->possible_cpus->cpus[i].props.has_die_id = true;
-            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
-        }
-        ms->possible_cpus->cpus[i].props.has_core_id = true;
-        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
-        ms->possible_cpus->cpus[i].props.has_thread_id = true;
-        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
-    }
-    return ms->possible_cpus;
-}
-
 static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
 {
     /* cpu index isn't used */
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index de09e076cd..1396451abf 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -27,6 +27,7 @@
 
 #include "qemu/units.h"
 #include "hw/loader.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/apic.h"
 #include "hw/display/ramfb.h"
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 894989b64e..8920bd8978 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -41,6 +41,7 @@
 #include "hw/pci-host/q35.h"
 #include "hw/qdev-properties.h"
 #include "exec/address-spaces.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/ich9.h"
 #include "hw/i386/amd_iommu.h"
diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
index 28cb1f63c9..69b79851be 100644
--- a/hw/i386/pc_sysfw.c
+++ b/hw/i386/pc_sysfw.c
@@ -31,6 +31,7 @@
 #include "qemu/option.h"
 #include "qemu/units.h"
 #include "hw/sysbus.h"
+#include "hw/i386/x86.h"
 #include "hw/i386/pc.h"
 #include "hw/loader.h"
 #include "hw/qdev-properties.h"
@@ -211,59 +212,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
     }
 }
 
-static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
-{
-    char *filename;
-    MemoryRegion *bios, *isa_bios;
-    int bios_size, isa_bios_size;
-    int ret;
-
-    /* BIOS load */
-    if (bios_name == NULL) {
-        bios_name = BIOS_FILENAME;
-    }
-    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
-    if (filename) {
-        bios_size = get_image_size(filename);
-    } else {
-        bios_size = -1;
-    }
-    if (bios_size <= 0 ||
-        (bios_size % 65536) != 0) {
-        goto bios_error;
-    }
-    bios = g_malloc(sizeof(*bios));
-    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(bios, true);
-    }
-    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
-    if (ret != 0) {
-    bios_error:
-        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
-        exit(1);
-    }
-    g_free(filename);
-
-    /* map the last 128KB of the BIOS in ISA space */
-    isa_bios_size = MIN(bios_size, 128 * KiB);
-    isa_bios = g_malloc(sizeof(*isa_bios));
-    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
-                             bios_size - isa_bios_size, isa_bios_size);
-    memory_region_add_subregion_overlap(rom_memory,
-                                        0x100000 - isa_bios_size,
-                                        isa_bios,
-                                        1);
-    if (!isapc_ram_fw) {
-        memory_region_set_readonly(isa_bios, true);
-    }
-
-    /* map all the bios at the top of memory */
-    memory_region_add_subregion(rom_memory,
-                                (uint32_t)(-bios_size),
-                                bios);
-}
-
 void pc_system_firmware_init(PCMachineState *pcms,
                              MemoryRegion *rom_memory)
 {
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
new file mode 100644
index 0000000000..6807bb8a22
--- /dev/null
+++ b/hw/i386/x86.c
@@ -0,0 +1,684 @@
+/*
+ * Copyright (c) 2003-2004 Fabrice Bellard
+ * Copyright (c) 2019 Red Hat, Inc.
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to deal
+ * in the Software without restriction, including without limitation the rights
+ * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+ * copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+ * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
+ * THE SOFTWARE.
+ */
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/option.h"
+#include "qemu/cutils.h"
+#include "qemu/units.h"
+#include "qemu-common.h"
+#include "qapi/error.h"
+#include "qapi/qmp/qerror.h"
+#include "qapi/qapi-visit-common.h"
+#include "qapi/visitor.h"
+#include "sysemu/qtest.h"
+#include "sysemu/numa.h"
+#include "sysemu/replay.h"
+#include "sysemu/sysemu.h"
+
+#include "hw/i386/x86.h"
+#include "hw/i386/pc.h"
+#include "target/i386/cpu.h"
+#include "hw/i386/topology.h"
+#include "hw/i386/fw_cfg.h"
+
+#include "hw/acpi/cpu_hotplug.h"
+#include "hw/nmi.h"
+#include "hw/loader.h"
+#include "multiboot.h"
+#include "elf.h"
+#include "standard-headers/asm-x86/bootparam.h"
+
+#define BIOS_FILENAME "bios.bin"
+
+/* Physical Address of PVH entry point read from kernel ELF NOTE */
+static size_t pvh_start_addr;
+
+/* Calculates initial APIC ID for a specific CPU index
+ *
+ * Currently we need to be able to calculate the APIC ID from the CPU index
+ * alone (without requiring a CPU object), as the QEMU<->SeaBIOS interfaces have
+ * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
+ * all CPUs up to max_cpus.
+ */
+uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
+                                    unsigned int cpu_index)
+{
+    MachineState *ms = MACHINE(pcms);
+    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    uint32_t correct_id;
+    static bool warned;
+
+    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
+                                         ms->smp.threads, cpu_index);
+    if (pcmc->compat_apic_id_mode) {
+        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
+            error_report("APIC IDs set in compatibility mode, "
+                         "CPU topology won't match the configuration");
+            warned = true;
+        }
+        return cpu_index;
+    } else {
+        return correct_id;
+    }
+}
+
+void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
+{
+    Object *cpu = NULL;
+    Error *local_err = NULL;
+    CPUX86State *env = NULL;
+
+    cpu = object_new(MACHINE(pcms)->cpu_type);
+
+    env = &X86_CPU(cpu)->env;
+    env->nr_dies = pcms->smp_dies;
+
+    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
+    object_property_set_bool(cpu, true, "realized", &local_err);
+
+    object_unref(cpu);
+    error_propagate(errp, local_err);
+}
+
+void x86_cpus_init(PCMachineState *pcms)
+{
+    int i;
+    const CPUArchIdList *possible_cpus;
+    MachineState *ms = MACHINE(pcms);
+    MachineClass *mc = MACHINE_GET_CLASS(pcms);
+    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
+
+    x86_cpu_set_default_version(pcmc->default_cpu_version);
+
+    /* Calculates the limit to CPU APIC ID values
+     *
+     * Limit for the APIC ID value, so that all
+     * CPU APIC IDs are < pcms->apic_id_limit.
+     *
+     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
+     */
+    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
+                                                     ms->smp.max_cpus - 1) + 1;
+    possible_cpus = mc->possible_cpu_arch_ids(ms);
+    for (i = 0; i < ms->smp.cpus; i++) {
+        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
+    }
+}
+
+CpuInstanceProperties
+x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(ms);
+    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
+
+    assert(cpu_index < possible_cpus->len);
+    return possible_cpus->cpus[cpu_index].props;
+}
+
+int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
+{
+   X86CPUTopoInfo topo;
+   PCMachineState *pcms = PC_MACHINE(ms);
+
+   assert(idx < ms->possible_cpus->len);
+   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
+                            pcms->smp_dies, ms->smp.cores,
+                            ms->smp.threads, &topo);
+   return topo.pkg_id % ms->numa_state->num_nodes;
+}
+
+const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
+{
+    PCMachineState *pcms = PC_MACHINE(ms);
+    int i;
+    unsigned int max_cpus = ms->smp.max_cpus;
+
+    if (ms->possible_cpus) {
+        /*
+         * make sure that max_cpus hasn't changed since the first use, i.e.
+         * -smp hasn't been parsed after it
+        */
+        assert(ms->possible_cpus->len == max_cpus);
+        return ms->possible_cpus;
+    }
+
+    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
+                                  sizeof(CPUArchId) * max_cpus);
+    ms->possible_cpus->len = max_cpus;
+    for (i = 0; i < ms->possible_cpus->len; i++) {
+        X86CPUTopoInfo topo;
+
+        ms->possible_cpus->cpus[i].type = ms->cpu_type;
+        ms->possible_cpus->cpus[i].vcpus_count = 1;
+        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
+        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
+                                 pcms->smp_dies, ms->smp.cores,
+                                 ms->smp.threads, &topo);
+        ms->possible_cpus->cpus[i].props.has_socket_id = true;
+        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
+        if (pcms->smp_dies > 1) {
+            ms->possible_cpus->cpus[i].props.has_die_id = true;
+            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
+        }
+        ms->possible_cpus->cpus[i].props.has_core_id = true;
+        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
+        ms->possible_cpus->cpus[i].props.has_thread_id = true;
+        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
+    }
+    return ms->possible_cpus;
+}
+
+static long get_file_size(FILE *f)
+{
+    long where, size;
+
+    /* XXX: on Unix systems, using fstat() probably makes more sense */
+
+    where = ftell(f);
+    fseek(f, 0, SEEK_END);
+    size = ftell(f);
+    fseek(f, where, SEEK_SET);
+
+    return size;
+}
+
+struct setup_data {
+    uint64_t next;
+    uint32_t type;
+    uint32_t len;
+    uint8_t data[0];
+} __attribute__((packed));
+
+/*
+ * The entry point into the kernel for PVH boot is different from
+ * the native entry point.  The PVH entry is defined by the x86/HVM
+ * direct boot ABI and is available in an ELFNOTE in the kernel binary.
+ *
+ * This function is passed to load_elf() when it is called from
+ * load_elfboot() which then additionally checks for an ELF Note of
+ * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
+ * parse the PVH entry address from the ELF Note.
+ *
+ * Due to trickery in elf_opts.h, load_elf() is actually available as
+ * load_elf32() or load_elf64() and this routine needs to be able
+ * to deal with being called as 32 or 64 bit.
+ *
+ * The address of the PVH entry point is saved to the 'pvh_start_addr'
+ * global variable.  (Although the entry point is 32-bit, the kernel
+ * binary can be either 32-bit or 64-bit.)
+ */
+static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
+{
+    size_t *elf_note_data_addr;
+
+    /* Check if ELF Note header passed in is valid */
+    if (arg1 == NULL) {
+        return 0;
+    }
+
+    if (is64) {
+        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
+        uint64_t nhdr_size64 = sizeof(struct elf64_note);
+        uint64_t phdr_align = *(uint64_t *)arg2;
+        uint64_t nhdr_namesz = nhdr64->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr64) + nhdr_size64 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    } else {
+        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
+        uint32_t nhdr_size32 = sizeof(struct elf32_note);
+        uint32_t phdr_align = *(uint32_t *)arg2;
+        uint32_t nhdr_namesz = nhdr32->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr32) + nhdr_size32 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    }
+
+    pvh_start_addr = *elf_note_data_addr;
+
+    return pvh_start_addr;
+}
+
+static bool load_elfboot(const char *kernel_filename,
+                   int kernel_file_size,
+                   uint8_t *header,
+                   size_t pvh_xen_start_addr,
+                   FWCfgState *fw_cfg)
+{
+    uint32_t flags = 0;
+    uint32_t mh_load_addr = 0;
+    uint32_t elf_kernel_size = 0;
+    uint64_t elf_entry;
+    uint64_t elf_low, elf_high;
+    int kernel_size;
+
+    if (ldl_p(header) != 0x464c457f) {
+        return false; /* no elfboot */
+    }
+
+    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
+    flags = elf_is64 ?
+        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
+
+    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
+        error_report("elfboot unsupported flags = %x", flags);
+        exit(1);
+    }
+
+    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
+    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
+                           NULL, &elf_note_type, &elf_entry,
+                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
+                           0, 0);
+
+    if (kernel_size < 0) {
+        error_report("Error while loading elf kernel");
+        exit(1);
+    }
+    mh_load_addr = elf_low;
+    elf_kernel_size = elf_high - elf_low;
+
+    if (pvh_start_addr == 0) {
+        error_report("Error loading uncompressed kernel without PVH ELF Note");
+        exit(1);
+    }
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
+
+    return true;
+}
+
+void x86_load_linux(PCMachineState *pcms,
+                    FWCfgState *fw_cfg)
+{
+    uint16_t protocol;
+    int setup_size, kernel_size, cmdline_size;
+    int dtb_size, setup_data_offset;
+    uint32_t initrd_max;
+    uint8_t header[8192], *setup, *kernel;
+    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
+    FILE *f;
+    char *vmode;
+    MachineState *machine = MACHINE(pcms);
+    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    struct setup_data *setup_data;
+    const char *kernel_filename = machine->kernel_filename;
+    const char *initrd_filename = machine->initrd_filename;
+    const char *dtb_filename = machine->dtb;
+    const char *kernel_cmdline = machine->kernel_cmdline;
+
+    /* Align to 16 bytes as a paranoia measure */
+    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
+
+    /* load the kernel header */
+    f = fopen(kernel_filename, "rb");
+    if (!f || !(kernel_size = get_file_size(f)) ||
+        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
+        MIN(ARRAY_SIZE(header), kernel_size)) {
+        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
+                kernel_filename, strerror(errno));
+        exit(1);
+    }
+
+    /* kernel protocol version */
+#if 0
+    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
+#endif
+    if (ldl_p(header+0x202) == 0x53726448) {
+        protocol = lduw_p(header+0x206);
+    } else {
+        /*
+         * This could be a multiboot kernel. If it is, let's stop treating it
+         * like a Linux kernel.
+         * Note: some multiboot images could be in the ELF format (the same as
+         * PVH), so we try multiboot first since we check the multiboot magic
+         * header before loading it.
+         */
+        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
+                           kernel_cmdline, kernel_size, header)) {
+            return;
+        }
+        /*
+         * Check if the file is an uncompressed kernel file (ELF) and load it,
+         * saving the PVH entry point used by the x86/HVM direct boot ABI.
+         * If load_elfboot() is successful, populate the fw_cfg info.
+         */
+        if (pcmc->pvh_enabled &&
+            load_elfboot(kernel_filename, kernel_size,
+                         header, pvh_start_addr, fw_cfg)) {
+            fclose(f);
+
+            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+                strlen(kernel_cmdline) + 1);
+            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
+
+            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
+            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
+                             header, sizeof(header));
+
+            /* load initrd */
+            if (initrd_filename) {
+                GMappedFile *mapped_file;
+                gsize initrd_size;
+                gchar *initrd_data;
+                GError *gerr = NULL;
+
+                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+                if (!mapped_file) {
+                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
+                            initrd_filename, gerr->message);
+                    exit(1);
+                }
+                pcms->initrd_mapped_file = mapped_file;
+
+                initrd_data = g_mapped_file_get_contents(mapped_file);
+                initrd_size = g_mapped_file_get_length(mapped_file);
+                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+                if (initrd_size >= initrd_max) {
+                    fprintf(stderr, "qemu: initrd is too large, cannot support."
+                            " (max: %"PRIu32", need %"PRIu64")\n",
+                            initrd_max, (uint64_t)initrd_size);
+                    exit(1);
+                }
+
+                initrd_addr = (initrd_max - initrd_size) & ~4095;
+
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
+                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
+                                 initrd_size);
+            }
+
+            option_rom[nb_option_roms].bootindex = 0;
+            option_rom[nb_option_roms].name = "pvh.bin";
+            nb_option_roms++;
+
+            return;
+        }
+        protocol = 0;
+    }
+
+    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
+        /* Low kernel */
+        real_addr    = 0x90000;
+        cmdline_addr = 0x9a000 - cmdline_size;
+        prot_addr    = 0x10000;
+    } else if (protocol < 0x202) {
+        /* High but ancient kernel */
+        real_addr    = 0x90000;
+        cmdline_addr = 0x9a000 - cmdline_size;
+        prot_addr    = 0x100000;
+    } else {
+        /* High and recent kernel */
+        real_addr    = 0x10000;
+        cmdline_addr = 0x20000;
+        prot_addr    = 0x100000;
+    }
+
+#if 0
+    fprintf(stderr,
+            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
+            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
+            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
+            real_addr,
+            cmdline_addr,
+            prot_addr);
+#endif
+
+    /* highest address for loading the initrd */
+    if (protocol >= 0x20c &&
+        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
+        /*
+         * Linux has supported initrd up to 4 GB for a very long time (2007,
+         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
+         * though it only sets initrd_max to 2 GB to "work around bootloader
+         * bugs". Luckily, QEMU firmware (which acts much like a bootloader)
+         * has supported this.
+         *
+         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
+         * be loaded into any address.
+         *
+         * In addition, initrd_max is uint32_t simply because QEMU doesn't
+         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
+         * field).
+         *
+         * Therefore, simply limit initrd_max to UINT32_MAX here as well.
+         */
+        initrd_max = UINT32_MAX;
+    } else if (protocol >= 0x203) {
+        initrd_max = ldl_p(header+0x22c);
+    } else {
+        initrd_max = 0x37ffffff;
+    }
+
+    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
+        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+    }
+
+    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
+    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
+
+    if (protocol >= 0x202) {
+        stl_p(header+0x228, cmdline_addr);
+    } else {
+        stw_p(header+0x20, 0xA33F);
+        stw_p(header+0x22, cmdline_addr-real_addr);
+    }
+
+    /* handle vga= parameter */
+    vmode = strstr(kernel_cmdline, "vga=");
+    if (vmode) {
+        unsigned int video_mode;
+        /* skip "vga=" */
+        vmode += 4;
+        if (!strncmp(vmode, "normal", 6)) {
+            video_mode = 0xffff;
+        } else if (!strncmp(vmode, "ext", 3)) {
+            video_mode = 0xfffe;
+        } else if (!strncmp(vmode, "ask", 3)) {
+            video_mode = 0xfffd;
+        } else {
+            video_mode = strtol(vmode, NULL, 0);
+        }
+        stw_p(header+0x1fa, video_mode);
+    }
+
+    /* loader type */
+    /* High nybble = B reserved for QEMU; low nybble is revision number.
+       If this code is substantially changed, you may want to consider
+       incrementing the revision. */
+    if (protocol >= 0x200) {
+        header[0x210] = 0xB0;
+    }
+    /* heap */
+    if (protocol >= 0x201) {
+        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
+        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
+    }
+
+    /* load initrd */
+    if (initrd_filename) {
+        GMappedFile *mapped_file;
+        gsize initrd_size;
+        gchar *initrd_data;
+        GError *gerr = NULL;
+
+        if (protocol < 0x200) {
+            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
+            exit(1);
+        }
+
+        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
+        if (!mapped_file) {
+            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
+                    initrd_filename, gerr->message);
+            exit(1);
+        }
+        pcms->initrd_mapped_file = mapped_file;
+
+        initrd_data = g_mapped_file_get_contents(mapped_file);
+        initrd_size = g_mapped_file_get_length(mapped_file);
+        if (initrd_size >= initrd_max) {
+            fprintf(stderr, "qemu: initrd is too large, cannot support. "
+                    "(max: %"PRIu32", need %"PRIu64")\n",
+                    initrd_max, (uint64_t)initrd_size);
+            exit(1);
+        }
+
+        initrd_addr = (initrd_max-initrd_size) & ~4095;
+
+        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
+        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
+        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
+
+        stl_p(header+0x218, initrd_addr);
+        stl_p(header+0x21c, initrd_size);
+    }
+
+    /* load kernel and setup */
+    setup_size = header[0x1f1];
+    if (setup_size == 0) {
+        setup_size = 4;
+    }
+    setup_size = (setup_size+1)*512;
+    if (setup_size > kernel_size) {
+        fprintf(stderr, "qemu: invalid kernel header\n");
+        exit(1);
+    }
+    kernel_size -= setup_size;
+
+    setup  = g_malloc(setup_size);
+    kernel = g_malloc(kernel_size);
+    fseek(f, 0, SEEK_SET);
+    if (fread(setup, 1, setup_size, f) != setup_size) {
+        fprintf(stderr, "fread() failed\n");
+        exit(1);
+    }
+    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
+        fprintf(stderr, "fread() failed\n");
+        exit(1);
+    }
+    fclose(f);
+
+    /* append dtb to kernel */
+    if (dtb_filename) {
+        if (protocol < 0x209) {
+            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
+            exit(1);
+        }
+
+        dtb_size = get_image_size(dtb_filename);
+        if (dtb_size <= 0) {
+            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
+                    dtb_filename, strerror(errno));
+            exit(1);
+        }
+
+        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
+        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
+        kernel = g_realloc(kernel, kernel_size);
+
+        stq_p(header+0x250, prot_addr + setup_data_offset);
+
+        setup_data = (struct setup_data *)(kernel + setup_data_offset);
+        setup_data->next = 0;
+        setup_data->type = cpu_to_le32(SETUP_DTB);
+        setup_data->len = cpu_to_le32(dtb_size);
+
+        load_image_size(dtb_filename, setup_data->data, dtb_size);
+    }
+
+    memcpy(setup, header, MIN(sizeof(header), setup_size));
+
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
+
+    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
+
+    option_rom[nb_option_roms].bootindex = 0;
+    option_rom[nb_option_roms].name = "linuxboot.bin";
+    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
+        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
+    }
+    nb_option_roms++;
+}
+
+void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
+{
+    char *filename;
+    MemoryRegion *bios, *isa_bios;
+    int bios_size, isa_bios_size;
+    int ret;
+
+    /* BIOS load */
+    if (bios_name == NULL) {
+        bios_name = BIOS_FILENAME;
+    }
+    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
+    if (filename) {
+        bios_size = get_image_size(filename);
+    } else {
+        bios_size = -1;
+    }
+    if (bios_size <= 0 ||
+        (bios_size % 65536) != 0) {
+        goto bios_error;
+    }
+    bios = g_malloc(sizeof(*bios));
+    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
+    if (!isapc_ram_fw) {
+        memory_region_set_readonly(bios, true);
+    }
+    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
+    if (ret != 0) {
+    bios_error:
+        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
+        exit(1);
+    }
+    g_free(filename);
+
+    /* map the last 128KB of the BIOS in ISA space */
+    isa_bios_size = MIN(bios_size, 128 * KiB);
+    isa_bios = g_malloc(sizeof(*isa_bios));
+    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
+                             bios_size - isa_bios_size, isa_bios_size);
+    memory_region_add_subregion_overlap(rom_memory,
+                                        0x100000 - isa_bios_size,
+                                        isa_bios,
+                                        1);
+    if (!isapc_ram_fw) {
+        memory_region_set_readonly(isa_bios, true);
+    }
+
+    /* map all the bios at the top of memory */
+    memory_region_add_subregion(rom_memory,
+                                (uint32_t)(-bios_size),
+                                bios);
+}
diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index d3374e0831..7ed80a4853 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -1,5 +1,6 @@
 obj-$(CONFIG_KVM) += kvm/
 obj-y += e820_memory_layout.o multiboot.o
+obj-y += x86.o
 obj-y += pc.o
 obj-$(CONFIG_I440FX) += pc_piix.o
 obj-$(CONFIG_Q35) += pc_q35.o
-- 
2.21.0




* [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (2 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:49   ` Philippe Mathieu-Daudé
  2019-10-04  9:37 ` [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState Sergio Lopez
                   ` (9 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Split up PCMachineState and PCMachineClass, moving the members that are
not PC-specific into the new X86MachineState and X86MachineClass, and
derive the PC types from them. This allows sharing code with non-PC
x86 machine types.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 include/hw/i386/pc.h  |  27 +------
 include/hw/i386/x86.h |  56 ++++++++++++-
 hw/acpi/cpu_hotplug.c |  10 +--
 hw/i386/acpi-build.c  |  29 ++++---
 hw/i386/amd_iommu.c   |   3 +-
 hw/i386/intel_iommu.c |   3 +-
 hw/i386/pc.c          | 178 ++++++++++++++----------------------------
 hw/i386/pc_piix.c     |  43 +++++-----
 hw/i386/pc_q35.c      |  35 +++++----
 hw/i386/x86.c         | 139 +++++++++++++++++++++++++++++----
 hw/i386/xen/xen-hvm.c |  23 +++---
 hw/intc/ioapic.c      |   2 +-
 12 files changed, 320 insertions(+), 228 deletions(-)
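
The layering described in the commit message can be sketched in plain C,
outside of QEMU. This is only an illustration under stated assumptions: every
Demo*/demo_* name below is hypothetical, and the upcast is a plain
first-member access rather than QEMU's checked X86_MACHINE() cast. It shows
why a helper that only touches x86-level fields can take the parent type and
so be shared with a non-PC machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the generic machine state. */
typedef struct DemoMachineState {
    uint64_t ram_size;
} DemoMachineState;

/* Hypothetical stand-in for the shared x86 layer: fields that are not
 * PC-specific live here. */
typedef struct DemoX86MachineState {
    DemoMachineState parent;        /* must stay the first member */
    uint64_t below_4g_mem_size;
    uint64_t above_4g_mem_size;
    unsigned apic_id_limit;
} DemoX86MachineState;

/* Hypothetical stand-in for the PC layer: only PC-specific fields remain. */
typedef struct DemoPCMachineState {
    DemoX86MachineState parent_obj; /* must stay the first member */
    uint64_t numa_nodes;
} DemoPCMachineState;

/* The upcast below relies on the parent being at offset 0. */
_Static_assert(offsetof(DemoPCMachineState, parent_obj) == 0,
               "parent must be the first member");

/* Upcast from the PC layer to the shared x86 layer. */
static inline DemoX86MachineState *demo_x86_machine(DemoPCMachineState *pcms)
{
    return &pcms->parent_obj;
}

/* Shared helper: needs x86-level state only, so any x86 machine can use it. */
static uint64_t demo_total_ram(const DemoX86MachineState *x86ms)
{
    return x86ms->below_4g_mem_size + x86ms->above_4g_mem_size;
}

/* PC-side caller: fetch the x86 view once, then use the shared helper. */
static uint64_t demo_pc_total_ram(DemoPCMachineState *pcms)
{
    DemoX86MachineState *x86ms = demo_x86_machine(pcms);

    return demo_total_ram(x86ms);
}
```

The PC-side caller follows the same shape as the hunks below, which add a
local x86ms pointer next to pcms and read the moved fields through it.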

diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 73e2847e87..d2a690d05e 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -8,6 +8,7 @@
 #include "hw/block/flash.h"
 #include "net/net.h"
 #include "hw/i386/ioapic.h"
+#include "hw/i386/x86.h"
 
 #include "qemu/range.h"
 #include "qemu/bitmap.h"
@@ -27,7 +28,7 @@
  */
 struct PCMachineState {
     /*< private >*/
-    MachineState parent_obj;
+    X86MachineState parent_obj;
 
     /* <public> */
 
@@ -36,16 +37,11 @@ struct PCMachineState {
 
     /* Pointers to devices and objects: */
     HotplugHandler *acpi_dev;
-    ISADevice *rtc;
     PCIBus *bus;
     I2CBus *smbus;
-    FWCfgState *fw_cfg;
-    qemu_irq *gsi;
     PFlashCFI01 *flash[2];
-    GMappedFile *initrd_mapped_file;
 
     /* Configuration options: */
-    uint64_t max_ram_below_4g;
     OnOffAuto vmport;
     OnOffAuto smm;
 
@@ -54,27 +50,13 @@ struct PCMachineState {
     bool sata_enabled;
     bool pit_enabled;
 
-    /* RAM information (sizes, addresses, configuration): */
-    ram_addr_t below_4g_mem_size, above_4g_mem_size;
-
-    /* CPU and apic information: */
-    bool apic_xrupt_override;
-    unsigned apic_id_limit;
-    uint16_t boot_cpus;
-    unsigned smp_dies;
-
     /* NUMA information: */
     uint64_t numa_nodes;
     uint64_t *node_mem;
-
-    /* Address space used by IOAPIC device. All IOAPIC interrupts
-     * will be translated to MSI messages in the address space. */
-    AddressSpace *ioapic_as;
 };
 
 #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
 #define PC_MACHINE_DEVMEM_REGION_SIZE "device-memory-region-size"
-#define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
 #define PC_MACHINE_VMPORT           "vmport"
 #define PC_MACHINE_SMM              "smm"
 #define PC_MACHINE_SMBUS            "smbus"
@@ -99,7 +81,7 @@ struct PCMachineState {
  */
 typedef struct PCMachineClass {
     /*< private >*/
-    MachineClass parent_class;
+    X86MachineClass parent_class;
 
     /*< public >*/
 
@@ -141,9 +123,6 @@ typedef struct PCMachineClass {
 
     /* use PVH to load kernels that support this feature */
     bool pvh_enabled;
-
-    /* Enables contiguous-apic-ID mode */
-    bool compat_apic_id_mode;
 } PCMachineClass;
 
 #define TYPE_PC_MACHINE "generic-pc-machine"
diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index 71e2b6985d..a930a7ad9d 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -17,7 +17,61 @@
 #ifndef HW_I386_X86_H
 #define HW_I386_X86_H
 
+#include "qemu-common.h"
+#include "exec/hwaddr.h"
+#include "qemu/notify.h"
+
 #include "hw/boards.h"
+#include "hw/nmi.h"
+
+typedef struct {
+    /*< private >*/
+    MachineClass parent;
+
+    /*< public >*/
+
+    /* Enables contiguous-apic-ID mode */
+    bool compat_apic_id_mode;
+} X86MachineClass;
+
+typedef struct {
+    /*< private >*/
+    MachineState parent;
+
+    /*< public >*/
+
+    /* Pointers to devices and objects: */
+    ISADevice *rtc;
+    FWCfgState *fw_cfg;
+    qemu_irq *gsi;
+    GMappedFile *initrd_mapped_file;
+
+    /* Configuration options: */
+    uint64_t max_ram_below_4g;
+
+    /* RAM information (sizes, addresses, configuration): */
+    ram_addr_t below_4g_mem_size, above_4g_mem_size;
+
+    /* CPU and apic information: */
+    bool apic_xrupt_override;
+    unsigned apic_id_limit;
+    uint16_t boot_cpus;
+    unsigned smp_dies;
+
+    /* Address space used by IOAPIC device. All IOAPIC interrupts
+     * will be translated to MSI messages in the address space. */
+    AddressSpace *ioapic_as;
+} X86MachineState;
+
+#define X86_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
+
+#define TYPE_X86_MACHINE   MACHINE_TYPE_NAME("x86")
+#define X86_MACHINE(obj) \
+    OBJECT_CHECK(X86MachineState, (obj), TYPE_X86_MACHINE)
+#define X86_MACHINE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(X86MachineClass, obj, TYPE_X86_MACHINE)
+#define X86_MACHINE_CLASS(class) \
+    OBJECT_CLASS_CHECK(X86MachineClass, class, TYPE_X86_MACHINE)
 
 uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
                                     unsigned int cpu_index);
@@ -30,6 +84,6 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
 
 void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
 
-void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
+void x86_load_linux(PCMachineState *pcms, FWCfgState *fw_cfg);
 
 #endif
diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
index 6e8293aac9..3ac2045a95 100644
--- a/hw/acpi/cpu_hotplug.c
+++ b/hw/acpi/cpu_hotplug.c
@@ -128,7 +128,7 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
     Aml *one = aml_int(1);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
-    PCMachineState *pcms = PC_MACHINE(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
 
     /*
      * _MAT method - creates an madt apic buffer
@@ -236,9 +236,9 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
     /* The current AML generator can cover the APIC ID range [0..255],
      * inclusive, for VCPU hotplug. */
     QEMU_BUILD_BUG_ON(ACPI_CPU_HOTPLUG_ID_LIMIT > 256);
-    if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
+    if (x86ms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
         error_report("max_cpus is too large. APIC ID of last CPU is %u",
-                     pcms->apic_id_limit - 1);
+                     x86ms->apic_id_limit - 1);
         exit(1);
     }
 
@@ -315,8 +315,8 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
      * ith up to 255 elements. Windows guests up to win2k8 fail when
      * VarPackageOp is used.
      */
-    pkg = pcms->apic_id_limit <= 255 ? aml_package(pcms->apic_id_limit) :
-                                       aml_varpackage(pcms->apic_id_limit);
+    pkg = x86ms->apic_id_limit <= 255 ? aml_package(x86ms->apic_id_limit) :
+                                        aml_varpackage(x86ms->apic_id_limit);
 
     for (i = 0, apic_idx = 0; i < apic_ids->len; i++) {
         int apic_id = apic_ids->cpus[i].arch_id;
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 4e0f9f425a..fc7de46533 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -361,6 +361,7 @@ static void
 build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
 {
     MachineClass *mc = MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms));
     int madt_start = table_data->len;
     AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev);
@@ -390,7 +391,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
     io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
     io_apic->interrupt = cpu_to_le32(0);
 
-    if (pcms->apic_xrupt_override) {
+    if (x86ms->apic_xrupt_override) {
         intsrcovr = acpi_data_push(table_data, sizeof *intsrcovr);
         intsrcovr->type   = ACPI_APIC_XRUPT_OVERRIDE;
         intsrcovr->length = sizeof(*intsrcovr);
@@ -1831,6 +1832,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
     CrsRangeSet crs_range_set;
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     AcpiMcfgInfo mcfg;
     uint32_t nr_mem = machine->ram_slots;
     int root_bus_limit = 0xFF;
@@ -2098,7 +2100,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
          * with half of the 16-bit control register. Hence, the total size
          * of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
          * DMA control register is located at FW_CFG_DMA_IO_BASE + 4 */
-        uint8_t io_size = object_property_get_bool(OBJECT(pcms->fw_cfg),
+        uint8_t io_size = object_property_get_bool(OBJECT(x86ms->fw_cfg),
                                                    "dma_enabled", NULL) ?
                           ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
                           FW_CFG_CTL_SIZE;
@@ -2331,6 +2333,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
     int srat_start, numa_start, slots;
     uint64_t mem_len, mem_base, next_base;
     MachineClass *mc = MACHINE_GET_CLASS(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
     PCMachineState *pcms = PC_MACHINE(machine);
     ram_addr_t hotplugabble_address_space_size =
@@ -2401,16 +2404,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
         }
 
         /* Cut out the ACPI_PCI hole */
-        if (mem_base <= pcms->below_4g_mem_size &&
-            next_base > pcms->below_4g_mem_size) {
-            mem_len -= next_base - pcms->below_4g_mem_size;
+        if (mem_base <= x86ms->below_4g_mem_size &&
+            next_base > x86ms->below_4g_mem_size) {
+            mem_len -= next_base - x86ms->below_4g_mem_size;
             if (mem_len > 0) {
                 numamem = acpi_data_push(table_data, sizeof *numamem);
                 build_srat_memory(numamem, mem_base, mem_len, i - 1,
                                   MEM_AFFINITY_ENABLED);
             }
             mem_base = 1ULL << 32;
-            mem_len = next_base - pcms->below_4g_mem_size;
+            mem_len = next_base - x86ms->below_4g_mem_size;
             next_base = mem_base + mem_len;
         }
 
@@ -2629,6 +2632,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     GArray *table_offsets;
     unsigned facs, dsdt, rsdt, fadt;
     AcpiPmInfo pm;
@@ -2790,7 +2794,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
          */
         int legacy_aml_len =
             pcmc->legacy_acpi_table_size +
-            ACPI_BUILD_LEGACY_CPU_AML_SIZE * pcms->apic_id_limit;
+            ACPI_BUILD_LEGACY_CPU_AML_SIZE * x86ms->apic_id_limit;
         int legacy_table_size =
             ROUND_UP(tables_blob->len - aml_len + legacy_aml_len,
                      ACPI_BUILD_ALIGN_SIZE);
@@ -2880,13 +2884,14 @@ void acpi_setup(void)
 {
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     AcpiBuildTables tables;
     AcpiBuildState *build_state;
     Object *vmgenid_dev;
     TPMIf *tpm;
     static FwCfgTPMConfig tpm_config;
 
-    if (!pcms->fw_cfg) {
+    if (!x86ms->fw_cfg) {
         ACPI_BUILD_DPRINTF("No fw cfg. Bailing out.\n");
         return;
     }
@@ -2917,7 +2922,7 @@ void acpi_setup(void)
         acpi_add_rom_blob(acpi_build_update, build_state,
                           tables.linker->cmd_blob, "etc/table-loader", 0);
 
-    fw_cfg_add_file(pcms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
+    fw_cfg_add_file(x86ms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
                     tables.tcpalog->data, acpi_data_len(tables.tcpalog));
 
     tpm = tpm_find();
@@ -2927,13 +2932,13 @@ void acpi_setup(void)
             .tpm_version = tpm_get_version(tpm),
             .tpmppi_version = TPM_PPI_VERSION_1_30
         };
-        fw_cfg_add_file(pcms->fw_cfg, "etc/tpm/config",
+        fw_cfg_add_file(x86ms->fw_cfg, "etc/tpm/config",
                         &tpm_config, sizeof tpm_config);
     }
 
     vmgenid_dev = find_vmgenid_dev();
     if (vmgenid_dev) {
-        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), pcms->fw_cfg,
+        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), x86ms->fw_cfg,
                            tables.vmgenid);
     }
 
@@ -2946,7 +2951,7 @@ void acpi_setup(void)
         uint32_t rsdp_size = acpi_data_len(tables.rsdp);
 
         build_state->rsdp = g_memdup(tables.rsdp->data, rsdp_size);
-        fw_cfg_add_file_callback(pcms->fw_cfg, ACPI_BUILD_RSDP_FILE,
+        fw_cfg_add_file_callback(x86ms->fw_cfg, ACPI_BUILD_RSDP_FILE,
                                  acpi_build_update, NULL, build_state,
                                  build_state->rsdp, rsdp_size, true);
         build_state->rsdp_mr = NULL;
diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
index 08884523e2..7b7e4a0bf7 100644
--- a/hw/i386/amd_iommu.c
+++ b/hw/i386/amd_iommu.c
@@ -1537,6 +1537,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
     MachineState *ms = MACHINE(qdev_get_machine());
     PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
     PCIBus *bus = pcms->bus;
 
     s->iotlb = g_hash_table_new_full(amdvi_uint64_hash,
@@ -1565,7 +1566,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
     }
 
     /* Pseudo address space under root PCI bus. */
-    pcms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
+    x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
 
     /* set up MMIO */
     memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
index f1de8fdb75..9dc20c160e 100644
--- a/hw/i386/intel_iommu.c
+++ b/hw/i386/intel_iommu.c
@@ -3731,6 +3731,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
 {
     MachineState *ms = MACHINE(qdev_get_machine());
     PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
     PCIBus *bus = pcms->bus;
     IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
     X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
@@ -3771,7 +3772,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
     sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
     pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
     /* Pseudo address space under root PCI bus. */
-    pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
+    x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
     qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
 }
 
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 094db79fb0..0dc1420a1f 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -78,7 +78,6 @@
 #include "qapi/qapi-visit-common.h"
 #include "qapi/visitor.h"
 #include "hw/core/cpu.h"
-#include "hw/nmi.h"
 #include "hw/usb.h"
 #include "hw/i386/intel_iommu.h"
 #include "hw/net/ne2000-isa.h"
@@ -679,17 +678,18 @@ void pc_cmos_init(PCMachineState *pcms,
 {
     int val;
     static pc_cmos_init_late_arg arg;
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     /* various important CMOS locations needed by PC/Bochs bios */
 
     /* memory size */
     /* base memory (first MiB) */
-    val = MIN(pcms->below_4g_mem_size / KiB, 640);
+    val = MIN(x86ms->below_4g_mem_size / KiB, 640);
     rtc_set_memory(s, 0x15, val);
     rtc_set_memory(s, 0x16, val >> 8);
     /* extended memory (next 64MiB) */
-    if (pcms->below_4g_mem_size > 1 * MiB) {
-        val = (pcms->below_4g_mem_size - 1 * MiB) / KiB;
+    if (x86ms->below_4g_mem_size > 1 * MiB) {
+        val = (x86ms->below_4g_mem_size - 1 * MiB) / KiB;
     } else {
         val = 0;
     }
@@ -700,8 +700,8 @@ void pc_cmos_init(PCMachineState *pcms,
     rtc_set_memory(s, 0x30, val);
     rtc_set_memory(s, 0x31, val >> 8);
     /* memory between 16MiB and 4GiB */
-    if (pcms->below_4g_mem_size > 16 * MiB) {
-        val = (pcms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
+    if (x86ms->below_4g_mem_size > 16 * MiB) {
+        val = (x86ms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
     } else {
         val = 0;
     }
@@ -710,14 +710,14 @@ void pc_cmos_init(PCMachineState *pcms,
     rtc_set_memory(s, 0x34, val);
     rtc_set_memory(s, 0x35, val >> 8);
     /* memory above 4GiB */
-    val = pcms->above_4g_mem_size / 65536;
+    val = x86ms->above_4g_mem_size / 65536;
     rtc_set_memory(s, 0x5b, val);
     rtc_set_memory(s, 0x5c, val >> 8);
     rtc_set_memory(s, 0x5d, val >> 16);
 
     object_property_add_link(OBJECT(pcms), "rtc_state",
                              TYPE_ISA_DEVICE,
-                             (Object **)&pcms->rtc,
+                             (Object **)&x86ms->rtc,
                              object_property_allow_set_link,
                              OBJ_PROP_LINK_STRONG, &error_abort);
     object_property_set_link(OBJECT(pcms), OBJECT(s),
@@ -906,7 +906,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
  */
 void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 {
-    PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
 
     if (opts) {
         unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
@@ -970,7 +970,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
         ms->smp.cpus = cpus;
         ms->smp.cores = cores;
         ms->smp.threads = threads;
-        pcms->smp_dies = dies;
+        x86ms->smp_dies = dies;
     }
 
     if (ms->smp.cpus > 1) {
@@ -1023,10 +1023,11 @@ void pc_machine_done(Notifier *notifier, void *data)
 {
     PCMachineState *pcms = container_of(notifier,
                                         PCMachineState, machine_done);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     PCIBus *bus = pcms->bus;
 
     /* set the number of CPUs */
-    rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
+    rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
 
     if (bus) {
         int extra_hosts = 0;
@@ -1037,23 +1038,23 @@ void pc_machine_done(Notifier *notifier, void *data)
                 extra_hosts++;
             }
         }
-        if (extra_hosts && pcms->fw_cfg) {
+        if (extra_hosts && x86ms->fw_cfg) {
             uint64_t *val = g_malloc(sizeof(*val));
             *val = cpu_to_le64(extra_hosts);
-            fw_cfg_add_file(pcms->fw_cfg,
+            fw_cfg_add_file(x86ms->fw_cfg,
                     "etc/extra-pci-roots", val, sizeof(*val));
         }
     }
 
     acpi_setup();
-    if (pcms->fw_cfg) {
-        fw_cfg_build_smbios(MACHINE(pcms), pcms->fw_cfg);
-        fw_cfg_build_feature_control(MACHINE(pcms), pcms->fw_cfg);
+    if (x86ms->fw_cfg) {
+        fw_cfg_build_smbios(MACHINE(pcms), x86ms->fw_cfg);
+        fw_cfg_build_feature_control(MACHINE(pcms), x86ms->fw_cfg);
         /* update FW_CFG_NB_CPUS to account for -device added CPUs */
-        fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
+        fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
     }
 
-    if (pcms->apic_id_limit > 255 && !xen_enabled()) {
+    if (x86ms->apic_id_limit > 255 && !xen_enabled()) {
         IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default());
 
         if (!iommu || !x86_iommu_ir_supported(X86_IOMMU_DEVICE(iommu)) ||
@@ -1071,8 +1072,9 @@ void pc_guest_info_init(PCMachineState *pcms)
 {
     int i;
     MachineState *ms = MACHINE(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
-    pcms->apic_xrupt_override = kvm_allows_irq0_override();
+    x86ms->apic_xrupt_override = kvm_allows_irq0_override();
     pcms->numa_nodes = ms->numa_state->num_nodes;
     pcms->node_mem = g_malloc0(pcms->numa_nodes *
                                     sizeof *pcms->node_mem);
@@ -1097,11 +1099,12 @@ void xen_load_linux(PCMachineState *pcms)
 {
     int i;
     FWCfgState *fw_cfg;
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     assert(MACHINE(pcms)->kernel_filename != NULL);
 
     fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
-    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
+    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
     rom_set_fw(fw_cfg);
 
     x86_load_linux(pcms, fw_cfg);
@@ -1112,7 +1115,7 @@ void xen_load_linux(PCMachineState *pcms)
                !strcmp(option_rom[i].name, "multiboot.bin"));
         rom_add_option(option_rom[i].name, option_rom[i].bootindex);
     }
-    pcms->fw_cfg = fw_cfg;
+    x86ms->fw_cfg = fw_cfg;
 }
 
 void pc_memory_init(PCMachineState *pcms,
@@ -1127,9 +1130,10 @@ void pc_memory_init(PCMachineState *pcms,
     MachineState *machine = MACHINE(pcms);
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
-    assert(machine->ram_size == pcms->below_4g_mem_size +
-                                pcms->above_4g_mem_size);
+    assert(machine->ram_size == x86ms->below_4g_mem_size +
+                                x86ms->above_4g_mem_size);
 
     linux_boot = (machine->kernel_filename != NULL);
 
@@ -1143,17 +1147,17 @@ void pc_memory_init(PCMachineState *pcms,
     *ram_memory = ram;
     ram_below_4g = g_malloc(sizeof(*ram_below_4g));
     memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
-                             0, pcms->below_4g_mem_size);
+                             0, x86ms->below_4g_mem_size);
     memory_region_add_subregion(system_memory, 0, ram_below_4g);
-    e820_add_entry(0, pcms->below_4g_mem_size, E820_RAM);
-    if (pcms->above_4g_mem_size > 0) {
+    e820_add_entry(0, x86ms->below_4g_mem_size, E820_RAM);
+    if (x86ms->above_4g_mem_size > 0) {
         ram_above_4g = g_malloc(sizeof(*ram_above_4g));
         memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
-                                 pcms->below_4g_mem_size,
-                                 pcms->above_4g_mem_size);
+                                 x86ms->below_4g_mem_size,
+                                 x86ms->above_4g_mem_size);
         memory_region_add_subregion(system_memory, 0x100000000ULL,
                                     ram_above_4g);
-        e820_add_entry(0x100000000ULL, pcms->above_4g_mem_size, E820_RAM);
+        e820_add_entry(0x100000000ULL, x86ms->above_4g_mem_size, E820_RAM);
     }
 
     if (!pcmc->has_reserved_memory &&
@@ -1187,7 +1191,7 @@ void pc_memory_init(PCMachineState *pcms,
         }
 
         machine->device_memory->base =
-            ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1 * GiB);
+            ROUND_UP(0x100000000ULL + x86ms->above_4g_mem_size, 1 * GiB);
 
         if (pcmc->enforce_aligned_dimm) {
             /* size device region assuming 1G page max alignment per slot */
@@ -1222,7 +1226,7 @@ void pc_memory_init(PCMachineState *pcms,
                                         1);
 
     fw_cfg = fw_cfg_arch_create(machine,
-                                pcms->boot_cpus, pcms->apic_id_limit);
+                                x86ms->boot_cpus, x86ms->apic_id_limit);
 
     rom_set_fw(fw_cfg);
 
@@ -1245,10 +1249,10 @@ void pc_memory_init(PCMachineState *pcms,
     for (i = 0; i < nb_option_roms; i++) {
         rom_add_option(option_rom[i].name, option_rom[i].bootindex);
     }
-    pcms->fw_cfg = fw_cfg;
+    x86ms->fw_cfg = fw_cfg;
 
     /* Init default IOAPIC address space */
-    pcms->ioapic_as = &address_space_memory;
+    x86ms->ioapic_as = &address_space_memory;
 }
 
 /*
@@ -1260,6 +1264,7 @@ uint64_t pc_pci_hole64_start(void)
     PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
     MachineState *ms = MACHINE(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     uint64_t hole64_start = 0;
 
     if (pcmc->has_reserved_memory && ms->device_memory->base) {
@@ -1268,7 +1273,7 @@ uint64_t pc_pci_hole64_start(void)
             hole64_start += memory_region_size(&ms->device_memory->mr);
         }
     } else {
-        hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
+        hole64_start = 0x100000000ULL + x86ms->above_4g_mem_size;
     }
 
     return ROUND_UP(hole64_start, 1 * GiB);
@@ -1607,6 +1612,7 @@ static void pc_cpu_plug(HotplugHandler *hotplug_dev,
     Error *local_err = NULL;
     X86CPU *cpu = X86_CPU(dev);
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     if (pcms->acpi_dev) {
         hotplug_handler_plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);
@@ -1616,12 +1622,12 @@ static void pc_cpu_plug(HotplugHandler *hotplug_dev,
     }
 
     /* increment the number of CPUs */
-    pcms->boot_cpus++;
-    if (pcms->rtc) {
-        rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
+    x86ms->boot_cpus++;
+    if (x86ms->rtc) {
+        rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
     }
-    if (pcms->fw_cfg) {
-        fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
+    if (x86ms->fw_cfg) {
+        fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
     }
 
     found_cpu = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, NULL);
@@ -1667,6 +1673,7 @@ static void pc_cpu_unplug_cb(HotplugHandler *hotplug_dev,
     Error *local_err = NULL;
     X86CPU *cpu = X86_CPU(dev);
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     hotplug_handler_unplug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);
     if (local_err) {
@@ -1678,10 +1685,10 @@ static void pc_cpu_unplug_cb(HotplugHandler *hotplug_dev,
     object_property_set_bool(OBJECT(dev), false, "realized", NULL);
 
     /* decrement the number of CPUs */
-    pcms->boot_cpus--;
+    x86ms->boot_cpus--;
     /* Update the number of CPUs in CMOS */
-    rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
-    fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
+    rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
+    fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
  out:
     error_propagate(errp, local_err);
 }
@@ -1697,6 +1704,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     CPUX86State *env = &cpu->env;
     MachineState *ms = MACHINE(hotplug_dev);
     PCMachineState *pcms = PC_MACHINE(hotplug_dev);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     unsigned int smp_cores = ms->smp.cores;
     unsigned int smp_threads = ms->smp.threads;
 
@@ -1706,7 +1714,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         return;
     }
 
-    env->nr_dies = pcms->smp_dies;
+    env->nr_dies = x86ms->smp_dies;
 
     /*
      * If APIC ID is not set,
@@ -1714,13 +1722,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
      */
     if (cpu->apic_id == UNASSIGNED_APIC_ID) {
         int max_socket = (ms->smp.max_cpus - 1) /
-                                smp_threads / smp_cores / pcms->smp_dies;
+                                smp_threads / smp_cores / x86ms->smp_dies;
 
         /*
          * die-id was optional in QEMU 4.0 and older, so keep it optional
          * if there's only one die per socket.
          */
-        if (cpu->die_id < 0 && pcms->smp_dies == 1) {
+        if (cpu->die_id < 0 && x86ms->smp_dies == 1) {
             cpu->die_id = 0;
         }
 
@@ -1735,9 +1743,9 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         if (cpu->die_id < 0) {
             error_setg(errp, "CPU die-id is not set");
             return;
-        } else if (cpu->die_id > pcms->smp_dies - 1) {
+        } else if (cpu->die_id > x86ms->smp_dies - 1) {
             error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
-                       cpu->die_id, pcms->smp_dies - 1);
+                       cpu->die_id, x86ms->smp_dies - 1);
             return;
         }
         if (cpu->core_id < 0) {
@@ -1761,7 +1769,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
         topo.die_id = cpu->die_id;
         topo.core_id = cpu->core_id;
         topo.smt_id = cpu->thread_id;
-        cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
+        cpu->apic_id = apicid_from_topo_ids(x86ms->smp_dies, smp_cores,
                                             smp_threads, &topo);
     }
 
@@ -1769,7 +1777,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     if (!cpu_slot) {
         MachineState *ms = MACHINE(pcms);
 
-        x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
+        x86_topo_ids_from_apicid(cpu->apic_id, x86ms->smp_dies,
                                  smp_cores, smp_threads, &topo);
         error_setg(errp,
             "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
@@ -1791,7 +1799,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
     /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
      * once -smp refactoring is complete and there will be CPU private
      * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
-    x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
+    x86_topo_ids_from_apicid(cpu->apic_id, x86ms->smp_dies,
                              smp_cores, smp_threads, &topo);
     if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
         error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
@@ -1973,45 +1981,6 @@ pc_machine_get_device_memory_region_size(Object *obj, Visitor *v,
     visit_type_int(v, name, &value, errp);
 }
 
-static void pc_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
-                                            const char *name, void *opaque,
-                                            Error **errp)
-{
-    PCMachineState *pcms = PC_MACHINE(obj);
-    uint64_t value = pcms->max_ram_below_4g;
-
-    visit_type_size(v, name, &value, errp);
-}
-
-static void pc_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
-                                            const char *name, void *opaque,
-                                            Error **errp)
-{
-    PCMachineState *pcms = PC_MACHINE(obj);
-    Error *error = NULL;
-    uint64_t value;
-
-    visit_type_size(v, name, &value, &error);
-    if (error) {
-        error_propagate(errp, error);
-        return;
-    }
-    if (value > 4 * GiB) {
-        error_setg(&error,
-                   "Machine option 'max-ram-below-4g=%"PRIu64
-                   "' expects size less than or equal to 4G", value);
-        error_propagate(errp, error);
-        return;
-    }
-
-    if (value < 1 * MiB) {
-        warn_report("Only %" PRIu64 " bytes of RAM below the 4GiB boundary,"
-                    "BIOS may not work with less than 1MiB", value);
-    }
-
-    pcms->max_ram_below_4g = value;
-}
-
 static void pc_machine_get_vmport(Object *obj, Visitor *v, const char *name,
                                   void *opaque, Error **errp)
 {
@@ -2117,7 +2086,6 @@ static void pc_machine_initfn(Object *obj)
 {
     PCMachineState *pcms = PC_MACHINE(obj);
 
-    pcms->max_ram_below_4g = 0; /* use default */
     pcms->smm = ON_OFF_AUTO_AUTO;
 #ifdef CONFIG_VMPORT
     pcms->vmport = ON_OFF_AUTO_AUTO;
@@ -2129,7 +2097,6 @@ static void pc_machine_initfn(Object *obj)
     pcms->smbus_enabled = true;
     pcms->sata_enabled = true;
     pcms->pit_enabled = true;
-    pcms->smp_dies = 1;
 
     pc_system_flash_create(pcms);
 }
@@ -2160,23 +2127,6 @@ static void pc_machine_wakeup(MachineState *machine)
     cpu_synchronize_all_post_reset();
 }
 
-static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
-{
-    /* cpu index isn't used */
-    CPUState *cs;
-
-    CPU_FOREACH(cs) {
-        X86CPU *cpu = X86_CPU(cs);
-
-        if (!cpu->apic_state) {
-            cpu_interrupt(cs, CPU_INTERRUPT_NMI);
-        } else {
-            apic_deliver_nmi(cpu->apic_state);
-        }
-    }
-}
-
-
 static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
 {
     X86IOMMUState *iommu = x86_iommu_get_default();
@@ -2201,7 +2151,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     MachineClass *mc = MACHINE_CLASS(oc);
     PCMachineClass *pcmc = PC_MACHINE_CLASS(oc);
     HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
-    NMIClass *nc = NMI_CLASS(oc);
 
     pcmc->pci_enabled = true;
     pcmc->has_acpi_build = true;
@@ -2237,7 +2186,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
     hc->plug = pc_machine_device_plug_cb;
     hc->unplug_request = pc_machine_device_unplug_request_cb;
     hc->unplug = pc_machine_device_unplug_cb;
-    nc->nmi_monitor_handler = x86_nmi;
     mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
     mc->nvdimm_supported = true;
     mc->numa_mem_supported = true;
@@ -2246,13 +2194,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
         pc_machine_get_device_memory_region_size, NULL,
         NULL, NULL, &error_abort);
 
-    object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
-        pc_machine_get_max_ram_below_4g, pc_machine_set_max_ram_below_4g,
-        NULL, NULL, &error_abort);
-
-    object_class_property_set_description(oc, PC_MACHINE_MAX_RAM_BELOW_4G,
-        "Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
-
     object_class_property_add(oc, PC_MACHINE_SMM, "OnOffAuto",
         pc_machine_get_smm, pc_machine_set_smm,
         NULL, NULL, &error_abort);
@@ -2277,7 +2218,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
 
 static const TypeInfo pc_machine_info = {
     .name = TYPE_PC_MACHINE,
-    .parent = TYPE_MACHINE,
+    .parent = TYPE_X86_MACHINE,
     .abstract = true,
     .instance_size = sizeof(PCMachineState),
     .instance_init = pc_machine_initfn,
@@ -2285,7 +2226,6 @@ static const TypeInfo pc_machine_info = {
     .class_init = pc_machine_class_init,
     .interfaces = (InterfaceInfo[]) {
          { TYPE_HOTPLUG_HANDLER },
-         { TYPE_NMI },
          { }
     },
 };
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 1396451abf..0afa8fe6ea 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -74,6 +74,7 @@ static void pc_init1(MachineState *machine,
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     MemoryRegion *system_memory = get_system_memory();
     MemoryRegion *system_io = get_system_io();
     int i;
@@ -126,11 +127,11 @@ static void pc_init1(MachineState *machine,
     if (xen_enabled()) {
         xen_hvm_init(pcms, &ram_memory);
     } else {
-        if (!pcms->max_ram_below_4g) {
-            pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
+        if (!x86ms->max_ram_below_4g) {
+            x86ms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
         }
-        lowmem = pcms->max_ram_below_4g;
-        if (machine->ram_size >= pcms->max_ram_below_4g) {
+        lowmem = x86ms->max_ram_below_4g;
+        if (machine->ram_size >= x86ms->max_ram_below_4g) {
             if (pcmc->gigabyte_align) {
                 if (lowmem > 0xc0000000) {
                     lowmem = 0xc0000000;
@@ -139,17 +140,17 @@ static void pc_init1(MachineState *machine,
                     warn_report("Large machine and max_ram_below_4g "
                                 "(%" PRIu64 ") not a multiple of 1G; "
                                 "possible bad performance.",
-                                pcms->max_ram_below_4g);
+                                x86ms->max_ram_below_4g);
                 }
             }
         }
 
         if (machine->ram_size >= lowmem) {
-            pcms->above_4g_mem_size = machine->ram_size - lowmem;
-            pcms->below_4g_mem_size = lowmem;
+            x86ms->above_4g_mem_size = machine->ram_size - lowmem;
+            x86ms->below_4g_mem_size = lowmem;
         } else {
-            pcms->above_4g_mem_size = 0;
-            pcms->below_4g_mem_size = machine->ram_size;
+            x86ms->above_4g_mem_size = 0;
+            x86ms->below_4g_mem_size = machine->ram_size;
         }
     }
 
@@ -191,19 +192,19 @@ static void pc_init1(MachineState *machine,
     gsi_state = g_malloc0(sizeof(*gsi_state));
     if (kvm_ioapic_in_kernel()) {
         kvm_pc_setup_irq_routing(pcmc->pci_enabled);
-        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
+        x86ms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
                                        GSI_NUM_PINS);
     } else {
-        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
+        x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
     }
 
     if (pcmc->pci_enabled) {
         pci_bus = i440fx_init(host_type,
                               pci_type,
-                              &i440fx_state, &piix3_devfn, &isa_bus, pcms->gsi,
+                              &i440fx_state, &piix3_devfn, &isa_bus, x86ms->gsi,
                               system_memory, system_io, machine->ram_size,
-                              pcms->below_4g_mem_size,
-                              pcms->above_4g_mem_size,
+                              x86ms->below_4g_mem_size,
+                              x86ms->above_4g_mem_size,
                               pci_memory, ram_memory);
         pcms->bus = pci_bus;
     } else {
@@ -213,7 +214,7 @@ static void pc_init1(MachineState *machine,
                               &error_abort);
         no_hpet = 1;
     }
-    isa_bus_irqs(isa_bus, pcms->gsi);
+    isa_bus_irqs(isa_bus, x86ms->gsi);
 
     if (kvm_pic_in_kernel()) {
         i8259 = kvm_i8259_init(isa_bus);
@@ -231,7 +232,7 @@ static void pc_init1(MachineState *machine,
         ioapic_init_gsi(gsi_state, "i440fx");
     }
 
-    pc_register_ferr_irq(pcms->gsi[13]);
+    pc_register_ferr_irq(x86ms->gsi[13]);
 
     pc_vga_init(isa_bus, pcmc->pci_enabled ? pci_bus : NULL);
 
@@ -241,7 +242,7 @@ static void pc_init1(MachineState *machine,
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, true,
+    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, true,
                          (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
                          0x4);
 
@@ -288,7 +289,7 @@ else {
         smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
         /* TODO: Populate SPD eeprom data.  */
         pcms->smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
-                                    pcms->gsi[9], smi_irq,
+                                    x86ms->gsi[9], smi_irq,
                                     pc_machine_is_smm_enabled(pcms),
                                     &piix4_pm);
         smbus_eeprom_init(pcms->smbus, 8, NULL, 0);
@@ -304,7 +305,7 @@ else {
 
     if (machine->nvdimms_state->is_enabled) {
         nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
-                               pcms->fw_cfg, OBJECT(pcms));
+                               x86ms->fw_cfg, OBJECT(pcms));
     }
 }
 
@@ -729,7 +730,7 @@ DEFINE_I440FX_MACHINE(v1_4, "pc-i440fx-1.4", pc_compat_1_4_fn,
 
 static void pc_i440fx_1_3_machine_options(MachineClass *m)
 {
-    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
+    X86MachineClass *x86mc = X86_MACHINE_CLASS(m);
     static GlobalProperty compat[] = {
         PC_CPU_MODEL_IDS("1.3.0")
         { "usb-tablet", "usb_version", "1" },
@@ -740,7 +741,7 @@ static void pc_i440fx_1_3_machine_options(MachineClass *m)
 
     pc_i440fx_1_4_machine_options(m);
     m->hw_version = "1.3.0";
-    pcmc->compat_apic_id_mode = true;
+    x86mc->compat_apic_id_mode = true;
     compat_props_add(m->compat_props, compat, G_N_ELEMENTS(compat));
 }
 
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8920bd8978..8e7beb9415 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -116,6 +116,7 @@ static void pc_q35_init(MachineState *machine)
 {
     PCMachineState *pcms = PC_MACHINE(machine);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(machine);
     Q35PCIHost *q35_host;
     PCIHostState *phb;
     PCIBus *host_bus;
@@ -153,27 +154,27 @@ static void pc_q35_init(MachineState *machine)
     /* Handle the machine opt max-ram-below-4g.  It is basically doing
      * min(qemu limit, user limit).
      */
-    if (!pcms->max_ram_below_4g) {
-        pcms->max_ram_below_4g = 1ULL << 32; /* default: 4G */;
+    if (!x86ms->max_ram_below_4g) {
+        x86ms->max_ram_below_4g = 1ULL << 32; /* default: 4G */
     }
-    if (lowmem > pcms->max_ram_below_4g) {
-        lowmem = pcms->max_ram_below_4g;
+    if (lowmem > x86ms->max_ram_below_4g) {
+        lowmem = x86ms->max_ram_below_4g;
         if (machine->ram_size - lowmem > lowmem &&
             lowmem & (1 * GiB - 1)) {
             warn_report("There is possibly poor performance as the ram size "
                         " (0x%" PRIx64 ") is more then twice the size of"
                         " max-ram-below-4g (%"PRIu64") and"
                         " max-ram-below-4g is not a multiple of 1G.",
-                        (uint64_t)machine->ram_size, pcms->max_ram_below_4g);
+                        (uint64_t)machine->ram_size, x86ms->max_ram_below_4g);
         }
     }
 
     if (machine->ram_size >= lowmem) {
-        pcms->above_4g_mem_size = machine->ram_size - lowmem;
-        pcms->below_4g_mem_size = lowmem;
+        x86ms->above_4g_mem_size = machine->ram_size - lowmem;
+        x86ms->below_4g_mem_size = lowmem;
     } else {
-        pcms->above_4g_mem_size = 0;
-        pcms->below_4g_mem_size = machine->ram_size;
+        x86ms->above_4g_mem_size = 0;
+        x86ms->below_4g_mem_size = machine->ram_size;
     }
 
     if (xen_enabled()) {
@@ -214,10 +215,10 @@ static void pc_q35_init(MachineState *machine)
     gsi_state = g_malloc0(sizeof(*gsi_state));
     if (kvm_ioapic_in_kernel()) {
         kvm_pc_setup_irq_routing(pcmc->pci_enabled);
-        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
+        x86ms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
                                        GSI_NUM_PINS);
     } else {
-        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
+        x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
     }
 
     /* create pci host bus */
@@ -232,9 +233,9 @@ static void pc_q35_init(MachineState *machine)
                              MCH_HOST_PROP_SYSTEM_MEM, NULL);
     object_property_set_link(OBJECT(q35_host), OBJECT(system_io),
                              MCH_HOST_PROP_IO_MEM, NULL);
-    object_property_set_int(OBJECT(q35_host), pcms->below_4g_mem_size,
+    object_property_set_int(OBJECT(q35_host), x86ms->below_4g_mem_size,
                             PCI_HOST_BELOW_4G_MEM_SIZE, NULL);
-    object_property_set_int(OBJECT(q35_host), pcms->above_4g_mem_size,
+    object_property_set_int(OBJECT(q35_host), x86ms->above_4g_mem_size,
                             PCI_HOST_ABOVE_4G_MEM_SIZE, NULL);
     /* pci */
     qdev_init_nofail(DEVICE(q35_host));
@@ -256,7 +257,7 @@ static void pc_q35_init(MachineState *machine)
     ich9_lpc = ICH9_LPC_DEVICE(lpc);
     lpc_dev = DEVICE(lpc);
     for (i = 0; i < GSI_NUM_PINS; i++) {
-        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, pcms->gsi[i]);
+        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, x86ms->gsi[i]);
     }
     pci_bus_irqs(host_bus, ich9_lpc_set_irq, ich9_lpc_map_irq, ich9_lpc,
                  ICH9_LPC_NB_PIRQS);
@@ -280,7 +281,7 @@ static void pc_q35_init(MachineState *machine)
         ioapic_init_gsi(gsi_state, "q35");
     }
 
-    pc_register_ferr_irq(pcms->gsi[13]);
+    pc_register_ferr_irq(x86ms->gsi[13]);
 
     assert(pcms->vmport != ON_OFF_AUTO__MAX);
     if (pcms->vmport == ON_OFF_AUTO_AUTO) {
@@ -288,7 +289,7 @@ static void pc_q35_init(MachineState *machine)
     }
 
     /* init basic PC hardware */
-    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, !mc->no_floppy,
+    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, !mc->no_floppy,
                          (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
                          0xff0104);
 
@@ -331,7 +332,7 @@ static void pc_q35_init(MachineState *machine)
 
     if (machine->nvdimms_state->is_enabled) {
         nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
-                               pcms->fw_cfg, OBJECT(pcms));
+                               x86ms->fw_cfg, OBJECT(pcms));
     }
 }
 
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 6807bb8a22..4a8e254d69 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -64,13 +64,14 @@ uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
                                     unsigned int cpu_index)
 {
     MachineState *ms = MACHINE(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
+    X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
     uint32_t correct_id;
     static bool warned;
 
-    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
+    correct_id = x86_apicid_from_cpu_idx(x86ms->smp_dies, ms->smp.cores,
                                          ms->smp.threads, cpu_index);
-    if (pcmc->compat_apic_id_mode) {
+    if (x86mc->compat_apic_id_mode) {
         if (cpu_index != correct_id && !warned && !qtest_enabled()) {
             error_report("APIC IDs set in compatibility mode, "
                          "CPU topology won't match the configuration");
@@ -87,11 +88,12 @@ void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
     Object *cpu = NULL;
     Error *local_err = NULL;
     CPUX86State *env = NULL;
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     cpu = object_new(MACHINE(pcms)->cpu_type);
 
     env = &X86_CPU(cpu)->env;
-    env->nr_dies = pcms->smp_dies;
+    env->nr_dies = x86ms->smp_dies;
 
     object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
     object_property_set_bool(cpu, true, "realized", &local_err);
@@ -107,6 +109,7 @@ void x86_cpus_init(PCMachineState *pcms)
     MachineState *ms = MACHINE(pcms);
     MachineClass *mc = MACHINE_GET_CLASS(pcms);
     PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
 
     x86_cpu_set_default_version(pcmc->default_cpu_version);
 
@@ -117,8 +120,8 @@ void x86_cpus_init(PCMachineState *pcms)
      *
      * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
      */
-    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
-                                                     ms->smp.max_cpus - 1) + 1;
+    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
+                                                      ms->smp.max_cpus - 1) + 1;
     possible_cpus = mc->possible_cpu_arch_ids(ms);
     for (i = 0; i < ms->smp.cpus; i++) {
         x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
@@ -138,11 +141,11 @@ x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
 int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
 {
    X86CPUTopoInfo topo;
-   PCMachineState *pcms = PC_MACHINE(ms);
+   X86MachineState *x86ms = X86_MACHINE(ms);
 
    assert(idx < ms->possible_cpus->len);
    x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
-                            pcms->smp_dies, ms->smp.cores,
+                            x86ms->smp_dies, ms->smp.cores,
                             ms->smp.threads, &topo);
    return topo.pkg_id % ms->numa_state->num_nodes;
 }
@@ -150,6 +153,7 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
 const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
 {
     PCMachineState *pcms = PC_MACHINE(ms);
+    X86MachineState *x86ms = X86_MACHINE(ms);
     int i;
     unsigned int max_cpus = ms->smp.max_cpus;
 
@@ -172,11 +176,11 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
         ms->possible_cpus->cpus[i].vcpus_count = 1;
         ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
         x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
-                                 pcms->smp_dies, ms->smp.cores,
+                                 x86ms->smp_dies, ms->smp.cores,
                                  ms->smp.threads, &topo);
         ms->possible_cpus->cpus[i].props.has_socket_id = true;
         ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
-        if (pcms->smp_dies > 1) {
+        if (x86ms->smp_dies > 1) {
             ms->possible_cpus->cpus[i].props.has_die_id = true;
             ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
         }
@@ -188,6 +192,22 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
     return ms->possible_cpus;
 }
 
+static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
+{
+    /* cpu index isn't used */
+    CPUState *cs;
+
+    CPU_FOREACH(cs) {
+        X86CPU *cpu = X86_CPU(cs);
+
+        if (!cpu->apic_state) {
+            cpu_interrupt(cs, CPU_INTERRUPT_NMI);
+        } else {
+            apic_deliver_nmi(cpu->apic_state);
+        }
+    }
+}
+
 static long get_file_size(FILE *f)
 {
     long where, size;
@@ -324,6 +344,7 @@ void x86_load_linux(PCMachineState *pcms,
     char *vmode;
     MachineState *machine = MACHINE(pcms);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     struct setup_data *setup_data;
     const char *kernel_filename = machine->kernel_filename;
     const char *initrd_filename = machine->initrd_filename;
@@ -392,11 +413,11 @@ void x86_load_linux(PCMachineState *pcms,
                             initrd_filename, gerr->message);
                     exit(1);
                 }
-                pcms->initrd_mapped_file = mapped_file;
+                x86ms->initrd_mapped_file = mapped_file;
 
                 initrd_data = g_mapped_file_get_contents(mapped_file);
                 initrd_size = g_mapped_file_get_length(mapped_file);
-                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+                initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
                 if (initrd_size >= initrd_max) {
                     fprintf(stderr, "qemu: initrd is too large, cannot support."
                             "(max: %"PRIu32", need %"PRId64")\n",
@@ -474,8 +495,8 @@ void x86_load_linux(PCMachineState *pcms,
         initrd_max = 0x37ffffff;
     }
 
-    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
-        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+    if (initrd_max >= x86ms->below_4g_mem_size - pcmc->acpi_data_size) {
+        initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
     }
 
     fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
@@ -538,7 +559,7 @@ void x86_load_linux(PCMachineState *pcms,
                     initrd_filename, gerr->message);
             exit(1);
         }
-        pcms->initrd_mapped_file = mapped_file;
+        x86ms->initrd_mapped_file = mapped_file;
 
         initrd_data = g_mapped_file_get_contents(mapped_file);
         initrd_size = g_mapped_file_get_length(mapped_file);
@@ -682,3 +703,91 @@ void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
                                 (uint32_t)(-bios_size),
                                 bios);
 }
+
+static void x86_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
+                                             const char *name, void *opaque,
+                                             Error **errp)
+{
+    X86MachineState *x86ms = X86_MACHINE(obj);
+    uint64_t value = x86ms->max_ram_below_4g;
+
+    visit_type_size(v, name, &value, errp);
+}
+
+static void x86_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
+                                             const char *name, void *opaque,
+                                             Error **errp)
+{
+    X86MachineState *x86ms = X86_MACHINE(obj);
+    Error *error = NULL;
+    uint64_t value;
+
+    visit_type_size(v, name, &value, &error);
+    if (error) {
+        error_propagate(errp, error);
+        return;
+    }
+    if (value > 4 * GiB) {
+        error_setg(&error,
+                   "Machine option 'max-ram-below-4g=%"PRIu64
+                   "' expects size less than or equal to 4G", value);
+        error_propagate(errp, error);
+        return;
+    }
+
+    if (value < 1 * MiB) {
+        warn_report("Only %" PRIu64 " bytes of RAM below the 4GiB boundary,"
+                    " BIOS may not work with less than 1MiB", value);
+    }
+
+    x86ms->max_ram_below_4g = value;
+}
+
+static void x86_machine_initfn(Object *obj)
+{
+    X86MachineState *x86ms = X86_MACHINE(obj);
+
+    x86ms->max_ram_below_4g = 0; /* use default */
+    x86ms->smp_dies = 1;
+}
+
+static void x86_machine_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+    X86MachineClass *x86mc = X86_MACHINE_CLASS(oc);
+    NMIClass *nc = NMI_CLASS(oc);
+
+    mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
+    mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
+    mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
+    x86mc->compat_apic_id_mode = false;
+    nc->nmi_monitor_handler = x86_nmi;
+
+    object_class_property_add(oc, X86_MACHINE_MAX_RAM_BELOW_4G, "size",
+        x86_machine_get_max_ram_below_4g, x86_machine_set_max_ram_below_4g,
+        NULL, NULL, &error_abort);
+
+    object_class_property_set_description(oc, X86_MACHINE_MAX_RAM_BELOW_4G,
+        "Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
+}
+
+static const TypeInfo x86_machine_info = {
+    .name = TYPE_X86_MACHINE,
+    .parent = TYPE_MACHINE,
+    .abstract = true,
+    .instance_size = sizeof(X86MachineState),
+    .instance_init = x86_machine_initfn,
+    .class_size = sizeof(X86MachineClass),
+    .class_init = x86_machine_class_init,
+    .interfaces = (InterfaceInfo[]) {
+         { TYPE_NMI },
+         { }
+    },
+};
+
+static void x86_machine_register_types(void)
+{
+    type_register_static(&x86_machine_info);
+}
+
+type_init(x86_machine_register_types)
diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
index 6b5e5bb7f5..f14d7bba4b 100644
--- a/hw/i386/xen/xen-hvm.c
+++ b/hw/i386/xen/xen-hvm.c
@@ -197,10 +197,11 @@ qemu_irq *xen_interrupt_controller_init(void)
 static void xen_ram_init(PCMachineState *pcms,
                          ram_addr_t ram_size, MemoryRegion **ram_memory_p)
 {
+    X86MachineState *x86ms = X86_MACHINE(pcms);
     MemoryRegion *sysmem = get_system_memory();
     ram_addr_t block_len;
     uint64_t user_lowmem = object_property_get_uint(qdev_get_machine(),
-                                                    PC_MACHINE_MAX_RAM_BELOW_4G,
+                                                    X86_MACHINE_MAX_RAM_BELOW_4G,
                                                     &error_abort);
 
     /* Handle the machine opt max-ram-below-4g.  It is basically doing
@@ -214,20 +215,20 @@ static void xen_ram_init(PCMachineState *pcms,
     }
 
     if (ram_size >= user_lowmem) {
-        pcms->above_4g_mem_size = ram_size - user_lowmem;
-        pcms->below_4g_mem_size = user_lowmem;
+        x86ms->above_4g_mem_size = ram_size - user_lowmem;
+        x86ms->below_4g_mem_size = user_lowmem;
     } else {
-        pcms->above_4g_mem_size = 0;
-        pcms->below_4g_mem_size = ram_size;
+        x86ms->above_4g_mem_size = 0;
+        x86ms->below_4g_mem_size = ram_size;
     }
-    if (!pcms->above_4g_mem_size) {
+    if (!x86ms->above_4g_mem_size) {
         block_len = ram_size;
     } else {
         /*
          * Xen does not allocate the memory continuously, it keeps a
          * hole of the size computed above or passed in.
          */
-        block_len = (1ULL << 32) + pcms->above_4g_mem_size;
+        block_len = (1ULL << 32) + x86ms->above_4g_mem_size;
     }
     memory_region_init_ram(&ram_memory, NULL, "xen.ram", block_len,
                            &error_fatal);
@@ -244,12 +245,12 @@ static void xen_ram_init(PCMachineState *pcms,
      */
     memory_region_init_alias(&ram_lo, NULL, "xen.ram.lo",
                              &ram_memory, 0xc0000,
-                             pcms->below_4g_mem_size - 0xc0000);
+                             x86ms->below_4g_mem_size - 0xc0000);
     memory_region_add_subregion(sysmem, 0xc0000, &ram_lo);
-    if (pcms->above_4g_mem_size > 0) {
+    if (x86ms->above_4g_mem_size > 0) {
         memory_region_init_alias(&ram_hi, NULL, "xen.ram.hi",
                                  &ram_memory, 0x100000000ULL,
-                                 pcms->above_4g_mem_size);
+                                 x86ms->above_4g_mem_size);
         memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
     }
 }
@@ -265,7 +266,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
         /* RAM already populated in Xen */
         fprintf(stderr, "%s: do not alloc "RAM_ADDR_FMT
                 " bytes of ram at "RAM_ADDR_FMT" when runstate is INMIGRATE\n",
-                __func__, size, ram_addr); 
+                __func__, size, ram_addr);
         return;
     }
 
diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 1ede055387..ead14e1888 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -89,7 +89,7 @@ static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
 
 static void ioapic_service(IOAPICCommonState *s)
 {
-    AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
+    AddressSpace *ioapic_as = X86_MACHINE(qdev_get_machine())->ioapic_as;
     struct ioapic_entry_info info;
     uint8_t i;
     uint32_t mask;
-- 
2.21.0




* [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (3 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:51   ` Philippe Mathieu-Daudé
  2019-10-04  9:37 ` [PATCH v6 06/10] fw_cfg: add "modify" functions for all types Sergio Lopez
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

As the last step in splitting PCMachineState and deriving
X86MachineState from it, make the functions previously extracted from
pc.c to x86.c independent of PCMachineState, using X86MachineState
instead.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
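A rough illustration of the decoupling described above, using the initrd
clamp from x86_load_linux() as the example. This is only a sketch: the Toy*
types and toy_initrd_max() are hypothetical stand-ins, not the real QEMU
structures. The point is that the helper takes the base state and receives
acpi_data_size as an argument instead of reading it from the PC class.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the QEMU types (not the real layouts). */
typedef struct {
    uint64_t below_4g_mem_size;
} ToyX86MachineState;

typedef struct {
    ToyX86MachineState parent;   /* base state embedded as the first member */
    bool acpi_enabled;           /* stand-in for a PC-only field */
} ToyPCMachineState;

/*
 * Mirrors the initrd_max clamp in x86_load_linux(): the helper only sees the
 * base state, and acpi_data_size (formerly read from the PC class) is passed
 * in by the caller.
 */
static uint32_t toy_initrd_max(const ToyX86MachineState *x86ms,
                               uint32_t initrd_max, int acpi_data_size)
{
    assert(acpi_data_size >= 0);

    if (initrd_max >= x86ms->below_4g_mem_size - acpi_data_size) {
        initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
    }
    return initrd_max;
}
```

A PC-style caller would pass &pcms->parent together with its own class
value, while a machine type without a PC class could pass whatever value
suits it.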
 include/hw/i386/x86.h | 13 ++++++++----
 hw/i386/pc.c          | 14 ++++++++-----
 hw/i386/pc_piix.c     |  2 +-
 hw/i386/pc_q35.c      |  2 +-
 hw/i386/x86.c         | 49 ++++++++++++++++++++-----------------------
 5 files changed, 43 insertions(+), 37 deletions(-)

diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
index a930a7ad9d..f44359e9e9 100644
--- a/include/hw/i386/x86.h
+++ b/include/hw/i386/x86.h
@@ -73,10 +73,11 @@ typedef struct {
 #define X86_MACHINE_CLASS(class) \
     OBJECT_CLASS_CHECK(X86MachineClass, class, TYPE_X86_MACHINE)
 
-uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
+uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
                                     unsigned int cpu_index);
-void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
-void x86_cpus_init(PCMachineState *pcms);
+
+void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
+void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
 CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
                                              unsigned cpu_index);
 int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
@@ -84,6 +85,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
 
 void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
 
-void x86_load_linux(PCMachineState *pcms, FWCfgState *fw_cfg);
+void x86_load_linux(X86MachineState *x86ms,
+                    FWCfgState *fw_cfg,
+                    int acpi_data_size,
+                    bool pvh_enabled,
+                    bool linuxboot_dma_enabled);
 
 #endif
diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 0dc1420a1f..0bf93d489c 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -982,8 +982,8 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
 
 void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
 {
-    PCMachineState *pcms = PC_MACHINE(ms);
-    int64_t apic_id = x86_cpu_apic_id_from_index(pcms, id);
+    X86MachineState *x86ms = X86_MACHINE(ms);
+    int64_t apic_id = x86_cpu_apic_id_from_index(x86ms, id);
     Error *local_err = NULL;
 
     if (id < 0) {
@@ -998,7 +998,8 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
         return;
     }
 
-    x86_cpu_new(PC_MACHINE(ms), apic_id, &local_err);
+
+    x86_cpu_new(X86_MACHINE(ms), apic_id, &local_err);
     if (local_err) {
         error_propagate(errp, local_err);
         return;
@@ -1099,6 +1100,7 @@ void xen_load_linux(PCMachineState *pcms)
 {
     int i;
     FWCfgState *fw_cfg;
+    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
     X86MachineState *x86ms = X86_MACHINE(pcms);
 
     assert(MACHINE(pcms)->kernel_filename != NULL);
@@ -1107,7 +1109,8 @@ void xen_load_linux(PCMachineState *pcms)
     fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
     rom_set_fw(fw_cfg);
 
-    x86_load_linux(pcms, fw_cfg);
+    x86_load_linux(x86ms, fw_cfg, pcmc->acpi_data_size,
+                   pcmc->pvh_enabled, pcmc->linuxboot_dma_enabled);
     for (i = 0; i < nb_option_roms; i++) {
         assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
                !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
@@ -1243,7 +1246,8 @@ void pc_memory_init(PCMachineState *pcms,
     }
 
     if (linux_boot) {
-        x86_load_linux(pcms, fw_cfg);
+        x86_load_linux(x86ms, fw_cfg, pcmc->acpi_data_size,
+                       pcmc->pvh_enabled, pcmc->linuxboot_dma_enabled);
     }
 
     for (i = 0; i < nb_option_roms; i++) {
diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
index 0afa8fe6ea..a86317cdff 100644
--- a/hw/i386/pc_piix.c
+++ b/hw/i386/pc_piix.c
@@ -154,7 +154,7 @@ static void pc_init1(MachineState *machine,
         }
     }
 
-    x86_cpus_init(pcms);
+    x86_cpus_init(x86ms, pcmc->default_cpu_version);
 
     if (kvm_enabled() && pcmc->kvmclock_enabled) {
         kvmclock_create();
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 8e7beb9415..8bdca373d6 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -181,7 +181,7 @@ static void pc_q35_init(MachineState *machine)
         xen_hvm_init(pcms, &ram_memory);
     }
 
-    x86_cpus_init(pcms);
+    x86_cpus_init(x86ms, pcmc->default_cpu_version);
 
     kvmclock_create();
 
diff --git a/hw/i386/x86.c b/hw/i386/x86.c
index 4a8e254d69..55944a9a02 100644
--- a/hw/i386/x86.c
+++ b/hw/i386/x86.c
@@ -60,11 +60,10 @@ static size_t pvh_start_addr;
  * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
  * all CPUs up to max_cpus.
  */
-uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
+uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
                                     unsigned int cpu_index)
 {
-    MachineState *ms = MACHINE(pcms);
-    X86MachineState *x86ms = X86_MACHINE(pcms);
+    MachineState *ms = MACHINE(x86ms);
     X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
     uint32_t correct_id;
     static bool warned;
@@ -83,14 +82,14 @@ uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
     }
 }
 
-void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
+
+void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
 {
     Object *cpu = NULL;
     Error *local_err = NULL;
     CPUX86State *env = NULL;
-    X86MachineState *x86ms = X86_MACHINE(pcms);
 
-    cpu = object_new(MACHINE(pcms)->cpu_type);
+    cpu = object_new(MACHINE(x86ms)->cpu_type);
 
     env = &X86_CPU(cpu)->env;
     env->nr_dies = x86ms->smp_dies;
@@ -102,16 +101,14 @@ void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
     error_propagate(errp, local_err);
 }
 
-void x86_cpus_init(PCMachineState *pcms)
+void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
 {
     int i;
     const CPUArchIdList *possible_cpus;
-    MachineState *ms = MACHINE(pcms);
-    MachineClass *mc = MACHINE_GET_CLASS(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
-    X86MachineState *x86ms = X86_MACHINE(pcms);
+    MachineState *ms = MACHINE(x86ms);
+    MachineClass *mc = MACHINE_GET_CLASS(x86ms);
 
-    x86_cpu_set_default_version(pcmc->default_cpu_version);
+    x86_cpu_set_default_version(default_cpu_version);
 
     /* Calculates the limit to CPU APIC ID values
      *
@@ -120,11 +117,11 @@ void x86_cpus_init(PCMachineState *pcms)
      *
      * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
      */
-    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
+    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
                                                       ms->smp.max_cpus - 1) + 1;
     possible_cpus = mc->possible_cpu_arch_ids(ms);
     for (i = 0; i < ms->smp.cpus; i++) {
-        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
+        x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
     }
 }
 
@@ -152,7 +149,6 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
 
 const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
 {
-    PCMachineState *pcms = PC_MACHINE(ms);
     X86MachineState *x86ms = X86_MACHINE(ms);
     int i;
     unsigned int max_cpus = ms->smp.max_cpus;
@@ -174,7 +170,7 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
 
         ms->possible_cpus->cpus[i].type = ms->cpu_type;
         ms->possible_cpus->cpus[i].vcpus_count = 1;
-        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
+        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(x86ms, i);
         x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
                                  x86ms->smp_dies, ms->smp.cores,
                                  ms->smp.threads, &topo);
@@ -331,8 +327,11 @@ static bool load_elfboot(const char *kernel_filename,
     return true;
 }
 
-void x86_load_linux(PCMachineState *pcms,
-                    FWCfgState *fw_cfg)
+void x86_load_linux(X86MachineState *x86ms,
+                    FWCfgState *fw_cfg,
+                    int acpi_data_size,
+                    bool pvh_enabled,
+                    bool linuxboot_dma_enabled)
 {
     uint16_t protocol;
     int setup_size, kernel_size, cmdline_size;
@@ -342,9 +341,7 @@ void x86_load_linux(PCMachineState *pcms,
     hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
     FILE *f;
     char *vmode;
-    MachineState *machine = MACHINE(pcms);
-    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
-    X86MachineState *x86ms = X86_MACHINE(pcms);
+    MachineState *machine = MACHINE(x86ms);
     struct setup_data *setup_data;
     const char *kernel_filename = machine->kernel_filename;
     const char *initrd_filename = machine->initrd_filename;
@@ -387,7 +384,7 @@ void x86_load_linux(PCMachineState *pcms,
          * saving the PVH entry point used by the x86/HVM direct boot ABI.
          * If load_elfboot() is successful, populate the fw_cfg info.
          */
-        if (pcmc->pvh_enabled &&
+        if (pvh_enabled &&
             load_elfboot(kernel_filename, kernel_size,
                          header, pvh_start_addr, fw_cfg)) {
             fclose(f);
@@ -417,7 +414,7 @@ void x86_load_linux(PCMachineState *pcms,
 
                 initrd_data = g_mapped_file_get_contents(mapped_file);
                 initrd_size = g_mapped_file_get_length(mapped_file);
-                initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+                initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
                 if (initrd_size >= initrd_max) {
                     fprintf(stderr, "qemu: initrd is too large, cannot support."
                             "(max: %"PRIu32", need %"PRId64")\n",
@@ -495,8 +492,8 @@ void x86_load_linux(PCMachineState *pcms,
         initrd_max = 0x37ffffff;
     }
 
-    if (initrd_max >= x86ms->below_4g_mem_size - pcmc->acpi_data_size) {
-        initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+    if (initrd_max >= x86ms->below_4g_mem_size - acpi_data_size) {
+        initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
     }
 
     fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
@@ -645,7 +642,7 @@ void x86_load_linux(PCMachineState *pcms,
 
     option_rom[nb_option_roms].bootindex = 0;
     option_rom[nb_option_roms].name = "linuxboot.bin";
-    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
+    if (linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
         option_rom[nb_option_roms].name = "linuxboot_dma.bin";
     }
     nb_option_roms++;
-- 
2.21.0




* [PATCH v6 06/10] fw_cfg: add "modify" functions for all types
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (4 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 07/10] hw/intc/apic: reject pic ints if isa_pic == NULL Sergio Lopez
                   ` (7 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

This allows altering the contents of an item that has already been added.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
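For readers unfamiliar with the add/modify split, here is a minimal sketch
of the replace-and-free pattern the new functions follow. It assumes a toy
key-indexed table; ToyFwCfg, toy_modify_bytes_read() and toy_modify_string()
are hypothetical names for illustration, not the fw_cfg implementation.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_MAX_ENTRIES 16   /* hypothetical table size */

typedef struct {
    void *data;
    size_t len;
} ToyEntry;

typedef struct {
    ToyEntry entries[TOY_MAX_ENTRIES];
} ToyFwCfg;

/* Install new data under key and hand the previous pointer back to the caller. */
static void *toy_modify_bytes_read(ToyFwCfg *s, uint16_t key,
                                   void *data, size_t len)
{
    void *old;

    assert(key < TOY_MAX_ENTRIES);
    old = s->entries[key].data;
    s->entries[key].data = data;
    s->entries[key].len = len;
    return old;
}

/* Replace a string item: copy the new value, then free the one it replaced. */
static void toy_modify_string(ToyFwCfg *s, uint16_t key, const char *value)
{
    size_t sz = strlen(value) + 1;   /* include the NUL terminator */
    char *copy = malloc(sz);

    assert(copy != NULL);
    memcpy(copy, value, sz);
    free(toy_modify_bytes_read(s, key, copy, sz));
}
```

As in the patch, the store owns the copy it is given, so the caller's string
can go away afterwards, and the data being replaced is assumed to have been
heap-allocated by an earlier add or modify call.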
 include/hw/nvram/fw_cfg.h | 42 +++++++++++++++++++++++++++++++++++++++
 hw/nvram/fw_cfg.c         | 29 +++++++++++++++++++++++++++
 2 files changed, 71 insertions(+)

diff --git a/include/hw/nvram/fw_cfg.h b/include/hw/nvram/fw_cfg.h
index 80e435d303..b5291eefad 100644
--- a/include/hw/nvram/fw_cfg.h
+++ b/include/hw/nvram/fw_cfg.h
@@ -98,6 +98,20 @@ void fw_cfg_add_bytes(FWCfgState *s, uint16_t key, void *data, size_t len);
  */
 void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value);
 
+/**
+ * fw_cfg_modify_string:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: NUL-terminated ascii string
+ *
+ * Replace the fw_cfg item available by selecting the given key. The new
+ * data will consist of a dynamically allocated copy of the provided string,
+ * including its NUL terminator. The data being replaced, assumed to have
+ * been dynamically allocated during an earlier call to either
+ * fw_cfg_add_string() or fw_cfg_modify_string(), is freed before returning.
+ */
+void fw_cfg_modify_string(FWCfgState *s, uint16_t key, const char *value);
+
 /**
  * fw_cfg_add_i16:
  * @s: fw_cfg device being modified
@@ -136,6 +150,20 @@ void fw_cfg_modify_i16(FWCfgState *s, uint16_t key, uint16_t value);
  */
 void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
 
+/**
+ * fw_cfg_modify_i32:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 32-bit integer
+ *
+ * Replace the fw_cfg item available by selecting the given key. The new
+ * data will consist of a dynamically allocated copy of the given 32-bit
+ * value, converted to little-endian representation. The data being replaced,
+ * assumed to have been dynamically allocated during an earlier call to
+ * either fw_cfg_add_i32() or fw_cfg_modify_i32(), is freed before returning.
+ */
+void fw_cfg_modify_i32(FWCfgState *s, uint16_t key, uint32_t value);
+
 /**
  * fw_cfg_add_i64:
  * @s: fw_cfg device being modified
@@ -148,6 +176,20 @@ void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value);
  */
 void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value);
 
+/**
+ * fw_cfg_modify_i64:
+ * @s: fw_cfg device being modified
+ * @key: selector key value for new fw_cfg item
+ * @value: 64-bit integer
+ *
+ * Replace the fw_cfg item available by selecting the given key. The new
+ * data will consist of a dynamically allocated copy of the given 64-bit
+ * value, converted to little-endian representation. The data being replaced,
+ * assumed to have been dynamically allocated during an earlier call to
+ * either fw_cfg_add_i64() or fw_cfg_modify_i64(), is freed before returning.
+ */
+void fw_cfg_modify_i64(FWCfgState *s, uint16_t key, uint64_t value);
+
 /**
  * fw_cfg_add_file:
  * @s: fw_cfg device being modified
diff --git a/hw/nvram/fw_cfg.c b/hw/nvram/fw_cfg.c
index 7dc3ac378e..aef1727250 100644
--- a/hw/nvram/fw_cfg.c
+++ b/hw/nvram/fw_cfg.c
@@ -690,6 +690,15 @@ void fw_cfg_add_string(FWCfgState *s, uint16_t key, const char *value)
     fw_cfg_add_bytes(s, key, g_memdup(value, sz), sz);
 }
 
+void fw_cfg_modify_string(FWCfgState *s, uint16_t key, const char *value)
+{
+    size_t sz = strlen(value) + 1;
+    char *old;
+
+    old = fw_cfg_modify_bytes_read(s, key, g_memdup(value, sz), sz);
+    g_free(old);
+}
+
 void fw_cfg_add_i16(FWCfgState *s, uint16_t key, uint16_t value)
 {
     uint16_t *copy;
@@ -720,6 +729,16 @@ void fw_cfg_add_i32(FWCfgState *s, uint16_t key, uint32_t value)
     fw_cfg_add_bytes(s, key, copy, sizeof(value));
 }
 
+void fw_cfg_modify_i32(FWCfgState *s, uint16_t key, uint32_t value)
+{
+    uint32_t *copy, *old;
+
+    copy = g_malloc(sizeof(value));
+    *copy = cpu_to_le32(value);
+    old = fw_cfg_modify_bytes_read(s, key, copy, sizeof(value));
+    g_free(old);
+}
+
 void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value)
 {
     uint64_t *copy;
@@ -730,6 +749,16 @@ void fw_cfg_add_i64(FWCfgState *s, uint16_t key, uint64_t value)
     fw_cfg_add_bytes(s, key, copy, sizeof(value));
 }
 
+void fw_cfg_modify_i64(FWCfgState *s, uint16_t key, uint64_t value)
+{
+    uint64_t *copy, *old;
+
+    copy = g_malloc(sizeof(value));
+    *copy = cpu_to_le64(value);
+    old = fw_cfg_modify_bytes_read(s, key, copy, sizeof(value));
+    g_free(old);
+}
+
 void fw_cfg_set_order_override(FWCfgState *s, int order)
 {
     assert(s->fw_cfg_order_override == 0);
-- 
2.21.0




* [PATCH v6 07/10] hw/intc/apic: reject pic ints if isa_pic == NULL
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (5 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 06/10] fw_cfg: add "modify" functions for all types Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule Sergio Lopez
                   ` (6 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

In apic_accept_pic_intr(), reject PIC interrupts if an i8259 PIC has
not been instantiated (isa_pic == NULL).

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Sergio Lopez <slp@redhat.com>
---
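To spell out the resulting behaviour, here is a standalone sketch of the
check. ToyApic, the TOY_* constants and the explicit isa_pic parameter are
assumptions made for illustration; the real function reads the device state
and the global isa_pic pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants, named after the ones used in the hunk below. */
#define TOY_APICBASE_ENABLE (1u << 11)
#define TOY_LVT_MASKED      (1u << 16)

typedef struct {
    uint32_t apicbase;
    uint32_t lvt0;
} ToyApic;

/*
 * A PIC interrupt can be accepted when the APIC is disabled or LVT0 is
 * unmasked, but only if a PIC was actually instantiated.
 */
static int toy_accept_pic_intr(const ToyApic *s, const void *isa_pic)
{
    assert(s != NULL);

    if ((s->apicbase & TOY_APICBASE_ENABLE) == 0 ||
        (s->lvt0 & TOY_LVT_MASKED) == 0) {
        return isa_pic != NULL;
    }
    return 0;
}
```

With no PIC present the function returns 0 on every path, which is the case
a machine type without an i8259 relies on.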
 hw/intc/apic.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/intc/apic.c b/hw/intc/apic.c
index bce89911dc..2a74f7b4bf 100644
--- a/hw/intc/apic.c
+++ b/hw/intc/apic.c
@@ -610,7 +610,7 @@ int apic_accept_pic_intr(DeviceState *dev)
 
     if ((s->apicbase & MSR_IA32_APICBASE_ENABLE) == 0 ||
         (lvt0 & APIC_LVT_MASKED) == 0)
-        return 1;
+        return isa_pic != NULL;
 
     return 0;
 }
-- 
2.21.0




* [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (6 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 07/10] hw/intc/apic: reject pic ints if isa_pic == NULL Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04 11:50   ` Stefano Garzarella
  2019-10-04  9:37 ` [PATCH v6 09/10] docs/microvm.rst: document the new microvm machine type Sergio Lopez
                   ` (5 subsequent siblings)
  13 siblings, 1 reply; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

qboot is a minimalist x86 firmware for booting Linux kernels. It does
the minimum amount of work required for the task, and it's able to
boot both PVH images and bzImages without relying on option roms.

These characteristics make it an ideal companion for the microvm
machine type.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
 .gitmodules              |   3 +++
 pc-bios/bios-microvm.bin | Bin 0 -> 65536 bytes
 roms/Makefile            |   6 ++++++
 roms/qboot               |   1 +
 4 files changed, 10 insertions(+)
 create mode 100755 pc-bios/bios-microvm.bin
 create mode 160000 roms/qboot

diff --git a/.gitmodules b/.gitmodules
index c5c474169d..19792c9a11 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -58,3 +58,6 @@
 [submodule "roms/opensbi"]
 	path = roms/opensbi
 	url = 	https://git.qemu.org/git/opensbi.git
+[submodule "roms/qboot"]
+	path = roms/qboot
+	url = https://github.com/bonzini/qboot
diff --git a/pc-bios/bios-microvm.bin b/pc-bios/bios-microvm.bin
new file mode 100755
index 0000000000000000000000000000000000000000..45eabc516692e2d134bbb630d133c7c2dcc9a9b6
GIT binary patch
literal 65536
zcmeI2eS8zwneS)hF_vth5y2#;brv;O^x_6mtAUMO%tn5}IEf)jY``G|XbMBqa_^-@
zBBUXSyplo3+VMx9Ci}VFwz~@rx!t?By&v!0UdNb=V<BQ0NE-8!l)yH1LPAE!VDnNi
zCSmUHnUQRVmuC0>GyaU`%sJ0_&U3!!Ij`fT?33lo8bTirhX$S6(dmK^Nu2qLAPZJO
zd+n=&@XLYwWFdS~Zh2c2gidFUEU<VyB{hF24C^}E=jl<HQ(<+MP>-}gkOUzx)FqJ6
z;W433mmmmAY+Noh;tibd?18@VxCH}v4GeX<kXL!tiyU!Hnn`6SuU6r$bB&SE_=SXJ
zc+)=1$Iq|sgvbt9s)=?%V2RL(E{A7Ar52#~B&%`TJKuimt+%dx%KD*M-I%M^1gEP~
z3rrTeoTTV}=y-L<H(`6Jx<%^VI8zpO=OW?aYwDJw?(oFd+1)>#`0DNc_4sRW0TC1A
zmJsrGlX|tHBmSvHZ7h?b^=>;mu1$tb(P>mvG{5}(=7p;C?X<Nv)KiF;vS^dFph*f0
z|KM_=_+GTu{^~BsC2OI`iH8;Xgy<4`bd|F?E`U$ysLqzy*(zuJjHIw>{{UgF+=c4I
zP{`<|^mTPhV|UNEdEHc{)HAxSYaiW>P|0;&C$X5aR9UVpQyP@Vl`hhv$gdx{Z+-NR
zwvW~;(eFuX@w<QuU$3wVw^IK3ra9{s)&CnoiJ!J0%~y=q9~UYm@2P&Y>tUs8?fTZ0
z-^y7ZnXZ(N2F|Wml7g>taRZ)SYa#R~3)d_2XS+8|g~IPiN|W-WvPxO4Jf$R*7zq(M
zVPZ7&u2-7NT$&*G^Vf&!N<}4sxYR-&zO@e~z~qj%l+aa&|1SJK8kh_<mPg?Ph8*ct
zx<+yY;hYhS;T3-|U%TtHtLYu}w_i63jI9sWyCrer`&zejU6FSvla#+O5G_?20qHTI
z@+oXU(Ov{h!_X&`70OD~fa)<r$y4N=@1QQhFU$Y+KbwE#7xIotf3bYo(#FRhYw)oF
z?fGIsXnOLA6)T@wwR%RLyz}o5Wo%bFs0P1iIT@I~!0oGks67~Pcuwvn$LZRE?$a*(
z{h{{eadET88UP82Cf}S~Nfy9}`gHKyg1<oBEO>(z79nsokwAD^1AC7pf@Oj~FIF9#
zF9b$C2U-gqk-~z?@R7iuo?L~z4ZVWkIiSQ^i}NGJ*2?fn#8cjeR;%3om9j(r$>}0*
zmHG01U~>3C;Jl}&<hUUtcgHCdOXn#uZ}@;euhm+1IPj;0rza63dsZ*SFvX5~bm(K(
z7X>R|!j%@?|5s#Espmjtu)+#k%inqSKe2u4Y^$dyZBy&S@?QTm*4Jv!DYIMrLsd*G
z=`T-i{>XEL^*_04^-~G9FB2d;TMqbVppb+**NV)Wh3cyEi~fAMBS-GYFX{6SqmnR3
zi7j8o-YehtBPZl-;$fEDRenBjZPn_8`k0QW%duAboe{h9;n1k=PmPBITKXgysm0cG
zff9plWlq19^_3qFT=eu943;%0F}d3r9Ci6~gQK=UjyF9V9G&A|6db+Rf1l@a_=w-<
zIgp;B+L?Gjt)J5Gg)}0q>Wcp0HQVM-TQ2)8SKeoRFckXbZl8$MyIG&-ayp6njK|qn
zUpF;;x*hu7L6uw4aN!=m!{{NE=USzLa8KY0G!Yj<9~vxX3HENZ(Onw#yXUrCeraCo
zVgU^`gK6GqgVb`wZ;#Gbmy2v_MAjdX^lEFX6)k+K<#J$AXn(Om8@eKmZkcJ?-+r#^
zCA}?|Tk-m02jZiTNML7==D<i+5OJC+mIOXEN(uZdy$8OjX-^@a*!$f7jzz1bmLC6P
z$oo)aui!E>)CNvx3sYjI-Lq-*XZ<Zl<Oq$)Z{QZ>a2UU;-+c$WsKoy6%2oD<>eine
z!#Ejf&|7)}XW?0sdUNKePg&pstJ#T?3+&za*%{)yhd+P##k*mz+)*)!?YnZKwSKA|
zu}@-GBM5lQwXLUnbA-^KRrkB=>1G$AHSS_<H^$k}{tAOa?8@;s2!3iSdOq3ETjRc?
zR9)$wnodlx*{W`JPX*Q&*ij0x>ujsr9%IKnN3NV_f2z%xyG&<)WQp>tn@_V;6awYf
z{q=0PWP~N+=^0|;@HO^_x)<+6Z;;S0r?eL5MGEsG)4dPD(&64oU$Ar(mKJI91WQp*
z*fzBkMi!<IXKbrL`}@Gu{V;2E&q~&~XA{xZDd8h>_#i^;Q2JgahPiGQ8gvStZuJR~
zt#rF1Q*=b?$N+zvO5*<=;%c=B9T?OE0Z$h_QD(6#I65=Xr9N+IZ4eRkS9#6`N53nF
zgGh%{`-7vUa`;ocs8#P&SmkZ6Ac(#qhoeSdgWU2t0t?xQx<zb7mzo*qiEe;NXZ?4e
znr`x%Mz_0Hm^o(CHN&Qs4b&*`M+jj!!(+BL+i)zAszGW@t(BrHq3fi7YSe)q;Z|3e
z70e$~*2}J?iVWFD_25hT9RCm4JD#4_8EQj6M$u8*vvNr?R^2MiP<{P)-F%7lF{}=>
ztJi2*MGFO0nhpzHsS48`Vp6I$SzypCIQ|r)66h+xi_VhantBb5r^GsquKknX=-R+M
zR5)+3-109(YL#R<<}5hotld~R3DG;%8uwi7=x5}ePWAz;ei|y+wczL$|D53HM*s5Q
z=%zrc`r^c_nOvX1R?2lfbszGepq#~lx-UxZrlnOz{2p{&Q(UKz(M1ePy8b6tOokmV
zLn9Qdcp}|RixRp+gLWhp<KZKhBHF0BOWSB@D^4^t4<64$-NNpV@i4^x#yyN+S3gD9
zxSiTUh{EF#MDZBprm6LA(HQ8^&gUvyD|KV6JKnsXG+_>+axvgJlad?eTee+@?{jXu
zfYy+<4a9q#+Xg%bFc#Lkt?+6)8^}a<6=7{Pl(xY4C3hw+G;lp|9;H5+Abs%xXQ(ef
zlf)pDipQPXQqUx2%7Egdw^J<MQg3E#-s42!M!T~U%24+dys-;|a>oenzQOp1R=J}G
z#oOYxtAp`@B3qnI9t8DHEcgY=VF>p@4)U7qZI&7f#y_FB9^0F2j)oj89X~b6STSQI
zJGB67=8qA3IhV9q{PWleZJ*%`FMcgLjZ#$m)L7J#t+PdmS4q~;K5zNqKq{1(6=GPR
zzt1jQ+&}|ddcP9G9Na5+7s=gK*0Mxk2BxTKvC!2A{4?b4l@75|?ykwFgh^OnXr7ZL
z33i+&JdDZR31b~%GO@GXorc9VZcG&~Hp$(SA~o)u=mIh;l(c%zVi2>ct4HK+8P1VY
z))%OjyZbRz-UVHukq7*D$=!{UL`<^vIo7g+DDMc$J5q8GgR;CZl=RNu;Fbd2d(lC4
zriJ#~jnN$*YE7SvVGH7y;{%g&a;dneK=#pEW<1I(by1@!Ly6_^4fgx76!~?plm%+n
z5{9EYS7Y6gk?*3`y|@8xwK@?azsl16k9t%NY`T@Nl08`i3bYI6;DEBwPKPHJZrYud
zoU9dPO@-cD*-GuwJh;JyN!V~#+6S;vWoVD#t|#DT!#BC>`HZ{PyBj;<ZH7GiGSQWt
zv{53}H;QWUPn@=t?ff8nGyX}D?d{IHZX=lKEx$90?-kE6zq?5e|APg+tSFNuQ*pWF
z6~tnSUb9<pQExIFxfnxFh1P|Z3irq@wAgBf#7hh7Yvu43$afnA3^OlGByigfD~F%Q
zo~RQe6nb{1V%i`)+74#uIPp)jeLQI!GOSK!4GjMya<b9RDafmB!We7f&*I!`;6DR3
z$8W;_zKLHBU&qd=lco%Vswtc)+>gRfXAAGOT^wY+@zX`N53<F#_{i`(vU#ttGbSPI
zYF`40owUQ9L)(<Nmd|Rf$y#GUk=Y@C1M^Bvw6-*~Pcqj2NrAj3njc*uu{w!0S)&hI
zqbuyJ&d!>g(TAO^u5eMPrzo_quzV<RM3o*Clkrb;*o(6}X)2oNm8k0k*vI3ioVNGb
zA=|aVNId>wk?s9XOZ#u`Wf%1iwZW^pj&5Bay_-h4&?!svD7EzF;^s5-#C%kdOM#Y?
za@Yk<+8fAV&^B6`nhQ&_8YH*uMTN7?oyxijCbmWhc*lVK22!2VV6WNICN4;=#ENPf
zB<oC?h8s-qwi<X_^M2}I<~p;BjDjDo(e8smsqdqE#*(=LaBq3@zuuqBl@Jn9N;21q
z5M9Y!`&ejhCS-c5(O3U{o@&kx=6dE2<auvqDn`p1Is6S&j_ot5r=1?b&^f0FB_(r<
zGv)0izcf6QX@7on=(F{j-P8Y^DQfg~?ItHqWqbP=;3KH^xJM4L^F~u_a2IEnpy=*y
zgl7>HT7SIiG;A!)*l2Phb~}x8987b92)D_ZSoaU%3bdVOzo&{{{O9OoL)L>`+Wn!p
z#QN3ZzZ5U3#V4ZW(Pt!9hI5eUbJplHcDBXJzGsg<=VV))p?D$OxjYd#!KTeJY?z&~
zfz2yL=s!A+^!Xn!dcxQkGNBI)$4@(L=h^RU*E4<K{2p=^qa&ajd~zB7!&2FS<QrGe
zD>9yoIC59*PBRXEDu<teyOFOVjrzVeitMRI^1%u72&Vkv)M3mpMt7O$?y2P%sIZnD
z>joIhO<zwPgc5ea)Vhnb{vTR5;#}Z7J7dPSQS59iL}EC8Y&+i8(>Bniwe>5W9{HXg
z<;*qm=Ha%JZHcy1ZRxh7>{uw_YVawZlM2o+JVz9du6WKKYVQ0xe6?=vj~e8b9^|0;
z>oFzzQ}W>GJQ}vt7j=f?u;y#JC~w~hkuG6}O}N)r6#8fwn`OJZ868i5U~}$ndO*%%
zk;7}@H#p&{dGJ+*`D(ib^ude<8&a@8wFyz53;WeD?Nv(-XAWBru?UW11Qzq+SMG>S
z!-IrAfROsI82`}5>YUNK;*!G{16h+D4z&?}B(;kxjNw3u(fZ?Za31oBflXhq)U<5d
zBe2r2$2211gXny_dvgwI!eaG!>kZDuZ#(1#i@a?q<bx4X7{pMBhr(iYc$nt5R3h1P
zz;iNHg#g+IxkSsem0u`%jM1|XeSwM71d(n-zTE;Qo-T|eLTWlR#SClx-`tC}xd3ww
zQ4#3znpy+_19rN7ye6p!5j$2=rZsMv{Z7)4W>z=4cjF9Ph?ZS^Eox<l;%O=BlS&ST
z&N$Buz<Y{}!&RIgMklYqsX<Ac^<6pfwf<M>CBfo5*W#(xICY8?vDl{f1nn!9n&<6<
zv)Q`-&R=6X(U$xP-rTDlqDLY$5G>h`BRqR_UB9~Sk~xlJb^H+<*4~9gJe}wNn3fjt
zQ+evv;tKY{TPzW+-X9vV_&3Sh$rrFY+}BzyMHU{zc}#VmcJ7j){|BT$9=eW`@gUrc
zjfQmRrrgYU6@1~Cg)KfNho>M(n*|6s7?b|57cL%Md<<njf!v{VaQ;(o$>Hx1Orywv
z+FD&f1wCJ;ZtBB^;IC0uw6KNpv88?ForP4^u=dP&0>xg3o#JS6;Q*dOa^c7HkUJJ0
zQ(qXsF=PDVu4tgoql<xoDKC7?ey;`wC@%Fp3J#6r$Su?_qK{bF;y#=_v8BhN_fti4
zsf-Gv@D|kaXAsf?h%4`vLn`H1daQh(y#22cEa{If?kkT!`yA^RM=+`W7qrphW9axX
zuiR0miwEZ~JQlb#tMGwT9?I12r9E+S0+4;EQY}>PrUhKrUNj>c(e|dX3&vva7^I^n
zyjNWyk<4}Dm#!2Y1EC)tN$=8zFo?l_XiTBCzjdad>-cnmo~(X9uX_Z?_c7Kt^E2eK
z9iNQfAiFRmcN}g$_!*Mg?>|>r{&wI4Hgbam8<9KuGcu*;GX0*)*i9Nc4Lt{k_K*&C
z;_QNL!3XY;!)~+@f;}aE8%`qWSX7C<v1SBCj<AD6?fLA)IU}_9b1FKYoiSg#uhVM*
zyI{V)p_{Lx=Q_Vfc{4dDyqp<Np<H7r1q^w-W}Z%Q5}7vCS(DKbMCrR@FnmW06mf17
znMQ$D<F~9|4xd9BDu&NB?T9alquWz!UEI|f)DcPJ10sXu(}--Ja@uuzN4xd(*V;Fp
z-q|kUX~9!G{Wbhv!<&A3r>Cc@AEV8=c6--h<0+c`uzQUa^%_Ra-Y{0eBLs$Cy9f*$
zF<;E}3$dX^Nk)L!Fb%aVFvyJ6GHAZHi%``W%O+xxPYC)mi8iR&{RFC+8ZrEb9rW}V
z`@)ojTG4rQ!=CZxJu$S~Xx@X=ykl)j`_Jw*<VM3#?a_?}?OaO{Cx(m{BY}<(+NtTN
z%5a11nO|@2-vO?k%zXt9y5x@BpCO1Xz-;;TdA~pqOD0$A<eySxy?YKk8r$nxxUtcp
zF#zXjSWp?~d^pScDjo7o*Y}dy3Uc0uMYIz#cF0H&7rLnqIAycPPgz-}1DwGz1#Q|3
zy-H|Al(+4|RD(1K2HBaRE<#QlqOTyXg5t2fr+8`{S`&s3$Sozvav36`{~nY`$!{1f
zHL9Ft@k|Ws3-wuyZO-w@=-;G!w;^;;IjbxXMgMwbfix*l(R>Iy!5}-1betc|rhY(S
zEcgzZA){m@O+SYpJ1m=y!_Q)>y%^exYRcjNftpg#Du;iCAd)5A_Ee71uBZn#EmVHz
zFd@7!gcwXdM$9EzT^ccl6_D9DCIiZbjNg<&y$azA&W-*&Jf{3G(A<2;n>Bs6fUbel
zN7_t~`36STrwo}Uwmbz>;5)&sZhRJwtiW3=gtoF$gO#IE<(5ivO04wmgUnZPw7v&5
zp<4fvTn2~5#cUpVP3ztNgx{9S=Ey8})VrtW;6#u0!Br+3b$GLj+#%CogsVDY<cZu(
zuEucM!ZcGPa@=5hF8jWh+NWK|$P`cIutT9!GsHhCYyr8Y_H7zyqei$S8)@2tDo9y<
z$ht&sQBb|?X)n(4Ob@}?813QTgdW7$<62H=U<OQgp}unj|3bCis)s(b$bTCT#q%?H
zA^>_W^%G<Py9~*x;B*7JNf1c&ARgyMMK5)U@icu4!I_TLn3#=~y7n6^s*}R%`@gjs
zpGEuWLpZ*n3ZmMSpY+l=g!gI;Q`l<Y+uJwtBX#*z<wpaTL;I~b%f!ei1T$QzOPOQ7
zzt|WX0<(K_sl(E_sdk5P&SOtE$IM8PB0tB<I<((zd@-iw3(n>q$AZJ!d^I0O%5r!e
zeD2vz%~!3)+jhJ)U$r3|crd!LIO>(gCbkRB9~pC5zcH7Y2C?DLfMv2VSnX5J%CGmZ
zYHR6PR?vh~yymN-p>hZ8SDUXdp_yOY{7~uH;M>itAib#hs@+KcHPV}}PNpK~Pbe?i
zP<)F5qt(_fDsCwKQ&O~z6@(s{Et<s@kcu7u9u;y2lpE9Vxzu9n4@q+u?UkGuU9^Ax
zNY~|#AUQkgldQJmTdXTukh%w=@To7y0xQ{t&_@=t3}q;#!9l*%Fr#8;QKGTPh<&NN
z+)-VeVUNL)(5HeZht>*ccpM3QW&>K|NMI}^Ahde1Z#9k%5OU&HOhHlFOKF}Se7{^O
zi2VzjKd02d`B}O7cy<-y$N1ntyWxxfE165FzUsxk-rPG8l9IVacsy41=%Y9XK&a8>
za&@{8=>8ZB=iL~_@LwDkUrXlBMVufET<^qD$Bm~s6Hc$jcCsG~GMxVu87JfNFKt(_
zTa()!L^E~!pvj}tbOTK^xL9$fKf>N+e~hWi29vcT6rHRkrxki@2ZczL*jr^O!~!qe
z2siJo$`RR*CH8%*y5y|qi05QF7j>UfDIP8VUEs2~(k*#Iy@lnHJMb^~?qrQhZnZcV
zKV?z(RG?t>Uu*CiO4KXKJW)t4vOLsK=~9x5Cb?Yrgo?vy!(beIOLDiRN{pBPK5!W$
zEdDW}cao)5a(g3^J5nnu$y!UHX#96b&vkW~80xilqHg$>o6f^HlF9pC;Ib;om<^>)
z3&#2z&cMtgSN4u^euyssPTN%+kJ2}SK9=+xQA5<Q2A<E80yRXhR}vz=+*Z3gQa35$
z%Ts?W;*8FQL-9vKqZa>M^<~khAvsVdWq>ur#;6Bu5%IZ0t+rzO;df!Z8#MHfrhlo#
zz%(cID>xx+@Ac+c(!V8i$q;FhpvkBxhJ%-FMginQoz!|0I6=S3DH)<<T1ptXbWweI
zijj*PEt_i+eQzJL>uM-rRoLC+t2igf`Y|<E;Jny3EB7vhFcunX!9DdTx0@RA4HC$@
zk#XjLl!+Jq^MZ!h8!zOd3kC{Az_<)oq-`u+G*HrIzL`l`Jw~of3jVRfyy$?_h%5Dc
z>;Ad({_=sc^6xzT-L>nK(g_#I(CG(V;*TE}#I08Gt9D6>K&20HSUL!oPU`wj5~y-m
zTP#%$`}UkFhjW`$<tPm0Rym3+oAVuoRSAc)s@Gw!N;<4nZ#tx!f=6zSq{X)&Y{Xsj
zZ@Nxr?Q~IG7<V1wJQRWJb~cecbl)zWbWwWUA9S6Lis*9TF2!AsxO?yvDjn8smP&^`
zl(Q<~uvYduBy8XXp)n(4<bzDe(vU!sk)LcPo%iBN`9qe~`ADoOs5L~4I=q4;A7rMI
z46Zud|Ad?3)=@+)@k6=Wb1I2nCGoQ?fgUGpuvdayCCF74EW9BGSzWj<;obmUU&m8e
zI97)zvvqJIx<~|H$b6e*VCQx!Z(B}delL_KNjV)>2!a=KVaU-3KeIl|jf#+tm6^0H
z2;Yt~=^)O>pyO^lO&u@=ynJ{q!+jiS#y=s!j+^RT?I?W#Zc}+fRci_?>u!!k+d<#o
z;I8;U*Z&nyD(@CLy_SHZuEtSc<M1uHSxuG08_I35Y?HR<kiLJzy#xAw{^#`l;Lqr*
z8Q15Bj@jTg2z`F)qPMosXSsBXN>>(G-gGq8I9Ap;)-*VLYqH@b4qsz7jQEH1voDW3
zR#rNis?e6y2D?zMHlXV+goSE{bCILC(&1{PrVwW1-k(Wtpmf+3veY_m*RI+AWB*Xs
zDL-8<+|++QLmAYzrjBZK{D<LGrvJMT-H3Y@@-^W(F7x=9Oa-FJr&dDdNJi!sZCGk~
z{T7T+S*lAX?qTHrt4#h(KmH*)k<pQB3Tm?5L7?(9lEVrtl##zhc`nNjZ&4mP3bB|4
zb~VU9ajdF9Tpi-F3}->kmx-I7DQCL{eAeWfovymzF*9JY+zQzy<R@O^oR*AnB7Hj`
z6M+ncNygCkSVmt}0STiw!#*Ux1=%oUzd2F1AtPHeUbYwMBN@G$aI8XlZ{ktw2v+8-
z$;sC3#yOz|*~sTEX}W#|x&+khCLNV<jo;qf{T4Zjvfa<nu@>2HS5DRs+s~MszfA78
zkjwfj`3d>!F2qe9>x)&iRW`00>oga!RHyKuu0Kr@x8h=1al=Suj!BIWZ%4ilh{dh)
zegDScy}BSr7H^ECu565%yYTd$({(!DG27i3zcF8gB${z<R|oQSs>2%O{UNQgZe>fg
z!<Tc;atmj#E^r~sO5CrU*Y!v6r2HZHu+xHxCez1Be-QWWY@jipWCo#QAj2cmKViS+
z>3qy_x650R$s4<<>(#fn-?h&F-EXcd`&Ow?x<#O{|2t1_ST|?GfBVkbbw3gwZ>Vwk
z8XtE-7r!_GPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(8
z6W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;Z
zH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULas
zfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O
z1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1U
zPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu
z-~>1UPJk2O1ULasfD_;ZH~~(86W|0m0ZxDu-~>1UPJk2O1ULasfD_;ZH~~(86W|0m
z0Z!od1Ux-$$J=_^2HLc?{{JUzm0dl`OkLOi(JspO^xUV&;+-j7IrD2oSp}ujDU3^>
o5d@0NUb>FZ&)-2Do-dnE`R8)xT^9bc5QmajO4XIv_+RY*0|l%Fi~s-t

literal 0
HcmV?d00001

diff --git a/roms/Makefile b/roms/Makefile
index 6cf07d3b44..7c672536e4 100644
--- a/roms/Makefile
+++ b/roms/Makefile
@@ -67,6 +67,7 @@ default:
 	@echo "  opensbi32-virt     -- update OpenSBI for 32-bit virt machine"
 	@echo "  opensbi64-virt     -- update OpenSBI for 64-bit virt machine"
 	@echo "  opensbi64-sifive_u -- update OpenSBI for 64-bit sifive_u machine"
+	@echo "  bios-microvm       -- update bios-microvm.bin (qboot)"
 	@echo "  clean              -- delete the files generated by the previous" \
 	                              "build targets"
 
@@ -185,6 +186,10 @@ opensbi64-sifive_u:
 		PLATFORM="sifive/fu540"
 	cp opensbi/build/platform/sifive/fu540/firmware/fw_jump.bin ../pc-bios/opensbi-riscv64-sifive_u-fw_jump.bin
 
+bios-microvm:
+	$(MAKE) -C qboot
+	cp qboot/bios.bin ../pc-bios/bios-microvm.bin
+
 clean:
 	rm -rf seabios/.config seabios/out seabios/builds
 	$(MAKE) -C sgabios clean
@@ -197,3 +202,4 @@ clean:
 	$(MAKE) -C skiboot clean
 	$(MAKE) -f Makefile.edk2 clean
 	$(MAKE) -C opensbi clean
+	$(MAKE) -C qboot clean
diff --git a/roms/qboot b/roms/qboot
new file mode 160000
index 0000000000..cb1c49e0cf
--- /dev/null
+++ b/roms/qboot
@@ -0,0 +1 @@
+Subproject commit cb1c49e0cfac99b9961d136ac0194da62c28cf64
-- 
2.21.0




* [PATCH v6 09/10] docs/microvm.rst: document the new microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (7 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04  9:37 ` [PATCH v6 10/10] hw/i386: Introduce the " Sergio Lopez
                   ` (4 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Document the new microvm machine type.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
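Note for readers: with auto-kernel-cmdline left at its default, each
virtio-mmio transport that has a device attached is announced to the
guest through a virtio_mmio.device= kernel parameter. The C sketch below
shows how such a parameter could be formatted. It assumes the constants
defined in include/hw/i386/microvm.h (patch 10/10); the helper name
mmio_cmdline() is hypothetical and is not part of this series.

```c
#include <assert.h>
#include <stdio.h>

/* Values copied from include/hw/i386/microvm.h in patch 10/10. */
#define VIRTIO_MMIO_BASE      0xc0000000UL
#define VIRTIO_IRQ_BASE       5
#define VIRTIO_NUM_TRANSPORTS 8
#define VIRTIO_CMDLINE_MAXLEN 64  /* suggested size for the output buffer */

/*
 * Hypothetical helper (illustration only): format the kernel parameter
 * for transport `index` into `buf`. Each transport owns a 512-byte MMIO
 * window and one IRQ line. Returns 0 on success, -1 if the index is out
 * of range or the buffer is too small.
 */
static int mmio_cmdline(char *buf, size_t len, long index)
{
    int ret;

    assert(buf != NULL);
    if (index < 0 || index >= VIRTIO_NUM_TRANSPORTS) {
        return -1;
    }
    ret = snprintf(buf, len, " virtio_mmio.device=512@0x%lx:%ld",
                   VIRTIO_MMIO_BASE + (unsigned long)index * 512,
                   VIRTIO_IRQ_BASE + index);
    return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

For transport index 1 this produces " virtio_mmio.device=512@0xc0000200:6",
that is, a 512-byte window at VIRTIO_MMIO_BASE + 512 using IRQ 6. A guest
started with auto-kernel-cmdline=off would need equivalent parameters
passed by hand through -append.
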
 docs/microvm.rst | 98 ++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 98 insertions(+)
 create mode 100644 docs/microvm.rst

diff --git a/docs/microvm.rst b/docs/microvm.rst
new file mode 100644
index 0000000000..dc36ecf7c3
--- /dev/null
+++ b/docs/microvm.rst
@@ -0,0 +1,98 @@
+====================
+Microvm Machine Type
+====================
+
+Microvm is a machine type inspired by ``Firecracker`` and constructed
+after its machine model.
+
+It's a minimalist machine type without ``PCI`` or ``ACPI`` support,
+designed for short-lived guests. Microvm also establishes a baseline
+for benchmarking and optimizing both QEMU and guest operating systems,
+since it is optimized for both boot time and footprint.
+
+
+Supported devices
+-----------------
+
+The microvm machine type supports the following devices:
+
+- ISA bus
+- i8259 PIC (optional)
+- i8254 PIT (optional)
+- MC146818 RTC (optional)
+- One ISA serial port (optional)
+- LAPIC
+- IOAPIC (with kernel-irqchip=split by default)
+- kvmclock (if using KVM)
+- fw_cfg
+- Up to eight virtio-mmio devices (configured by the user)
+
+
+Using the microvm machine type
+------------------------------
+
+Machine-specific options
+~~~~~~~~~~~~~~~~~~~~~~~~
+
+It supports the following machine-specific options:
+
+- microvm.x-option-roms=bool (Set off to disable loading option ROMs)
+- microvm.pit=OnOffAuto (Enable i8254 PIT)
+- microvm.isa-serial=bool (Set off to disable the instantiation of an ISA serial port)
+- microvm.pic=OnOffAuto (Enable i8259 PIC)
+- microvm.rtc=OnOffAuto (Enable MC146818 RTC)
+- microvm.auto-kernel-cmdline=bool (Set off to disable adding virtio-mmio devices to the kernel cmdline)
+
+
+Boot options
+~~~~~~~~~~~~
+
+By default, microvm uses ``qboot`` as its BIOS, to obtain better boot
+times, but it's also compatible with ``SeaBIOS``.
+
+As no current firmware is able to boot from a block device using
+``virtio-mmio`` as its transport, a microvm-based VM needs to be run
+using a host-side kernel and, optionally, an initrd image.
+
+
+Running a microvm-based VM
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+By default, microvm aims for maximum compatibility, enabling both
+legacy and non-legacy devices. In this example, a VM is created
+without passing any additional machine-specific option, using the
+legacy ``ISA serial`` device as console::
+
+  $ qemu-system-x86_64 -M microvm \
+     -enable-kvm -cpu host -m 512m -smp 2 \
+     -kernel vmlinux -append "earlyprintk=ttyS0 console=ttyS0 root=/dev/vda" \
+     -nodefaults -no-user-config -nographic \
+     -serial stdio \
+     -drive id=test,file=test.img,format=raw,if=none \
+     -device virtio-blk-device,drive=test \
+     -netdev tap,id=tap0,script=no,downscript=no \
+     -device virtio-net-device,netdev=tap0
+
+While the example above works, you might be interested in reducing the
+footprint further by disabling some legacy devices. If you're using
+``KVM``, you can disable the ``RTC``, making the guest rely on
+``kvmclock`` exclusively. Additionally, if your host's CPUs have the
+``TSC_DEADLINE`` feature, you can also disable both the i8259 PIC and
+the i8254 PIT (make sure you're also emulating a CPU with that feature
+in the guest).
+
+This is an example of a VM with all optional legacy features
+disabled::
+
+  $ qemu-system-x86_64 \
+     -M microvm,x-option-roms=off,pit=off,pic=off,isa-serial=off,rtc=off \
+     -enable-kvm -cpu host -m 512m -smp 2 \
+     -kernel vmlinux -append "console=hvc0 root=/dev/vda" \
+     -nodefaults -no-user-config -nographic \
+     -chardev stdio,id=virtiocon0 \
+     -device virtio-serial-device \
+     -device virtconsole,chardev=virtiocon0 \
+     -drive id=test,file=test.img,format=raw,if=none \
+     -device virtio-blk-device,drive=test \
+     -netdev tap,id=tap0,script=no,downscript=no \
+     -device virtio-net-device,netdev=tap0
-- 
2.21.0




* [PATCH v6 10/10] hw/i386: Introduce the microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (8 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 09/10] docs/microvm.rst: document the new microvm machine type Sergio Lopez
@ 2019-10-04  9:37 ` Sergio Lopez
  2019-10-04 13:57 ` [PATCH v6 00/10] " Michael S. Tsirkin
                   ` (3 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-04  9:37 UTC (permalink / raw)
  To: qemu-devel
  Cc: ehabkost, Sergio Lopez, mst, lersek, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Microvm is a machine type inspired by Firecracker and constructed
after its machine model.

It's a minimalist machine type without PCI or ACPI support, designed
for short-lived guests. Microvm also establishes a baseline for
benchmarking and optimizing both QEMU and guest operating systems,
since it is optimized for both boot time and footprint.

Signed-off-by: Sergio Lopez <slp@redhat.com>
---
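Note for reviewers: the RAM split performed by microvm_memory_init() can
be sketched in isolation as follows. This is an illustrative reduction,
not code from the patch: struct ram_split and split_ram() are
hypothetical names, and the warning about a max-ram-below-4g value that
is not a multiple of 1G is left out.

```c
#include <assert.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical result type; the patch stores these in X86MachineState. */
struct ram_split {
    uint64_t below_4g;
    uint64_t above_4g;
};

/*
 * Mirror the decision made in microvm_memory_init(): guests with at
 * least 0xb0000000 bytes of RAM keep 2 GiB below 4G, smaller guests may
 * use up to 0xb0000000 bytes there. A max_ram_below_4g of 0 means
 * "not set" and defaults to 4 GiB, as in the patch.
 */
static struct ram_split split_ram(uint64_t ram_size, uint64_t max_ram_below_4g)
{
    struct ram_split s;
    uint64_t lowmem = (ram_size >= 0xb0000000ULL) ? 0x80000000ULL
                                                  : 0xb0000000ULL;

    if (max_ram_below_4g == 0) {
        max_ram_below_4g = 4 * GiB;
    }
    if (lowmem > max_ram_below_4g) {
        lowmem = max_ram_below_4g;
    }

    if (ram_size > lowmem) {
        s.below_4g = lowmem;
        s.above_4g = ram_size - lowmem;
    } else {
        s.below_4g = ram_size;
        s.above_4g = 0;
    }

    /* Every byte of RAM ends up in exactly one of the two regions. */
    assert(s.below_4g + s.above_4g == ram_size);
    return s;
}
```

With max-ram-below-4g unset, a 512 MiB guest stays entirely below 4G,
while a 4 GiB guest gets 2 GiB below 4G and 2 GiB mapped at
0x100000000.
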
 default-configs/i386-softmmu.mak |   1 +
 include/hw/i386/microvm.h        |  83 +++++
 hw/i386/microvm.c                | 574 +++++++++++++++++++++++++++++++
 hw/i386/Kconfig                  |   4 +
 hw/i386/Makefile.objs            |   1 +
 5 files changed, 663 insertions(+)
 create mode 100644 include/hw/i386/microvm.h
 create mode 100644 hw/i386/microvm.c
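
Note for reviewers: microvm_set_rtc() publishes the memory size through
the usual CMOS registers. The sketch below reproduces only the
arithmetic, under the assumption that it matches the patch; struct
cmos_mem and cmos_mem_values() are hypothetical names, and the patch
itself writes each value byte by byte with rtc_set_memory().

```c
#include <assert.h>
#include <stdint.h>

#define KiB 1024ULL
#define MiB (1024ULL * KiB)

/* Hypothetical container for the values written to CMOS. */
struct cmos_mem {
    uint16_t base_kib;   /* registers 0x15/0x16: base memory in KiB */
    uint16_t ext_kib;    /* registers 0x17/0x18 and 0x30/0x31: KiB above 1 MiB */
    uint16_t above_16m;  /* registers 0x34/0x35: 64 KiB units above 16 MiB */
    uint32_t above_4g;   /* registers 0x5b..0x5d: 64 KiB units above 4 GiB;
                          * the patch writes three bytes, so only the low
                          * 24 bits reach the guest */
};

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

static struct cmos_mem cmos_mem_values(uint64_t below_4g, uint64_t above_4g)
{
    struct cmos_mem c;

    /* Base memory is capped at the traditional 640 KiB. */
    c.base_kib = (uint16_t)min_u64(below_4g / KiB, 640);

    /* Extended memory saturates at 65535 KiB. */
    c.ext_kib = below_4g > 1 * MiB
        ? (uint16_t)min_u64((below_4g - 1 * MiB) / KiB, 65535) : 0;

    /* Memory between 16 MiB and 4 GiB, in 64 KiB units, saturating. */
    c.above_16m = below_4g > 16 * MiB
        ? (uint16_t)min_u64((below_4g - 16 * MiB) / (64 * KiB), 65535) : 0;

    c.above_4g = (uint32_t)(above_4g / (64 * KiB));

    assert(c.base_kib <= 640);
    return c;
}
```

For a 512 MiB guest, base_kib is 640, ext_kib saturates at 65535,
above_16m is 7936 (496 MiB in 64 KiB units) and above_4g is 0.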

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 4229900f57..4cc64dafa2 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -28,3 +28,4 @@
 CONFIG_ISAPC=y
 CONFIG_I440FX=y
 CONFIG_Q35=y
+CONFIG_MICROVM=y
diff --git a/include/hw/i386/microvm.h b/include/hw/i386/microvm.h
new file mode 100644
index 0000000000..faaa2e60b8
--- /dev/null
+++ b/include/hw/i386/microvm.h
@@ -0,0 +1,83 @@
+/*
+ * Copyright (c) 2018 Intel Corporation
+ * Copyright (c) 2019 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef HW_I386_MICROVM_H
+#define HW_I386_MICROVM_H
+
+#include "qemu-common.h"
+#include "exec/hwaddr.h"
+#include "qemu/notify.h"
+
+#include "hw/boards.h"
+#include "hw/i386/x86.h"
+
+/* Microvm memory layout */
+#define PVH_START_INFO        0x6000
+#define MEMMAP_START          0x7000
+#define MODLIST_START         0x7800
+#define BOOT_STACK_POINTER    0x8ff0
+#define PML4_START            0x9000
+#define PDPTE_START           0xa000
+#define PDE_START             0xb000
+#define KERNEL_CMDLINE_START  0x20000
+#define EBDA_START            0x9fc00
+#define HIMEM_START           0x100000
+
+/* Platform virtio definitions */
+#define VIRTIO_MMIO_BASE      0xc0000000
+#define VIRTIO_IRQ_BASE       5
+#define VIRTIO_NUM_TRANSPORTS 8
+#define VIRTIO_CMDLINE_MAXLEN 64
+
+/* Machine type options */
+#define MICROVM_MACHINE_PIT                 "pit"
+#define MICROVM_MACHINE_PIC                 "pic"
+#define MICROVM_MACHINE_RTC                 "rtc"
+#define MICROVM_MACHINE_ISA_SERIAL          "isa-serial"
+#define MICROVM_MACHINE_OPTION_ROMS         "x-option-roms"
+#define MICROVM_MACHINE_AUTO_KERNEL_CMDLINE "auto-kernel-cmdline"
+
+typedef struct {
+    X86MachineClass parent;
+    HotplugHandler *(*orig_hotplug_handler)(MachineState *machine,
+                                           DeviceState *dev);
+} MicrovmMachineClass;
+
+typedef struct {
+    X86MachineState parent;
+
+    /* Machine type options */
+    OnOffAuto pic;
+    OnOffAuto pit;
+    OnOffAuto rtc;
+    bool isa_serial;
+    bool option_roms;
+    bool auto_kernel_cmdline;
+
+    /* Machine state */
+    bool kernel_cmdline_fixed;
+} MicrovmMachineState;
+
+#define TYPE_MICROVM_MACHINE   MACHINE_TYPE_NAME("microvm")
+#define MICROVM_MACHINE(obj) \
+    OBJECT_CHECK(MicrovmMachineState, (obj), TYPE_MICROVM_MACHINE)
+#define MICROVM_MACHINE_GET_CLASS(obj) \
+    OBJECT_GET_CLASS(MicrovmMachineClass, obj, TYPE_MICROVM_MACHINE)
+#define MICROVM_MACHINE_CLASS(class) \
+    OBJECT_CLASS_CHECK(MicrovmMachineClass, class, TYPE_MICROVM_MACHINE)
+
+#endif
diff --git a/hw/i386/microvm.c b/hw/i386/microvm.c
new file mode 100644
index 0000000000..cef3ced9c3
--- /dev/null
+++ b/hw/i386/microvm.c
@@ -0,0 +1,574 @@
+/*
+ * Copyright (c) 2018 Intel Corporation
+ * Copyright (c) 2019 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2 or later, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "qemu/osdep.h"
+#include "qemu/error-report.h"
+#include "qemu/cutils.h"
+#include "qemu/units.h"
+#include "qapi/error.h"
+#include "qapi/visitor.h"
+#include "qapi/qapi-visit-common.h"
+#include "sysemu/sysemu.h"
+#include "sysemu/cpus.h"
+#include "sysemu/numa.h"
+#include "sysemu/reset.h"
+
+#include "hw/loader.h"
+#include "hw/irq.h"
+#include "hw/kvm/clock.h"
+#include "hw/i386/microvm.h"
+#include "hw/i386/x86.h"
+#include "hw/i386/pc.h"
+#include "target/i386/cpu.h"
+#include "hw/timer/i8254.h"
+#include "hw/timer/mc146818rtc.h"
+#include "hw/char/serial.h"
+#include "hw/i386/topology.h"
+#include "hw/i386/e820_memory_layout.h"
+#include "hw/i386/fw_cfg.h"
+#include "hw/virtio/virtio-mmio.h"
+
+#include "cpu.h"
+#include "elf.h"
+#include "kvm_i386.h"
+#include "hw/xen/start_info.h"
+
+#define MICROVM_BIOS_FILENAME "bios-microvm.bin"
+
+static void microvm_set_rtc(MicrovmMachineState *mms, ISADevice *s)
+{
+    X86MachineState *x86ms = X86_MACHINE(mms);
+    int val;
+
+    val = MIN(x86ms->below_4g_mem_size / KiB, 640);
+    rtc_set_memory(s, 0x15, val);
+    rtc_set_memory(s, 0x16, val >> 8);
+    /* extended memory (next 64MiB) */
+    if (x86ms->below_4g_mem_size > 1 * MiB) {
+        val = (x86ms->below_4g_mem_size - 1 * MiB) / KiB;
+    } else {
+        val = 0;
+    }
+    if (val > 65535) {
+        val = 65535;
+    }
+    rtc_set_memory(s, 0x17, val);
+    rtc_set_memory(s, 0x18, val >> 8);
+    rtc_set_memory(s, 0x30, val);
+    rtc_set_memory(s, 0x31, val >> 8);
+    /* memory between 16MiB and 4GiB */
+    if (x86ms->below_4g_mem_size > 16 * MiB) {
+        val = (x86ms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
+    } else {
+        val = 0;
+    }
+    if (val > 65535) {
+        val = 65535;
+    }
+    rtc_set_memory(s, 0x34, val);
+    rtc_set_memory(s, 0x35, val >> 8);
+    /* memory above 4GiB */
+    val = x86ms->above_4g_mem_size / 65536;
+    rtc_set_memory(s, 0x5b, val);
+    rtc_set_memory(s, 0x5c, val >> 8);
+    rtc_set_memory(s, 0x5d, val >> 16);
+}
+
+static void microvm_gsi_handler(void *opaque, int n, int level)
+{
+    GSIState *s = opaque;
+
+    qemu_set_irq(s->ioapic_irq[n], level);
+}
+
+static void microvm_devices_init(MicrovmMachineState *mms)
+{
+    X86MachineState *x86ms = X86_MACHINE(mms);
+    ISABus *isa_bus;
+    ISADevice *rtc_state;
+    GSIState *gsi_state;
+    int i;
+
+    /* Core components */
+
+    gsi_state = g_malloc0(sizeof(*gsi_state));
+    if (mms->pic == ON_OFF_AUTO_ON || mms->pic == ON_OFF_AUTO_AUTO) {
+        x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
+    } else {
+        x86ms->gsi = qemu_allocate_irqs(microvm_gsi_handler,
+                                        gsi_state, GSI_NUM_PINS);
+    }
+
+    isa_bus = isa_bus_new(NULL, get_system_memory(), get_system_io(),
+                          &error_abort);
+    isa_bus_irqs(isa_bus, x86ms->gsi);
+
+    ioapic_init_gsi(gsi_state, "machine");
+
+    kvmclock_create();
+
+    for (i = 0; i < VIRTIO_NUM_TRANSPORTS; i++) {
+        sysbus_create_simple("virtio-mmio",
+                             VIRTIO_MMIO_BASE + i * 512,
+                             x86ms->gsi[VIRTIO_IRQ_BASE + i]);
+    }
+
+    /* Optional and legacy devices */
+
+    if (mms->pic == ON_OFF_AUTO_ON || mms->pic == ON_OFF_AUTO_AUTO) {
+        qemu_irq *i8259;
+
+        i8259 = i8259_init(isa_bus, pc_allocate_cpu_irq());
+        for (i = 0; i < ISA_NUM_IRQS; i++) {
+            gsi_state->i8259_irq[i] = i8259[i];
+        }
+        g_free(i8259);
+    }
+
+    if (mms->pit == ON_OFF_AUTO_ON || mms->pit == ON_OFF_AUTO_AUTO) {
+        if (kvm_pit_in_kernel()) {
+            kvm_pit_init(isa_bus, 0x40);
+        } else {
+            i8254_pit_init(isa_bus, 0x40, 0, NULL);
+        }
+    }
+
+    if (mms->rtc == ON_OFF_AUTO_ON ||
+        (mms->rtc == ON_OFF_AUTO_AUTO && !kvm_enabled())) {
+        rtc_state = mc146818_rtc_init(isa_bus, 2000, NULL);
+        microvm_set_rtc(mms, rtc_state);
+    }
+
+    if (mms->isa_serial) {
+        serial_hds_isa_init(isa_bus, 0, 1);
+    }
+
+    if (bios_name == NULL) {
+        bios_name = MICROVM_BIOS_FILENAME;
+    }
+    x86_bios_rom_init(get_system_memory(), true);
+}
+
+static void microvm_memory_init(MicrovmMachineState *mms)
+{
+    MachineState *machine = MACHINE(mms);
+    X86MachineState *x86ms = X86_MACHINE(mms);
+    MemoryRegion *ram, *ram_below_4g, *ram_above_4g;
+    MemoryRegion *system_memory = get_system_memory();
+    FWCfgState *fw_cfg;
+    ram_addr_t lowmem;
+    int i;
+
+    /*
+     * Check whether RAM fits below 4G (leaving 1/2 GByte for IO memory
+     * and 256 Mbytes for PCI Express Enhanced Configuration Access Mapping
+     * also known as MMCFG).
+     * If it doesn't, we need to split it in chunks below and above 4G.
+     * In any case, try to make sure that guest addresses aligned at
+     * 1G boundaries get mapped to host addresses aligned at 1G boundaries.
+     */
+    if (machine->ram_size >= 0xb0000000) {
+        lowmem = 0x80000000;
+    } else {
+        lowmem = 0xb0000000;
+    }
+
+    /*
+     * Handle the machine opt max-ram-below-4g.  It is basically doing
+     * min(qemu limit, user limit).
+     */
+    if (!x86ms->max_ram_below_4g) {
+        x86ms->max_ram_below_4g = 4 * GiB;
+    }
+    if (lowmem > x86ms->max_ram_below_4g) {
+        lowmem = x86ms->max_ram_below_4g;
+        if (machine->ram_size - lowmem > lowmem &&
+            lowmem & (1 * GiB - 1)) {
+            warn_report("There is possibly poor performance as the ram size"
+                        " (0x%" PRIx64 ") is more than twice the size of"
+                        " max-ram-below-4g (%"PRIu64") and"
+                        " max-ram-below-4g is not a multiple of 1G.",
+                        (uint64_t)machine->ram_size, x86ms->max_ram_below_4g);
+        }
+    }
+
+    if (machine->ram_size > lowmem) {
+        x86ms->above_4g_mem_size = machine->ram_size - lowmem;
+        x86ms->below_4g_mem_size = lowmem;
+    } else {
+        x86ms->above_4g_mem_size = 0;
+        x86ms->below_4g_mem_size = machine->ram_size;
+    }
+
+    ram = g_malloc(sizeof(*ram));
+    memory_region_allocate_system_memory(ram, NULL, "microvm.ram",
+                                         machine->ram_size);
+
+    ram_below_4g = g_malloc(sizeof(*ram_below_4g));
+    memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
+                             0, x86ms->below_4g_mem_size);
+    memory_region_add_subregion(system_memory, 0, ram_below_4g);
+
+    e820_add_entry(0, x86ms->below_4g_mem_size, E820_RAM);
+
+    if (x86ms->above_4g_mem_size > 0) {
+        ram_above_4g = g_malloc(sizeof(*ram_above_4g));
+        memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
+                                 x86ms->below_4g_mem_size,
+                                 x86ms->above_4g_mem_size);
+        memory_region_add_subregion(system_memory, 0x100000000ULL,
+                                    ram_above_4g);
+        e820_add_entry(0x100000000ULL, x86ms->above_4g_mem_size, E820_RAM);
+    }
+
+    fw_cfg = fw_cfg_init_io_dma(FW_CFG_IO_BASE, FW_CFG_IO_BASE + 4,
+                                &address_space_memory);
+
+    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, machine->smp.cpus);
+    fw_cfg_add_i16(fw_cfg, FW_CFG_MAX_CPUS, machine->smp.max_cpus);
+    fw_cfg_add_i64(fw_cfg, FW_CFG_RAM_SIZE, (uint64_t)machine->ram_size);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_IRQ0_OVERRIDE, kvm_allows_irq0_override());
+    fw_cfg_add_bytes(fw_cfg, FW_CFG_E820_TABLE,
+                     &e820_reserve, sizeof(e820_reserve));
+    fw_cfg_add_file(fw_cfg, "etc/e820", e820_table,
+                    sizeof(struct e820_entry) * e820_get_num_entries());
+
+    rom_set_fw(fw_cfg);
+
+    x86_load_linux(x86ms, fw_cfg, 0, true, true);
+
+    if (mms->option_roms) {
+        for (i = 0; i < nb_option_roms; i++) {
+            rom_add_option(option_rom[i].name, option_rom[i].bootindex);
+        }
+    }
+
+    x86ms->fw_cfg = fw_cfg;
+    x86ms->ioapic_as = &address_space_memory;
+}
+
+static gchar *microvm_get_mmio_cmdline(gchar *name)
+{
+    gchar *cmdline;
+    gchar *separator;
+    long int index;
+    int ret;
+
+    separator = g_strrstr(name, ".");
+    if (!separator) {
+        return NULL;
+    }
+
+    if (qemu_strtol(separator + 1, NULL, 10, &index) != 0) {
+        return NULL;
+    }
+
+    cmdline = g_malloc0(VIRTIO_CMDLINE_MAXLEN);
+    ret = g_snprintf(cmdline, VIRTIO_CMDLINE_MAXLEN,
+                     " virtio_mmio.device=512@0x%lx:%ld",
+                     VIRTIO_MMIO_BASE + index * 512,
+                     VIRTIO_IRQ_BASE + index);
+    if (ret < 0 || ret >= VIRTIO_CMDLINE_MAXLEN) {
+        g_free(cmdline);
+        return NULL;
+    }
+
+    return cmdline;
+}
+
+static void microvm_fix_kernel_cmdline(MachineState *machine)
+{
+    X86MachineState *x86ms = X86_MACHINE(machine);
+    BusState *bus;
+    BusChild *kid;
+    char *cmdline;
+
+    /*
+     * Find MMIO transports with attached devices, and add them to the kernel
+     * command line.
+     *
+     * Yes, this is a hack, but one that heavily improves the UX without
+     * introducing any significant issues.
+     */
+    cmdline = g_strdup(machine->kernel_cmdline);
+    bus = sysbus_get_default();
+    QTAILQ_FOREACH(kid, &bus->children, sibling) {
+        DeviceState *dev = kid->child;
+        ObjectClass *class = object_get_class(OBJECT(dev));
+
+        if (class == object_class_by_name(TYPE_VIRTIO_MMIO)) {
+            VirtIOMMIOProxy *mmio = VIRTIO_MMIO(OBJECT(dev));
+            VirtioBusState *mmio_virtio_bus = &mmio->bus;
+            BusState *mmio_bus = &mmio_virtio_bus->parent_obj;
+
+            if (!QTAILQ_EMPTY(&mmio_bus->children)) {
+                gchar *mmio_cmdline = microvm_get_mmio_cmdline(mmio_bus->name);
+                if (mmio_cmdline) {
+                    char *newcmd = g_strjoin(NULL, cmdline, mmio_cmdline, NULL);
+                    g_free(mmio_cmdline);
+                    g_free(cmdline);
+                    cmdline = newcmd;
+                }
+            }
+        }
+    }
+
+    fw_cfg_modify_i32(x86ms->fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(cmdline) + 1);
+    fw_cfg_modify_string(x86ms->fw_cfg, FW_CFG_CMDLINE_DATA, cmdline);
+}
+
+static void microvm_machine_state_init(MachineState *machine)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(machine);
+    X86MachineState *x86ms = X86_MACHINE(machine);
+    Error *local_err = NULL;
+
+    if (machine->kernel_filename == NULL) {
+        error_report("missing kernel image file name, required by microvm");
+        exit(1);
+    }
+
+    microvm_memory_init(mms);
+
+    x86_cpus_init(x86ms, CPU_VERSION_LATEST);
+    if (local_err) {
+        error_report_err(local_err);
+        exit(1);
+    }
+
+    microvm_devices_init(mms);
+}
+
+static void microvm_machine_reset(MachineState *machine)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(machine);
+    CPUState *cs;
+    X86CPU *cpu;
+
+    if (mms->auto_kernel_cmdline && !mms->kernel_cmdline_fixed) {
+        microvm_fix_kernel_cmdline(machine);
+        mms->kernel_cmdline_fixed = true;
+    }
+
+    qemu_devices_reset();
+
+    CPU_FOREACH(cs) {
+        cpu = X86_CPU(cs);
+
+        if (cpu->apic_state) {
+            device_reset(cpu->apic_state);
+        }
+    }
+}
+
+static void microvm_machine_get_pic(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+    OnOffAuto pic = mms->pic;
+
+    visit_type_OnOffAuto(v, name, &pic, errp);
+}
+
+static void microvm_machine_set_pic(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    visit_type_OnOffAuto(v, name, &mms->pic, errp);
+}
+
+static void microvm_machine_get_pit(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+    OnOffAuto pit = mms->pit;
+
+    visit_type_OnOffAuto(v, name, &pit, errp);
+}
+
+static void microvm_machine_set_pit(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    visit_type_OnOffAuto(v, name, &mms->pit, errp);
+}
+
+static void microvm_machine_get_rtc(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+    OnOffAuto rtc = mms->rtc;
+
+    visit_type_OnOffAuto(v, name, &rtc, errp);
+}
+
+static void microvm_machine_set_rtc(Object *obj, Visitor *v, const char *name,
+                                    void *opaque, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    visit_type_OnOffAuto(v, name, &mms->rtc, errp);
+}
+
+static bool microvm_machine_get_isa_serial(Object *obj, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    return mms->isa_serial;
+}
+
+static void microvm_machine_set_isa_serial(Object *obj, bool value,
+                                           Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    mms->isa_serial = value;
+}
+
+static bool microvm_machine_get_option_roms(Object *obj, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    return mms->option_roms;
+}
+
+static void microvm_machine_set_option_roms(Object *obj, bool value,
+                                            Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    mms->option_roms = value;
+}
+
+static bool microvm_machine_get_auto_kernel_cmdline(Object *obj, Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    return mms->auto_kernel_cmdline;
+}
+
+static void microvm_machine_set_auto_kernel_cmdline(Object *obj, bool value,
+                                                    Error **errp)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    mms->auto_kernel_cmdline = value;
+}
+
+static void microvm_machine_initfn(Object *obj)
+{
+    MicrovmMachineState *mms = MICROVM_MACHINE(obj);
+
+    /* Configuration */
+    mms->pic = ON_OFF_AUTO_AUTO;
+    mms->pit = ON_OFF_AUTO_AUTO;
+    mms->rtc = ON_OFF_AUTO_AUTO;
+    mms->isa_serial = true;
+    mms->option_roms = true;
+    mms->auto_kernel_cmdline = true;
+
+    /* State */
+    mms->kernel_cmdline_fixed = false;
+}
+
+static void microvm_class_init(ObjectClass *oc, void *data)
+{
+    MachineClass *mc = MACHINE_CLASS(oc);
+
+    mc->init = microvm_machine_state_init;
+
+    mc->family = "microvm_i386";
+    mc->desc = "Microvm (i386)";
+    mc->units_per_default_bus = 1;
+    mc->no_floppy = 1;
+    mc->max_cpus = 288;
+    mc->has_hotpluggable_cpus = false;
+    mc->auto_enable_numa_with_memhp = false;
+    mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
+    mc->nvdimm_supported = false;
+
+    /* Avoid relying too much on kernel components */
+    mc->default_kernel_irqchip_split = true;
+
+    /* Machine class handlers */
+    mc->reset = microvm_machine_reset;
+
+    object_class_property_add(oc, MICROVM_MACHINE_PIC, "OnOffAuto",
+                              microvm_machine_get_pic,
+                              microvm_machine_set_pic,
+                              NULL, NULL, &error_abort);
+    object_class_property_set_description(oc, MICROVM_MACHINE_PIC,
+        "Enable i8259 PIC", &error_abort);
+
+    object_class_property_add(oc, MICROVM_MACHINE_PIT, "OnOffAuto",
+                              microvm_machine_get_pit,
+                              microvm_machine_set_pit,
+                              NULL, NULL, &error_abort);
+    object_class_property_set_description(oc, MICROVM_MACHINE_PIT,
+        "Enable i8254 PIT", &error_abort);
+
+    object_class_property_add(oc, MICROVM_MACHINE_RTC, "OnOffAuto",
+                              microvm_machine_get_rtc,
+                              microvm_machine_set_rtc,
+                              NULL, NULL, &error_abort);
+    object_class_property_set_description(oc, MICROVM_MACHINE_RTC,
+        "Enable MC146818 RTC", &error_abort);
+
+    object_class_property_add_bool(oc, MICROVM_MACHINE_ISA_SERIAL,
+                                   microvm_machine_get_isa_serial,
+                                   microvm_machine_set_isa_serial,
+                                   &error_abort);
+    object_class_property_set_description(oc, MICROVM_MACHINE_ISA_SERIAL,
+        "Set off to disable the instantiation of an ISA serial port",
+        &error_abort);
+
+    object_class_property_add_bool(oc, MICROVM_MACHINE_OPTION_ROMS,
+                                   microvm_machine_get_option_roms,
+                                   microvm_machine_set_option_roms,
+                                   &error_abort);
+    object_class_property_set_description(oc, MICROVM_MACHINE_OPTION_ROMS,
+        "Set off to disable loading option ROMs", &error_abort);
+
+    object_class_property_add_bool(oc, MICROVM_MACHINE_AUTO_KERNEL_CMDLINE,
+                                   microvm_machine_get_auto_kernel_cmdline,
+                                   microvm_machine_set_auto_kernel_cmdline,
+                                   &error_abort);
+    object_class_property_set_description(oc,
+        MICROVM_MACHINE_AUTO_KERNEL_CMDLINE,
+        "Set off to disable adding virtio-mmio devices to the kernel cmdline",
+        &error_abort);
+}
+
+static const TypeInfo microvm_machine_info = {
+    .name          = TYPE_MICROVM_MACHINE,
+    .parent        = TYPE_X86_MACHINE,
+    .instance_size = sizeof(MicrovmMachineState),
+    .instance_init = microvm_machine_initfn,
+    .class_size    = sizeof(MicrovmMachineClass),
+    .class_init    = microvm_class_init,
+    .interfaces = (InterfaceInfo[]) {
+         { }
+    },
+};
+
+static void microvm_machine_init(void)
+{
+    type_register_static(&microvm_machine_info);
+}
+type_init(microvm_machine_init);
diff --git a/hw/i386/Kconfig b/hw/i386/Kconfig
index c5c9d4900e..d399dcba52 100644
--- a/hw/i386/Kconfig
+++ b/hw/i386/Kconfig
@@ -92,6 +92,10 @@ config Q35
     select SMBIOS
     select FW_CFG_DMA
 
+config MICROVM
+    bool
+    select VIRTIO_MMIO
+
 config VTD
     bool
 
diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
index 7ed80a4853..0d195b5210 100644
--- a/hw/i386/Makefile.objs
+++ b/hw/i386/Makefile.objs
@@ -4,6 +4,7 @@ obj-y += x86.o
 obj-y += pc.o
 obj-$(CONFIG_I440FX) += pc_piix.o
 obj-$(CONFIG_Q35) += pc_q35.o
+obj-$(CONFIG_MICROVM) += microvm.o
 obj-y += fw_cfg.o pc_sysfw.o
 obj-y += x86-iommu.o
 obj-$(CONFIG_VTD) += intel_iommu.o
-- 
2.21.0




* Re: [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers
  2019-10-04  9:37 ` [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
@ 2019-10-04  9:43   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-04  9:43 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: ehabkost, mst, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

Hi Sergio,

On 10/4/19 11:37 AM, Sergio Lopez wrote:
> Put QOM and main struct definition in a separate header file, so it
> can be accessed from other components.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>

Please collect and keep reviewer tags between iterations; this saves
reviewers time. Only drop them if you want the reviewer to carefully
review your patch again.

Only cosmetic changes since v5, which had:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
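
The header quoted below annotates VIRT_MAGIC and VIRT_VENDOR with the
comments 'virt' and 'QEMU'. As a minimal sketch of how those comments
relate to the values, assuming the 32-bit registers are read
least-significant byte first, the check below decodes each constant
into its four ASCII bytes. The helper names le32_to_str() and
virt_ids_match_comments() are hypothetical and are not part of QEMU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Values copied from the virtio-mmio.h header quoted in this message. */
#define VIRT_MAGIC  0x74726976 /* 'virt' */
#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */

/*
 * Hypothetical helper (not part of QEMU): write the four bytes of v,
 * least-significant byte first, into out and NUL-terminate the result.
 */
static void le32_to_str(uint32_t v, char out[5])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (char)((v >> (8 * i)) & 0xff);
    }
    out[4] = '\0';
}

/* Returns 1 if both constants decode to the text given in their comments. */
static int virt_ids_match_comments(void)
{
    char magic[5], vendor[5];

    le32_to_str(VIRT_MAGIC, magic);
    le32_to_str(VIRT_VENDOR, vendor);
    return strcmp(magic, "virt") == 0 && strcmp(vendor, "QEMU") == 0;
}
```

Shifting instead of casting the integer to a byte pointer keeps the
check independent of the host's byte order.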

> ---
>   include/hw/virtio/virtio-mmio.h | 73 +++++++++++++++++++++++++++++++++
>   hw/virtio/virtio-mmio.c         | 48 +---------------------
>   2 files changed, 74 insertions(+), 47 deletions(-)
>   create mode 100644 include/hw/virtio/virtio-mmio.h
> 
> diff --git a/include/hw/virtio/virtio-mmio.h b/include/hw/virtio/virtio-mmio.h
> new file mode 100644
> index 0000000000..7dbfd03dcf
> --- /dev/null
> +++ b/include/hw/virtio/virtio-mmio.h
> @@ -0,0 +1,73 @@
> +/*
> + * Virtio MMIO bindings
> + *
> + * Copyright (c) 2011 Linaro Limited
> + *
> + * Author:
> + *  Peter Maydell <peter.maydell@linaro.org>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License; either version 2
> + * of the License, or (at your option) any later version.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License along
> + * with this program; if not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_VIRTIO_MMIO_H
> +#define HW_VIRTIO_MMIO_H
> +
> +#include "hw/virtio/virtio-bus.h"
> +
> +/* QOM macros */
> +/* virtio-mmio-bus */
> +#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
> +#define VIRTIO_MMIO_BUS(obj) \
> +        OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
> +#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
> +        OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
> +#define VIRTIO_MMIO_BUS_CLASS(klass) \
> +        OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
> +
> +/* virtio-mmio */
> +#define TYPE_VIRTIO_MMIO "virtio-mmio"
> +#define VIRTIO_MMIO(obj) \
> +        OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
> +
> +#define VIRT_MAGIC 0x74726976 /* 'virt' */
> +#define VIRT_VERSION 2
> +#define VIRT_VERSION_LEGACY 1
> +#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
> +
> +typedef struct VirtIOMMIOQueue {
> +    uint16_t num;
> +    bool enabled;
> +    uint32_t desc[2];
> +    uint32_t avail[2];
> +    uint32_t used[2];
> +} VirtIOMMIOQueue;
> +
> +typedef struct {
> +    /* Generic */
> +    SysBusDevice parent_obj;
> +    MemoryRegion iomem;
> +    qemu_irq irq;
> +    bool legacy;
> +    /* Guest accessible state needing migration and reset */
> +    uint32_t host_features_sel;
> +    uint32_t guest_features_sel;
> +    uint32_t guest_page_shift;
> +    /* virtio-bus */
> +    VirtioBusState bus;
> +    bool format_transport_address;
> +    /* Fields only used for non-legacy (v2) devices */
> +    uint32_t guest_features[2];
> +    VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
> +} VirtIOMMIOProxy;
> +
> +#endif
> diff --git a/hw/virtio/virtio-mmio.c b/hw/virtio/virtio-mmio.c
> index 3d5ca0f667..94d934c44b 100644
> --- a/hw/virtio/virtio-mmio.c
> +++ b/hw/virtio/virtio-mmio.c
> @@ -29,57 +29,11 @@
>   #include "qemu/host-utils.h"
>   #include "qemu/module.h"
>   #include "sysemu/kvm.h"
> -#include "hw/virtio/virtio-bus.h"
> +#include "hw/virtio/virtio-mmio.h"
>   #include "qemu/error-report.h"
>   #include "qemu/log.h"
>   #include "trace.h"
>   
> -/* QOM macros */
> -/* virtio-mmio-bus */
> -#define TYPE_VIRTIO_MMIO_BUS "virtio-mmio-bus"
> -#define VIRTIO_MMIO_BUS(obj) \
> -        OBJECT_CHECK(VirtioBusState, (obj), TYPE_VIRTIO_MMIO_BUS)
> -#define VIRTIO_MMIO_BUS_GET_CLASS(obj) \
> -        OBJECT_GET_CLASS(VirtioBusClass, (obj), TYPE_VIRTIO_MMIO_BUS)
> -#define VIRTIO_MMIO_BUS_CLASS(klass) \
> -        OBJECT_CLASS_CHECK(VirtioBusClass, (klass), TYPE_VIRTIO_MMIO_BUS)
> -
> -/* virtio-mmio */
> -#define TYPE_VIRTIO_MMIO "virtio-mmio"
> -#define VIRTIO_MMIO(obj) \
> -        OBJECT_CHECK(VirtIOMMIOProxy, (obj), TYPE_VIRTIO_MMIO)
> -
> -#define VIRT_MAGIC 0x74726976 /* 'virt' */
> -#define VIRT_VERSION 2
> -#define VIRT_VERSION_LEGACY 1
> -#define VIRT_VENDOR 0x554D4551 /* 'QEMU' */
> -
> -typedef struct VirtIOMMIOQueue {
> -    uint16_t num;
> -    bool enabled;
> -    uint32_t desc[2];
> -    uint32_t avail[2];
> -    uint32_t used[2];
> -} VirtIOMMIOQueue;
> -
> -typedef struct {
> -    /* Generic */
> -    SysBusDevice parent_obj;
> -    MemoryRegion iomem;
> -    qemu_irq irq;
> -    bool legacy;
> -    /* Guest accessible state needing migration and reset */
> -    uint32_t host_features_sel;
> -    uint32_t guest_features_sel;
> -    uint32_t guest_page_shift;
> -    /* virtio-bus */
> -    VirtioBusState bus;
> -    bool format_transport_address;
> -    /* Fields only used for non-legacy (v2) devices */
> -    uint32_t guest_features[2];
> -    VirtIOMMIOQueue vqs[VIRTIO_QUEUE_MAX];
> -} VirtIOMMIOProxy;
> -
>   static bool virtio_mmio_ioeventfd_enabled(DeviceState *d)
>   {
>       return kvm_eventfds_enabled();
> 



* Re: [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines
  2019-10-04  9:37 ` [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines Sergio Lopez
@ 2019-10-04  9:46   ` Philippe Mathieu-Daudé
  2019-10-04 11:24   ` Stefano Garzarella
  1 sibling, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-04  9:46 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: ehabkost, mst, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

On 10/4/19 11:37 AM, Sergio Lopez wrote:
> The following functions are named *pc* but are not specific to the PC
> machine; they are generic to the x86 architecture. Rename them:
> 
>    load_linux                 -> x86_load_linux
>    pc_new_cpu                 -> x86_cpu_new
>    pc_cpus_init               -> x86_cpus_init
>    pc_cpu_index_to_props      -> x86_cpu_index_to_props
>    pc_get_default_cpu_node_id -> x86_get_default_cpu_node_id
>    pc_possible_cpu_arch_ids   -> x86_possible_cpu_arch_ids
>    old_pc_system_rom_init     -> x86_bios_rom_init
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---
>   include/hw/i386/pc.h |  2 +-
>   hw/i386/pc.c         | 28 ++++++++++++++--------------
>   hw/i386/pc_piix.c    |  2 +-
>   hw/i386/pc_q35.c     |  2 +-
>   hw/i386/pc_sysfw.c   |  6 +++---
>   5 files changed, 20 insertions(+), 20 deletions(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 6df4f4b6fb..d12f42e9e5 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -195,7 +195,7 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
>   void pc_register_ferr_irq(qemu_irq irq);
>   void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
>   
> -void pc_cpus_init(PCMachineState *pcms);
> +void x86_cpus_init(PCMachineState *pcms);
>   void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
>   void pc_smp_parse(MachineState *ms, QemuOpts *opts);
>   
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index bcda50efcc..fd08c6704b 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -1019,8 +1019,8 @@ static bool load_elfboot(const char *kernel_filename,
>       return true;
>   }
>   
> -static void load_linux(PCMachineState *pcms,
> -                       FWCfgState *fw_cfg)
> +static void x86_load_linux(PCMachineState *pcms,
> +                           FWCfgState *fw_cfg)
>   {
>       uint16_t protocol;
>       int setup_size, kernel_size, cmdline_size;
> @@ -1374,7 +1374,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>       }
>   }
>   
> -static void pc_new_cpu(PCMachineState *pcms, int64_t apic_id, Error **errp)
> +static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
>   {
>       Object *cpu = NULL;
>       Error *local_err = NULL;
> @@ -1490,14 +1490,14 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>           return;
>       }
>   
> -    pc_new_cpu(PC_MACHINE(ms), apic_id, &local_err);
> +    x86_cpu_new(PC_MACHINE(ms), apic_id, &local_err);
>       if (local_err) {
>           error_propagate(errp, local_err);
>           return;
>       }
>   }
>   
> -void pc_cpus_init(PCMachineState *pcms)
> +void x86_cpus_init(PCMachineState *pcms)
>   {
>       int i;
>       const CPUArchIdList *possible_cpus;
> @@ -1518,7 +1518,7 @@ void pc_cpus_init(PCMachineState *pcms)
>                                                        ms->smp.max_cpus - 1) + 1;
>       possible_cpus = mc->possible_cpu_arch_ids(ms);
>       for (i = 0; i < ms->smp.cpus; i++) {
> -        pc_new_cpu(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> +        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
>       }
>   }
>   
> @@ -1621,7 +1621,7 @@ void xen_load_linux(PCMachineState *pcms)
>       fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
>       rom_set_fw(fw_cfg);
>   
> -    load_linux(pcms, fw_cfg);
> +    x86_load_linux(pcms, fw_cfg);
>       for (i = 0; i < nb_option_roms; i++) {
>           assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
>                  !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
> @@ -1756,7 +1756,7 @@ void pc_memory_init(PCMachineState *pcms,
>       }
>   
>       if (linux_boot) {
> -        load_linux(pcms, fw_cfg);
> +        x86_load_linux(pcms, fw_cfg);
>       }
>   
>       for (i = 0; i < nb_option_roms; i++) {
> @@ -2678,7 +2678,7 @@ static void pc_machine_wakeup(MachineState *machine)
>   }
>   
>   static CpuInstanceProperties
> -pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>   {
>       MachineClass *mc = MACHINE_GET_CLASS(ms);
>       const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> @@ -2687,7 +2687,7 @@ pc_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>       return possible_cpus->cpus[cpu_index].props;
>   }
>   
> -static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
> +static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
>   {
>      X86CPUTopoInfo topo;
>      PCMachineState *pcms = PC_MACHINE(ms);
> @@ -2699,7 +2699,7 @@ static int64_t pc_get_default_cpu_node_id(const MachineState *ms, int idx)
>      return topo.pkg_id % ms->numa_state->num_nodes;
>   }
>   
> -static const CPUArchIdList *pc_possible_cpu_arch_ids(MachineState *ms)
> +static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>   {
>       PCMachineState *pcms = PC_MACHINE(ms);
>       int i;
> @@ -2801,9 +2801,9 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>       assert(!mc->get_hotplug_handler);
>       mc->get_hotplug_handler = pc_get_hotplug_handler;
>       mc->hotplug_allowed = pc_hotplug_allowed;
> -    mc->cpu_index_to_instance_props = pc_cpu_index_to_props;
> -    mc->get_default_cpu_node_id = pc_get_default_cpu_node_id;
> -    mc->possible_cpu_arch_ids = pc_possible_cpu_arch_ids;
> +    mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
> +    mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
> +    mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
>       mc->auto_enable_numa_with_memhp = true;
>       mc->has_hotpluggable_cpus = true;
>       mc->default_boot_order = "cad";
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 6824b72124..de09e076cd 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -152,7 +152,7 @@ static void pc_init1(MachineState *machine,
>           }
>       }
>   
> -    pc_cpus_init(pcms);
> +    x86_cpus_init(pcms);
>   
>       if (kvm_enabled() && pcmc->kvmclock_enabled) {
>           kvmclock_create();
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 8fad20f314..894989b64e 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -179,7 +179,7 @@ static void pc_q35_init(MachineState *machine)
>           xen_hvm_init(pcms, &ram_memory);
>       }
>   
> -    pc_cpus_init(pcms);
> +    x86_cpus_init(pcms);
>   
>       kvmclock_create();
>   
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index a9983f0bfb..28cb1f63c9 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -211,7 +211,7 @@ static void pc_system_flash_map(PCMachineState *pcms,
>       }
>   }
>   
> -static void old_pc_system_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> +static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
>   {
>       char *filename;
>       MemoryRegion *bios, *isa_bios;
> @@ -272,7 +272,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
>       BlockBackend *pflash_blk[ARRAY_SIZE(pcms->flash)];
>   
>       if (!pcmc->pci_enabled) {
> -        old_pc_system_rom_init(rom_memory, true);
> +        x86_bios_rom_init(rom_memory, true);
>           return;
>       }
>   
> @@ -293,7 +293,7 @@ void pc_system_firmware_init(PCMachineState *pcms,
>   
>       if (!pflash_blk[0]) {
>           /* Machine property pflash0 not set, use ROM mode */
> -        old_pc_system_rom_init(rom_memory, false);
> +        x86_bios_rom_init(rom_memory, false);
>       } else {
>           if (kvm_enabled() && !kvm_readonly_mem_enabled()) {
>               /*
> 



* Re: [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them
  2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
@ 2019-10-04  9:46   ` Philippe Mathieu-Daudé
  2019-10-04 11:23   ` Stefano Garzarella
  2019-10-04 11:36   ` Stefano Garzarella
  2 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-04  9:46 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: ehabkost, mst, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

On 10/4/19 11:37 AM, Sergio Lopez wrote:
> Move x86 functions that will be shared between PC and non-PC machine
> types to x86.c, along with their helpers.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>

Again:
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
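
For readers following the move, the x86_load_linux() code quoted below
relies on three small pieces of arithmetic: rounding the command line
size up to 16 bytes, placing the initrd at a page-aligned address under
initrd_max, and deriving the setup size from the sector count stored at
header offset 0x1f1. The sketch below restates them as standalone
functions. The function names are hypothetical and only mirror the
expressions in the quoted code; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Mirrors "(strlen(kernel_cmdline)+16) & ~15": a multiple of 16 that is
 * always large enough for the string plus its terminating NUL.
 */
static size_t cmdline_alloc_size(size_t cmdline_len)
{
    return (cmdline_len + 16) & ~(size_t)15;
}

/*
 * Mirrors "(initrd_max - initrd_size) & ~4095": a 4 KiB-aligned start
 * address chosen so that the initrd ends at or below initrd_max.
 */
static uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    /* The quoted code reports an error and exits in this case. */
    assert(initrd_size < initrd_max);
    return (initrd_max - initrd_size) & ~(uint32_t)4095;
}

/*
 * Mirrors the setup_size computation: a sector count of 0 is treated
 * as 4, and one extra 512-byte sector is added to the count.
 */
static int setup_bytes(uint8_t setup_sects)
{
    int sects = setup_sects ? setup_sects : 4;

    return (sects + 1) * 512;
}
```

For example, with the fallback initrd_max of 0x37ffffff used for old
boot protocols and a 1 MiB initrd, initrd_load_addr() returns
0x37eff000.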

> ---
>   include/hw/i386/pc.h  |   1 -
>   include/hw/i386/x86.h |  35 +++
>   hw/i386/pc.c          | 582 +----------------------------------
>   hw/i386/pc_piix.c     |   1 +
>   hw/i386/pc_q35.c      |   1 +
>   hw/i386/pc_sysfw.c    |  54 +---
>   hw/i386/x86.c         | 684 ++++++++++++++++++++++++++++++++++++++++++
>   hw/i386/Makefile.objs |   1 +
>   8 files changed, 724 insertions(+), 635 deletions(-)
>   create mode 100644 include/hw/i386/x86.h
>   create mode 100644 hw/i386/x86.c
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index d12f42e9e5..73e2847e87 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -195,7 +195,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
>   void pc_register_ferr_irq(qemu_irq irq);
>   void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
>   
> -void x86_cpus_init(PCMachineState *pcms);
>   void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
>   void pc_smp_parse(MachineState *ms, QemuOpts *opts);
>   
> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> new file mode 100644
> index 0000000000..71e2b6985d
> --- /dev/null
> +++ b/include/hw/i386/x86.h
> @@ -0,0 +1,35 @@
> +/*
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_I386_X86_H
> +#define HW_I386_X86_H
> +
> +#include "hw/boards.h"
> +
> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +                                    unsigned int cpu_index);
> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
> +void x86_cpus_init(PCMachineState *pcms);
> +CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
> +                                             unsigned cpu_index);
> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
> +
> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
> +
> +void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
> +
> +#endif
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index fd08c6704b..094db79fb0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -24,6 +24,7 @@
>   
>   #include "qemu/osdep.h"
>   #include "qemu/units.h"
> +#include "hw/i386/x86.h"
>   #include "hw/i386/pc.h"
>   #include "hw/char/serial.h"
>   #include "hw/char/parallel.h"
> @@ -102,9 +103,6 @@
>   
>   struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
>   
> -/* Physical Address of PVH entry point read from kernel ELF NOTE */
> -static size_t pvh_start_addr;
> -
>   GlobalProperty pc_compat_4_1[] = {};
>   const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
>   
> @@ -866,478 +864,6 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>       x86_cpu_set_a20(cpu, level);
>   }
>   
> -/* Calculates initial APIC ID for a specific CPU index
> - *
> - * Currently we need to be able to calculate the APIC ID from the CPU index
> - * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
> - * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
> - * all CPUs up to max_cpus.
> - */
> -static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> -                                           unsigned int cpu_index)
> -{
> -    MachineState *ms = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    uint32_t correct_id;
> -    static bool warned;
> -
> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> -                                         ms->smp.threads, cpu_index);
> -    if (pcmc->compat_apic_id_mode) {
> -        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> -            error_report("APIC IDs set in compatibility mode, "
> -                         "CPU topology won't match the configuration");
> -            warned = true;
> -        }
> -        return cpu_index;
> -    } else {
> -        return correct_id;
> -    }
> -}
> -
> -static long get_file_size(FILE *f)
> -{
> -    long where, size;
> -
> -    /* XXX: on Unix systems, using fstat() probably makes more sense */
> -
> -    where = ftell(f);
> -    fseek(f, 0, SEEK_END);
> -    size = ftell(f);
> -    fseek(f, where, SEEK_SET);
> -
> -    return size;
> -}
> -
> -struct setup_data {
> -    uint64_t next;
> -    uint32_t type;
> -    uint32_t len;
> -    uint8_t data[0];
> -} __attribute__((packed));
> -
> -
> -/*
> - * The entry point into the kernel for PVH boot is different from
> - * the native entry point.  The PVH entry is defined by the x86/HVM
> - * direct boot ABI and is available in an ELFNOTE in the kernel binary.
> - *
> - * This function is passed to load_elf() when it is called from
> - * load_elfboot() which then additionally checks for an ELF Note of
> - * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
> - * parse the PVH entry address from the ELF Note.
> - *
> - * Due to trickery in elf_opts.h, load_elf() is actually available as
> - * load_elf32() or load_elf64() and this routine needs to be able
> - * to deal with being called as 32 or 64 bit.
> - *
> - * The address of the PVH entry point is saved to the 'pvh_start_addr'
> - * global variable.  (although the entry point is 32-bit, the kernel
> - * binary can be either 32-bit or 64-bit).
> - */
> -static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
> -{
> -    size_t *elf_note_data_addr;
> -
> -    /* Check if ELF Note header passed in is valid */
> -    if (arg1 == NULL) {
> -        return 0;
> -    }
> -
> -    if (is64) {
> -        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
> -        uint64_t nhdr_size64 = sizeof(struct elf64_note);
> -        uint64_t phdr_align = *(uint64_t *)arg2;
> -        uint64_t nhdr_namesz = nhdr64->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr64) + nhdr_size64 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    } else {
> -        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
> -        uint32_t nhdr_size32 = sizeof(struct elf32_note);
> -        uint32_t phdr_align = *(uint32_t *)arg2;
> -        uint32_t nhdr_namesz = nhdr32->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr32) + nhdr_size32 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    }
> -
> -    pvh_start_addr = *elf_note_data_addr;
> -
> -    return pvh_start_addr;
> -}
> -
> -static bool load_elfboot(const char *kernel_filename,
> -                   int kernel_file_size,
> -                   uint8_t *header,
> -                   size_t pvh_xen_start_addr,
> -                   FWCfgState *fw_cfg)
> -{
> -    uint32_t flags = 0;
> -    uint32_t mh_load_addr = 0;
> -    uint32_t elf_kernel_size = 0;
> -    uint64_t elf_entry;
> -    uint64_t elf_low, elf_high;
> -    int kernel_size;
> -
> -    if (ldl_p(header) != 0x464c457f) {
> -        return false; /* no elfboot */
> -    }
> -
> -    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
> -    flags = elf_is64 ?
> -        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
> -
> -    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
> -        error_report("elfboot unsupported flags = %x", flags);
> -        exit(1);
> -    }
> -
> -    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
> -    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
> -                           NULL, &elf_note_type, &elf_entry,
> -                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
> -                           0, 0);
> -
> -    if (kernel_size < 0) {
> -        error_report("Error while loading elf kernel");
> -        exit(1);
> -    }
> -    mh_load_addr = elf_low;
> -    elf_kernel_size = elf_high - elf_low;
> -
> -    if (pvh_start_addr == 0) {
> -        error_report("Error loading uncompressed kernel without PVH ELF Note");
> -        exit(1);
> -    }
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
> -
> -    return true;
> -}
> -
> -static void x86_load_linux(PCMachineState *pcms,
> -                           FWCfgState *fw_cfg)
> -{
> -    uint16_t protocol;
> -    int setup_size, kernel_size, cmdline_size;
> -    int dtb_size, setup_data_offset;
> -    uint32_t initrd_max;
> -    uint8_t header[8192], *setup, *kernel;
> -    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
> -    FILE *f;
> -    char *vmode;
> -    MachineState *machine = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    struct setup_data *setup_data;
> -    const char *kernel_filename = machine->kernel_filename;
> -    const char *initrd_filename = machine->initrd_filename;
> -    const char *dtb_filename = machine->dtb;
> -    const char *kernel_cmdline = machine->kernel_cmdline;
> -
> -    /* Align to 16 bytes as a paranoia measure */
> -    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
> -
> -    /* load the kernel header */
> -    f = fopen(kernel_filename, "rb");
> -    if (!f || !(kernel_size = get_file_size(f)) ||
> -        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
> -        MIN(ARRAY_SIZE(header), kernel_size)) {
> -        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
> -                kernel_filename, strerror(errno));
> -        exit(1);
> -    }
> -
> -    /* kernel protocol version */
> -#if 0
> -    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
> -#endif
> -    if (ldl_p(header+0x202) == 0x53726448) {
> -        protocol = lduw_p(header+0x206);
> -    } else {
> -        /*
> -         * This could be a multiboot kernel. If it is, let's stop treating it
> -         * like a Linux kernel.
> -         * Note: some multiboot images could be in the ELF format (the same of
> -         * PVH), so we try multiboot first since we check the multiboot magic
> -         * header before to load it.
> -         */
> -        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
> -                           kernel_cmdline, kernel_size, header)) {
> -            return;
> -        }
> -        /*
> -         * Check if the file is an uncompressed kernel file (ELF) and load it,
> -         * saving the PVH entry point used by the x86/HVM direct boot ABI.
> -         * If load_elfboot() is successful, populate the fw_cfg info.
> -         */
> -        if (pcmc->pvh_enabled &&
> -            load_elfboot(kernel_filename, kernel_size,
> -                         header, pvh_start_addr, fw_cfg)) {
> -            fclose(f);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
> -                strlen(kernel_cmdline) + 1);
> -            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
> -                             header, sizeof(header));
> -
> -            /* load initrd */
> -            if (initrd_filename) {
> -                GMappedFile *mapped_file;
> -                gsize initrd_size;
> -                gchar *initrd_data;
> -                GError *gerr = NULL;
> -
> -                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -                if (!mapped_file) {
> -                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                            initrd_filename, gerr->message);
> -                    exit(1);
> -                }
> -                pcms->initrd_mapped_file = mapped_file;
> -
> -                initrd_data = g_mapped_file_get_contents(mapped_file);
> -                initrd_size = g_mapped_file_get_length(mapped_file);
> -                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -                if (initrd_size >= initrd_max) {
> -                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                            "(max: %"PRIu32", need %"PRId64")\n",
> -                            initrd_max, (uint64_t)initrd_size);
> -                    exit(1);
> -                }
> -
> -                initrd_addr = (initrd_max - initrd_size) & ~4095;
> -
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> -                                 initrd_size);
> -            }
> -
> -            option_rom[nb_option_roms].bootindex = 0;
> -            option_rom[nb_option_roms].name = "pvh.bin";
> -            nb_option_roms++;
> -
> -            return;
> -        }
> -        protocol = 0;
> -    }
> -
> -    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
> -        /* Low kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x10000;
> -    } else if (protocol < 0x202) {
> -        /* High but ancient kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x100000;
> -    } else {
> -        /* High and recent kernel */
> -        real_addr    = 0x10000;
> -        cmdline_addr = 0x20000;
> -        prot_addr    = 0x100000;
> -    }
> -
> -#if 0
> -    fprintf(stderr,
> -            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
> -            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
> -            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
> -            real_addr,
> -            cmdline_addr,
> -            prot_addr);
> -#endif
> -
> -    /* highest address for loading the initrd */
> -    if (protocol >= 0x20c &&
> -        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> -        /*
> -         * Linux has supported initrd up to 4 GB for a very long time (2007,
> -         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
> -         * though it only sets initrd_max to 2 GB to "work around bootloader
> -         * bugs". Luckily, QEMU firmware(which does something like bootloader)
> -         * has supported this.
> -         *
> -         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
> -         * be loaded into any address.
> -         *
> -         * In addition, initrd_max is uint32_t simply because QEMU doesn't
> -         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
> -         * field).
> -         *
> -         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
> -         */
> -        initrd_max = UINT32_MAX;
> -    } else if (protocol >= 0x203) {
> -        initrd_max = ldl_p(header+0x22c);
> -    } else {
> -        initrd_max = 0x37ffffff;
> -    }
> -
> -    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> -        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -    }
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
> -    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -    if (protocol >= 0x202) {
> -        stl_p(header+0x228, cmdline_addr);
> -    } else {
> -        stw_p(header+0x20, 0xA33F);
> -        stw_p(header+0x22, cmdline_addr-real_addr);
> -    }
> -
> -    /* handle vga= parameter */
> -    vmode = strstr(kernel_cmdline, "vga=");
> -    if (vmode) {
> -        unsigned int video_mode;
> -        /* skip "vga=" */
> -        vmode += 4;
> -        if (!strncmp(vmode, "normal", 6)) {
> -            video_mode = 0xffff;
> -        } else if (!strncmp(vmode, "ext", 3)) {
> -            video_mode = 0xfffe;
> -        } else if (!strncmp(vmode, "ask", 3)) {
> -            video_mode = 0xfffd;
> -        } else {
> -            video_mode = strtol(vmode, NULL, 0);
> -        }
> -        stw_p(header+0x1fa, video_mode);
> -    }
> -
> -    /* loader type */
> -    /* High nybble = B reserved for QEMU; low nybble is revision number.
> -       If this code is substantially changed, you may want to consider
> -       incrementing the revision. */
> -    if (protocol >= 0x200) {
> -        header[0x210] = 0xB0;
> -    }
> -    /* heap */
> -    if (protocol >= 0x201) {
> -        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
> -        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
> -    }
> -
> -    /* load initrd */
> -    if (initrd_filename) {
> -        GMappedFile *mapped_file;
> -        gsize initrd_size;
> -        gchar *initrd_data;
> -        GError *gerr = NULL;
> -
> -        if (protocol < 0x200) {
> -            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
> -            exit(1);
> -        }
> -
> -        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -        if (!mapped_file) {
> -            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                    initrd_filename, gerr->message);
> -            exit(1);
> -        }
> -        pcms->initrd_mapped_file = mapped_file;
> -
> -        initrd_data = g_mapped_file_get_contents(mapped_file);
> -        initrd_size = g_mapped_file_get_length(mapped_file);
> -        if (initrd_size >= initrd_max) {
> -            fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                    "(max: %"PRIu32", need %"PRId64")\n",
> -                    initrd_max, (uint64_t)initrd_size);
> -            exit(1);
> -        }
> -
> -        initrd_addr = (initrd_max-initrd_size) & ~4095;
> -
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
> -
> -        stl_p(header+0x218, initrd_addr);
> -        stl_p(header+0x21c, initrd_size);
> -    }
> -
> -    /* load kernel and setup */
> -    setup_size = header[0x1f1];
> -    if (setup_size == 0) {
> -        setup_size = 4;
> -    }
> -    setup_size = (setup_size+1)*512;
> -    if (setup_size > kernel_size) {
> -        fprintf(stderr, "qemu: invalid kernel header\n");
> -        exit(1);
> -    }
> -    kernel_size -= setup_size;
> -
> -    setup  = g_malloc(setup_size);
> -    kernel = g_malloc(kernel_size);
> -    fseek(f, 0, SEEK_SET);
> -    if (fread(setup, 1, setup_size, f) != setup_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    fclose(f);
> -
> -    /* append dtb to kernel */
> -    if (dtb_filename) {
> -        if (protocol < 0x209) {
> -            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
> -            exit(1);
> -        }
> -
> -        dtb_size = get_image_size(dtb_filename);
> -        if (dtb_size <= 0) {
> -            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
> -                    dtb_filename, strerror(errno));
> -            exit(1);
> -        }
> -
> -        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
> -        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
> -        kernel = g_realloc(kernel, kernel_size);
> -
> -        stq_p(header+0x250, prot_addr + setup_data_offset);
> -
> -        setup_data = (struct setup_data *)(kernel + setup_data_offset);
> -        setup_data->next = 0;
> -        setup_data->type = cpu_to_le32(SETUP_DTB);
> -        setup_data->len = cpu_to_le32(dtb_size);
> -
> -        load_image_size(dtb_filename, setup_data->data, dtb_size);
> -    }
> -
> -    memcpy(setup, header, MIN(sizeof(header), setup_size));
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> -
> -    option_rom[nb_option_roms].bootindex = 0;
> -    option_rom[nb_option_roms].name = "linuxboot.bin";
> -    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> -        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> -    }
> -    nb_option_roms++;
> -}
> -
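
For anyone following the load_linux() body that is being moved here, two bits of
arithmetic are easy to misread: the setup block size derived from the byte at
header offset 0x1f1, and the top-down, page-aligned initrd placement. Below is a
minimal standalone sketch of both. The function names are mine (hypothetical,
not part of the patch), and the initrd helper assumes the caller has already
checked initrd_size < initrd_max, as the quoted code does.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Size in bytes of the real-mode setup block.
 * setup_sects is the byte at header offset 0x1f1; the boot protocol treats
 * 0 as 4, and the boot sector itself adds one more 512-byte sector.
 */
size_t setup_block_size(uint8_t setup_sects)
{
    if (setup_sects == 0) {
        setup_sects = 4;
    }
    return ((size_t)setup_sects + 1) * 512;
}

/*
 * Place the initrd as high as possible below initrd_max, rounding the
 * start address down to a 4 KiB boundary.
 * Assumption: the caller guarantees initrd_size < initrd_max.
 */
uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    return (initrd_max - initrd_size) & ~UINT32_C(4095);
}
```

With the pre-2.03 default limit of 0x37ffffff and a 1 MiB initrd,
initrd_load_addr() yields 0x37eff000.
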
>   #define NE2000_NB_MAX 6
>   
>   static const int ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360,
> @@ -1374,24 +900,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>       }
>   }
>   
> -static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> -{
> -    Object *cpu = NULL;
> -    Error *local_err = NULL;
> -    CPUX86State *env = NULL;
> -
> -    cpu = object_new(MACHINE(pcms)->cpu_type);
> -
> -    env = &X86_CPU(cpu)->env;
> -    env->nr_dies = pcms->smp_dies;
> -
> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> -    object_property_set_bool(cpu, true, "realized", &local_err);
> -
> -    object_unref(cpu);
> -    error_propagate(errp, local_err);
> -}
> -
>   /*
>    * This function is very similar to smp_parse()
>    * in hw/core/machine.c but includes CPU die support.
> @@ -1497,31 +1005,6 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>       }
>   }
>   
> -void x86_cpus_init(PCMachineState *pcms)
> -{
> -    int i;
> -    const CPUArchIdList *possible_cpus;
> -    MachineState *ms = MACHINE(pcms);
> -    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> -
> -    x86_cpu_set_default_version(pcmc->default_cpu_version);
> -
> -    /* Calculates the limit to CPU APIC ID values
> -     *
> -     * Limit for the APIC ID value, so that all
> -     * CPU APIC IDs are < pcms->apic_id_limit.
> -     *
> -     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
> -     */
> -    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> -                                                     ms->smp.max_cpus - 1) + 1;
> -    possible_cpus = mc->possible_cpu_arch_ids(ms);
> -    for (i = 0; i < ms->smp.cpus; i++) {
> -        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> -    }
> -}
> -
>   static void rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count)
>   {
>       if (cpus_count > 0xff) {
> @@ -2677,69 +2160,6 @@ static void pc_machine_wakeup(MachineState *machine)
>       cpu_synchronize_all_post_reset();
>   }
>   
> -static CpuInstanceProperties
> -x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> -{
> -    MachineClass *mc = MACHINE_GET_CLASS(ms);
> -    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> -
> -    assert(cpu_index < possible_cpus->len);
> -    return possible_cpus->cpus[cpu_index].props;
> -}
> -
> -static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
> -{
> -   X86CPUTopoInfo topo;
> -   PCMachineState *pcms = PC_MACHINE(ms);
> -
> -   assert(idx < ms->possible_cpus->len);
> -   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> -                            pcms->smp_dies, ms->smp.cores,
> -                            ms->smp.threads, &topo);
> -   return topo.pkg_id % ms->numa_state->num_nodes;
> -}
> -
> -static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
> -{
> -    PCMachineState *pcms = PC_MACHINE(ms);
> -    int i;
> -    unsigned int max_cpus = ms->smp.max_cpus;
> -
> -    if (ms->possible_cpus) {
> -        /*
> -         * make sure that max_cpus hasn't changed since the first use, i.e.
> -         * -smp hasn't been parsed after it
> -        */
> -        assert(ms->possible_cpus->len == max_cpus);
> -        return ms->possible_cpus;
> -    }
> -
> -    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> -                                  sizeof(CPUArchId) * max_cpus);
> -    ms->possible_cpus->len = max_cpus;
> -    for (i = 0; i < ms->possible_cpus->len; i++) {
> -        X86CPUTopoInfo topo;
> -
> -        ms->possible_cpus->cpus[i].type = ms->cpu_type;
> -        ms->possible_cpus->cpus[i].vcpus_count = 1;
> -        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> -        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> -                                 pcms->smp_dies, ms->smp.cores,
> -                                 ms->smp.threads, &topo);
> -        ms->possible_cpus->cpus[i].props.has_socket_id = true;
> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> -        if (pcms->smp_dies > 1) {
> -            ms->possible_cpus->cpus[i].props.has_die_id = true;
> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> -        }
> -        ms->possible_cpus->cpus[i].props.has_core_id = true;
> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> -        ms->possible_cpus->cpus[i].props.has_thread_id = true;
> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> -    }
> -    return ms->possible_cpus;
> -}
> -
>   static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
>   {
>       /* cpu index isn't used */
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index de09e076cd..1396451abf 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -27,6 +27,7 @@
>   
>   #include "qemu/units.h"
>   #include "hw/loader.h"
> +#include "hw/i386/x86.h"
>   #include "hw/i386/pc.h"
>   #include "hw/i386/apic.h"
>   #include "hw/display/ramfb.h"
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 894989b64e..8920bd8978 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -41,6 +41,7 @@
>   #include "hw/pci-host/q35.h"
>   #include "hw/qdev-properties.h"
>   #include "exec/address-spaces.h"
> +#include "hw/i386/x86.h"
>   #include "hw/i386/pc.h"
>   #include "hw/i386/ich9.h"
>   #include "hw/i386/amd_iommu.h"
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 28cb1f63c9..69b79851be 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -31,6 +31,7 @@
>   #include "qemu/option.h"
>   #include "qemu/units.h"
>   #include "hw/sysbus.h"
> +#include "hw/i386/x86.h"
>   #include "hw/i386/pc.h"
>   #include "hw/loader.h"
>   #include "hw/qdev-properties.h"
> @@ -211,59 +212,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
>       }
>   }
>   
> -static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> -{
> -    char *filename;
> -    MemoryRegion *bios, *isa_bios;
> -    int bios_size, isa_bios_size;
> -    int ret;
> -
> -    /* BIOS load */
> -    if (bios_name == NULL) {
> -        bios_name = BIOS_FILENAME;
> -    }
> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> -    if (filename) {
> -        bios_size = get_image_size(filename);
> -    } else {
> -        bios_size = -1;
> -    }
> -    if (bios_size <= 0 ||
> -        (bios_size % 65536) != 0) {
> -        goto bios_error;
> -    }
> -    bios = g_malloc(sizeof(*bios));
> -    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(bios, true);
> -    }
> -    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
> -    if (ret != 0) {
> -    bios_error:
> -        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
> -        exit(1);
> -    }
> -    g_free(filename);
> -
> -    /* map the last 128KB of the BIOS in ISA space */
> -    isa_bios_size = MIN(bios_size, 128 * KiB);
> -    isa_bios = g_malloc(sizeof(*isa_bios));
> -    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
> -                             bios_size - isa_bios_size, isa_bios_size);
> -    memory_region_add_subregion_overlap(rom_memory,
> -                                        0x100000 - isa_bios_size,
> -                                        isa_bios,
> -                                        1);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(isa_bios, true);
> -    }
> -
> -    /* map all the bios at the top of memory */
> -    memory_region_add_subregion(rom_memory,
> -                                (uint32_t)(-bios_size),
> -                                bios);
> -}
> -
>   void pc_system_firmware_init(PCMachineState *pcms,
>                                MemoryRegion *rom_memory)
>   {
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> new file mode 100644
> index 0000000000..6807bb8a22
> --- /dev/null
> +++ b/hw/i386/x86.c
> @@ -0,0 +1,684 @@
> +/*
> + * Copyright (c) 2003-2004 Fabrice Bellard
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/option.h"
> +#include "qemu/cutils.h"
> +#include "qemu/units.h"
> +#include "qemu-common.h"
> +#include "qapi/error.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi/qapi-visit-common.h"
> +#include "qapi/visitor.h"
> +#include "sysemu/qtest.h"
> +#include "sysemu/numa.h"
> +#include "sysemu/replay.h"
> +#include "sysemu/sysemu.h"
> +
> +#include "hw/i386/x86.h"
> +#include "hw/i386/pc.h"
> +#include "target/i386/cpu.h"
> +#include "hw/i386/topology.h"
> +#include "hw/i386/fw_cfg.h"
> +
> +#include "hw/acpi/cpu_hotplug.h"
> +#include "hw/nmi.h"
> +#include "hw/loader.h"
> +#include "multiboot.h"
> +#include "elf.h"
> +#include "standard-headers/asm-x86/bootparam.h"
> +
> +#define BIOS_FILENAME "bios.bin"
> +
> +/* Physical Address of PVH entry point read from kernel ELF NOTE */
> +static size_t pvh_start_addr;
> +
> +/* Calculates initial APIC ID for a specific CPU index
> + *
> + * Currently we need to be able to calculate the APIC ID from the CPU index
> + * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
> + * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
> + * all CPUs up to max_cpus.
> + */
> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +                                    unsigned int cpu_index)
> +{
> +    MachineState *ms = MACHINE(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    uint32_t correct_id;
> +    static bool warned;
> +
> +    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> +                                         ms->smp.threads, cpu_index);
> +    if (pcmc->compat_apic_id_mode) {
> +        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> +            error_report("APIC IDs set in compatibility mode, "
> +                         "CPU topology won't match the configuration");
> +            warned = true;
> +        }
> +        return cpu_index;
> +    } else {
> +        return correct_id;
> +    }
> +}
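
As background for the compat_apic_id_mode warning above: the "correct" APIC ID
differs from the CPU index whenever a topology level is not a power of two,
because each level gets its own bit field. The sketch below is my simplified
reading of the packing done in hw/i386/topology.h; the function names are
hypothetical and it is not a copy of the QEMU helpers.

```c
#include <stdint.h>

/* Number of bits needed to hold IDs 0..count-1, i.e. ceil(log2(count)). */
unsigned id_width(unsigned count)
{
    unsigned bits = 0;

    while ((1u << bits) < count) {
        bits++;
    }
    return bits;
}

/* Pack pkg/die/core/thread IDs into separate bit fields of the APIC ID. */
uint32_t apicid_from_index(unsigned dies, unsigned cores, unsigned threads,
                           unsigned idx)
{
    unsigned smt  = idx % threads;
    unsigned core = (idx / threads) % cores;
    unsigned die  = (idx / (threads * cores)) % dies;
    unsigned pkg  = idx / (threads * cores * dies);
    unsigned smt_w  = id_width(threads);
    unsigned core_w = id_width(cores);
    unsigned die_w  = id_width(dies);

    return ((uint32_t)pkg << (smt_w + core_w + die_w)) |
           ((uint32_t)die << (smt_w + core_w)) |
           ((uint32_t)core << smt_w) |
           smt;
}
```

With 3 cores per package and 1 thread per core, CPU index 3 lands in the second
package and gets APIC ID 4, which is the kind of mismatch the warning is about.
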
> +
> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> +{
> +    Object *cpu = NULL;
> +    Error *local_err = NULL;
> +    CPUX86State *env = NULL;
> +
> +    cpu = object_new(MACHINE(pcms)->cpu_type);
> +
> +    env = &X86_CPU(cpu)->env;
> +    env->nr_dies = pcms->smp_dies;
> +
> +    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> +    object_property_set_bool(cpu, true, "realized", &local_err);
> +
> +    object_unref(cpu);
> +    error_propagate(errp, local_err);
> +}
> +
> +void x86_cpus_init(PCMachineState *pcms)
> +{
> +    int i;
> +    const CPUArchIdList *possible_cpus;
> +    MachineState *ms = MACHINE(pcms);
> +    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> +
> +    x86_cpu_set_default_version(pcmc->default_cpu_version);
> +
> +    /* Calculates the limit to CPU APIC ID values
> +     *
> +     * Limit for the APIC ID value, so that all
> +     * CPU APIC IDs are < pcms->apic_id_limit.
> +     *
> +     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
> +     */
> +    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> +                                                     ms->smp.max_cpus - 1) + 1;
> +    possible_cpus = mc->possible_cpu_arch_ids(ms);
> +    for (i = 0; i < ms->smp.cpus; i++) {
> +        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> +    }
> +}
> +
> +CpuInstanceProperties
> +x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;
> +}
> +
> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
> +{
> +   X86CPUTopoInfo topo;
> +   PCMachineState *pcms = PC_MACHINE(ms);
> +
> +   assert(idx < ms->possible_cpus->len);
> +   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> +                            pcms->smp_dies, ms->smp.cores,
> +                            ms->smp.threads, &topo);
> +   return topo.pkg_id % ms->numa_state->num_nodes;
> +}
> +
> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
> +{
> +    PCMachineState *pcms = PC_MACHINE(ms);
> +    int i;
> +    unsigned int max_cpus = ms->smp.max_cpus;
> +
> +    if (ms->possible_cpus) {
> +        /*
> +         * make sure that max_cpus hasn't changed since the first use, i.e.
> +         * -smp hasn't been parsed after it
> +        */
> +        assert(ms->possible_cpus->len == max_cpus);
> +        return ms->possible_cpus;
> +    }
> +
> +    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> +                                  sizeof(CPUArchId) * max_cpus);
> +    ms->possible_cpus->len = max_cpus;
> +    for (i = 0; i < ms->possible_cpus->len; i++) {
> +        X86CPUTopoInfo topo;
> +
> +        ms->possible_cpus->cpus[i].type = ms->cpu_type;
> +        ms->possible_cpus->cpus[i].vcpus_count = 1;
> +        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> +        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> +                                 pcms->smp_dies, ms->smp.cores,
> +                                 ms->smp.threads, &topo);
> +        ms->possible_cpus->cpus[i].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> +        if (pcms->smp_dies > 1) {
> +            ms->possible_cpus->cpus[i].props.has_die_id = true;
> +            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> +        }
> +        ms->possible_cpus->cpus[i].props.has_core_id = true;
> +        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> +        ms->possible_cpus->cpus[i].props.has_thread_id = true;
> +        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> +    }
> +    return ms->possible_cpus;
> +}
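
The allocation in x86_possible_cpu_arch_ids() relies on a flexible array member:
one zeroed block holds the list header plus max_cpus entries. A standalone
sketch of that pattern, using plain calloc() and made-up type names rather than
the QEMU ones, in case it helps readers unfamiliar with it:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-ins for CPUArchId / CPUArchIdList. */
struct cpu_entry {
    uint64_t arch_id;
};

struct cpu_list {
    int len;
    struct cpu_entry cpus[];   /* flexible array member, sized at allocation */
};

/* Allocate a zero-initialized list with room for n entries; NULL on failure. */
struct cpu_list *cpu_list_new(int n)
{
    struct cpu_list *list;

    if (n < 0) {
        return NULL;
    }
    list = calloc(1, sizeof(*list) + sizeof(list->cpus[0]) * (size_t)n);
    if (list) {
        list->len = n;
    }
    return list;
}
```

The caller releases the whole list with a single free(), since header and
entries live in the same allocation.
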
> +
> +static long get_file_size(FILE *f)
> +{
> +    long where, size;
> +
> +    /* XXX: on Unix systems, using fstat() probably makes more sense */
> +
> +    where = ftell(f);
> +    fseek(f, 0, SEEK_END);
> +    size = ftell(f);
> +    fseek(f, where, SEEK_SET);
> +
> +    return size;
> +}
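
Regarding the XXX comment in get_file_size(): a possible fstat()-based variant
is sketched below, assuming a POSIX host. The function name is hypothetical and
this is not something the patch needs to change, since the code is only being
moved.

```c
#define _POSIX_C_SOURCE 200809L  /* for fileno() under strict -std=c99/c11 */
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/*
 * Return the size of the file behind f in bytes, or -1 on error.
 * Unlike the ftell()/fseek() approach, this does not touch the stream
 * position. Data still buffered in a stream opened for writing is not
 * counted until it has been flushed.
 */
long file_size_fstat(FILE *f)
{
    struct stat st;

    if (fstat(fileno(f), &st) != 0) {
        return -1;
    }
    return (long)st.st_size;
}
```
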
> +
> +struct setup_data {
> +    uint64_t next;
> +    uint32_t type;
> +    uint32_t len;
> +    uint8_t data[0];
> +} __attribute__((packed));
> +
> +/*
> + * The entry point into the kernel for PVH boot is different from
> + * the native entry point.  The PVH entry is defined by the x86/HVM
> + * direct boot ABI and is available in an ELFNOTE in the kernel binary.
> + *
> + * This function is passed to load_elf() when it is called from
> + * load_elfboot() which then additionally checks for an ELF Note of
> + * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
> + * parse the PVH entry address from the ELF Note.
> + *
> + * Due to trickery in elf_opts.h, load_elf() is actually available as
> + * load_elf32() or load_elf64() and this routine needs to be able
> + * to deal with being called as 32 or 64 bit.
> + *
> + * The address of the PVH entry point is saved to the 'pvh_start_addr'
> + * global variable.  (although the entry point is 32-bit, the kernel
> + * binary can be either 32-bit or 64-bit).
> + */
> +static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
> +{
> +    size_t *elf_note_data_addr;
> +
> +    /* Check if ELF Note header passed in is valid */
> +    if (arg1 == NULL) {
> +        return 0;
> +    }
> +
> +    if (is64) {
> +        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
> +        uint64_t nhdr_size64 = sizeof(struct elf64_note);
> +        uint64_t phdr_align = *(uint64_t *)arg2;
> +        uint64_t nhdr_namesz = nhdr64->n_namesz;
> +
> +        elf_note_data_addr =
> +            ((void *)nhdr64) + nhdr_size64 +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> +    } else {
> +        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
> +        uint32_t nhdr_size32 = sizeof(struct elf32_note);
> +        uint32_t phdr_align = *(uint32_t *)arg2;
> +        uint32_t nhdr_namesz = nhdr32->n_namesz;
> +
> +        elf_note_data_addr =
> +            ((void *)nhdr32) + nhdr_size32 +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> +    }
> +
> +    pvh_start_addr = *elf_note_data_addr;
> +
> +    return pvh_start_addr;
> +}
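
To make the pointer arithmetic in read_pvh_start_addr() easier to check: the
note descriptor starts after the fixed note header plus the name, with the name
length rounded up to the alignment. Here is a small standalone sketch under
these assumptions: a 12-byte note header (three 32-bit fields), a 32-bit
descriptor, and hypothetical helper names. It reads the value with a
fixed-width memcpy(), whereas the quoted code dereferences a size_t pointer,
which on a 64-bit host reads 8 bytes.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Layout of an ELF note header: three 32-bit words. */
struct note_hdr {
    uint32_t n_namesz;
    uint32_t n_descsz;
    uint32_t n_type;
};

/* Round n up to the next multiple of align (align must be non-zero). */
size_t align_up(size_t n, size_t align)
{
    return (n + align - 1) / align * align;
}

/* Offset of the descriptor from the start of the note. */
size_t note_desc_offset(const struct note_hdr *nhdr, size_t align)
{
    return sizeof(*nhdr) + align_up(nhdr->n_namesz, align);
}

/* Read a 32-bit descriptor (host byte order) from a raw note buffer. */
uint32_t note_desc_u32(const uint8_t *note, size_t align)
{
    struct note_hdr nhdr;
    uint32_t val;

    memcpy(&nhdr, note, sizeof(nhdr));
    memcpy(&val, note + note_desc_offset(&nhdr, align), sizeof(val));
    return val;
}
```

For a note named "Xen" (n_namesz 4 including the NUL) with 4-byte alignment, the
descriptor sits at offset 16 from the start of the note.
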
> +
> +static bool load_elfboot(const char *kernel_filename,
> +                   int kernel_file_size,
> +                   uint8_t *header,
> +                   size_t pvh_xen_start_addr,
> +                   FWCfgState *fw_cfg)
> +{
> +    uint32_t flags = 0;
> +    uint32_t mh_load_addr = 0;
> +    uint32_t elf_kernel_size = 0;
> +    uint64_t elf_entry;
> +    uint64_t elf_low, elf_high;
> +    int kernel_size;
> +
> +    if (ldl_p(header) != 0x464c457f) {
> +        return false; /* no elfboot */
> +    }
> +
> +    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
> +    flags = elf_is64 ?
> +        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
> +
> +    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
> +        error_report("elfboot unsupported flags = %x", flags);
> +        exit(1);
> +    }
> +
> +    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
> +    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
> +                           NULL, &elf_note_type, &elf_entry,
> +                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
> +                           0, 0);
> +
> +    if (kernel_size < 0) {
> +        error_report("Error while loading elf kernel");
> +        exit(1);
> +    }
> +    mh_load_addr = elf_low;
> +    elf_kernel_size = elf_high - elf_low;
> +
> +    if (pvh_start_addr == 0) {
> +        error_report("Error loading uncompressed kernel without PVH ELF Note");
> +        exit(1);
> +    }
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
> +
> +    return true;
> +}
> +
> +void x86_load_linux(PCMachineState *pcms,
> +                    FWCfgState *fw_cfg)
> +{
> +    uint16_t protocol;
> +    int setup_size, kernel_size, cmdline_size;
> +    int dtb_size, setup_data_offset;
> +    uint32_t initrd_max;
> +    uint8_t header[8192], *setup, *kernel;
> +    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
> +    FILE *f;
> +    char *vmode;
> +    MachineState *machine = MACHINE(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    struct setup_data *setup_data;
> +    const char *kernel_filename = machine->kernel_filename;
> +    const char *initrd_filename = machine->initrd_filename;
> +    const char *dtb_filename = machine->dtb;
> +    const char *kernel_cmdline = machine->kernel_cmdline;
> +
> +    /* Align to 16 bytes as a paranoia measure */
> +    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
> +
> +    /* load the kernel header */
> +    f = fopen(kernel_filename, "rb");
> +    if (!f || !(kernel_size = get_file_size(f)) ||
> +        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
> +        MIN(ARRAY_SIZE(header), kernel_size)) {
> +        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
> +                kernel_filename, strerror(errno));
> +        exit(1);
> +    }
> +
> +    /* kernel protocol version */
> +#if 0
> +    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
> +#endif
> +    if (ldl_p(header+0x202) == 0x53726448) {
> +        protocol = lduw_p(header+0x206);
> +    } else {
> +        /*
> +         * This could be a multiboot kernel. If it is, let's stop treating it
> +         * like a Linux kernel.
> +         * Note: some multiboot images could be in the ELF format (the same of
> +         * PVH), so we try multiboot first since we check the multiboot magic
> +         * header before to load it.
> +         */
> +        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
> +                           kernel_cmdline, kernel_size, header)) {
> +            return;
> +        }
> +        /*
> +         * Check if the file is an uncompressed kernel file (ELF) and load it,
> +         * saving the PVH entry point used by the x86/HVM direct boot ABI.
> +         * If load_elfboot() is successful, populate the fw_cfg info.
> +         */
> +        if (pcmc->pvh_enabled &&
> +            load_elfboot(kernel_filename, kernel_size,
> +                         header, pvh_start_addr, fw_cfg)) {
> +            fclose(f);
> +
> +            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
> +                strlen(kernel_cmdline) + 1);
> +            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> +
> +            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
> +            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
> +                             header, sizeof(header));
> +
> +            /* load initrd */
> +            if (initrd_filename) {
> +                GMappedFile *mapped_file;
> +                gsize initrd_size;
> +                gchar *initrd_data;
> +                GError *gerr = NULL;
> +
> +                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> +                if (!mapped_file) {
> +                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                            initrd_filename, gerr->message);
> +                    exit(1);
> +                }
> +                pcms->initrd_mapped_file = mapped_file;
> +
> +                initrd_data = g_mapped_file_get_contents(mapped_file);
> +                initrd_size = g_mapped_file_get_length(mapped_file);
> +                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                if (initrd_size >= initrd_max) {
> +                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                            "(max: %"PRIu32", need %"PRId64")\n",
> +                            initrd_max, (uint64_t)initrd_size);
> +                    exit(1);
> +                }
> +
> +                initrd_addr = (initrd_max - initrd_size) & ~4095;
> +
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> +                                 initrd_size);
> +            }
> +
> +            option_rom[nb_option_roms].bootindex = 0;
> +            option_rom[nb_option_roms].name = "pvh.bin";
> +            nb_option_roms++;
> +
> +            return;
> +        }
> +        protocol = 0;
> +    }
> +
> +    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
> +        /* Low kernel */
> +        real_addr    = 0x90000;
> +        cmdline_addr = 0x9a000 - cmdline_size;
> +        prot_addr    = 0x10000;
> +    } else if (protocol < 0x202) {
> +        /* High but ancient kernel */
> +        real_addr    = 0x90000;
> +        cmdline_addr = 0x9a000 - cmdline_size;
> +        prot_addr    = 0x100000;
> +    } else {
> +        /* High and recent kernel */
> +        real_addr    = 0x10000;
> +        cmdline_addr = 0x20000;
> +        prot_addr    = 0x100000;
> +    }
> +
> +#if 0
> +    fprintf(stderr,
> +            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
> +            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
> +            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
> +            real_addr,
> +            cmdline_addr,
> +            prot_addr);
> +#endif
> +
> +    /* highest address for loading the initrd */
> +    if (protocol >= 0x20c &&
> +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> +        /*
> +         * Linux has supported initrd up to 4 GB for a very long time (2007,
> +         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
> +         * though it only sets initrd_max to 2 GB to "work around bootloader
> +         * bugs". Luckily, QEMU firmware(which does something like bootloader)
> +         * has supported this.
> +         *
> +         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
> +         * be loaded into any address.
> +         *
> +         * In addition, initrd_max is uint32_t simply because QEMU doesn't
> +         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
> +         * field).
> +         *
> +         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
> +         */
> +        initrd_max = UINT32_MAX;
> +    } else if (protocol >= 0x203) {
> +        initrd_max = ldl_p(header+0x22c);
> +    } else {
> +        initrd_max = 0x37ffffff;
> +    }
> +
> +    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> +        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +    }
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
> +    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> +
> +    if (protocol >= 0x202) {
> +        stl_p(header+0x228, cmdline_addr);
> +    } else {
> +        stw_p(header+0x20, 0xA33F);
> +        stw_p(header+0x22, cmdline_addr-real_addr);
> +    }
> +
> +    /* handle vga= parameter */
> +    vmode = strstr(kernel_cmdline, "vga=");
> +    if (vmode) {
> +        unsigned int video_mode;
> +        /* skip "vga=" */
> +        vmode += 4;
> +        if (!strncmp(vmode, "normal", 6)) {
> +            video_mode = 0xffff;
> +        } else if (!strncmp(vmode, "ext", 3)) {
> +            video_mode = 0xfffe;
> +        } else if (!strncmp(vmode, "ask", 3)) {
> +            video_mode = 0xfffd;
> +        } else {
> +            video_mode = strtol(vmode, NULL, 0);
> +        }
> +        stw_p(header+0x1fa, video_mode);
> +    }
> +
> +    /* loader type */
> +    /* High nybble = B reserved for QEMU; low nybble is revision number.
> +       If this code is substantially changed, you may want to consider
> +       incrementing the revision. */
> +    if (protocol >= 0x200) {
> +        header[0x210] = 0xB0;
> +    }
> +    /* heap */
> +    if (protocol >= 0x201) {
> +        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
> +        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
> +    }
> +
> +    /* load initrd */
> +    if (initrd_filename) {
> +        GMappedFile *mapped_file;
> +        gsize initrd_size;
> +        gchar *initrd_data;
> +        GError *gerr = NULL;
> +
> +        if (protocol < 0x200) {
> +            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
> +            exit(1);
> +        }
> +
> +        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> +        if (!mapped_file) {
> +            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                    initrd_filename, gerr->message);
> +            exit(1);
> +        }
> +        pcms->initrd_mapped_file = mapped_file;
> +
> +        initrd_data = g_mapped_file_get_contents(mapped_file);
> +        initrd_size = g_mapped_file_get_length(mapped_file);
> +        if (initrd_size >= initrd_max) {
> +            fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                    "(max: %"PRIu32", need %"PRId64")\n",
> +                    initrd_max, (uint64_t)initrd_size);
> +            exit(1);
> +        }
> +
> +        initrd_addr = (initrd_max-initrd_size) & ~4095;
> +
> +        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
> +
> +        stl_p(header+0x218, initrd_addr);
> +        stl_p(header+0x21c, initrd_size);
> +    }
> +
> +    /* load kernel and setup */
> +    setup_size = header[0x1f1];
> +    if (setup_size == 0) {
> +        setup_size = 4;
> +    }
> +    setup_size = (setup_size+1)*512;
> +    if (setup_size > kernel_size) {
> +        fprintf(stderr, "qemu: invalid kernel header\n");
> +        exit(1);
> +    }
> +    kernel_size -= setup_size;
> +
> +    setup  = g_malloc(setup_size);
> +    kernel = g_malloc(kernel_size);
> +    fseek(f, 0, SEEK_SET);
> +    if (fread(setup, 1, setup_size, f) != setup_size) {
> +        fprintf(stderr, "fread() failed\n");
> +        exit(1);
> +    }
> +    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
> +        fprintf(stderr, "fread() failed\n");
> +        exit(1);
> +    }
> +    fclose(f);
> +
> +    /* append dtb to kernel */
> +    if (dtb_filename) {
> +        if (protocol < 0x209) {
> +            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
> +            exit(1);
> +        }
> +
> +        dtb_size = get_image_size(dtb_filename);
> +        if (dtb_size <= 0) {
> +            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
> +                    dtb_filename, strerror(errno));
> +            exit(1);
> +        }
> +
> +        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
> +        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
> +        kernel = g_realloc(kernel, kernel_size);
> +
> +        stq_p(header+0x250, prot_addr + setup_data_offset);
> +
> +        setup_data = (struct setup_data *)(kernel + setup_data_offset);
> +        setup_data->next = 0;
> +        setup_data->type = cpu_to_le32(SETUP_DTB);
> +        setup_data->len = cpu_to_le32(dtb_size);
> +
> +        load_image_size(dtb_filename, setup_data->data, dtb_size);
> +    }
> +
> +    memcpy(setup, header, MIN(sizeof(header), setup_size));
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
> +    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> +    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> +
> +    option_rom[nb_option_roms].bootindex = 0;
> +    option_rom[nb_option_roms].name = "linuxboot.bin";
> +    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> +        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> +    }
> +    nb_option_roms++;
> +}
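
For readers following the arithmetic in x86_load_linux() above, here is a
minimal sketch of two of its calculations: where the initrd is placed and
how large the real-mode setup code is. The helper names initrd_load_addr()
and setup_bytes() are hypothetical and do not exist in QEMU; only the
expressions inside them are taken from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 4 KiB-aligned load address chosen so that an initrd
 * of initrd_size bytes does not extend past initrd_max. Mirrors
 * "(initrd_max - initrd_size) & ~4095" in the quoted code. */
static uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    /* The quoted code exits with an error when this does not hold. */
    assert(initrd_size < initrd_max);
    return (initrd_max - initrd_size) & ~UINT32_C(4095);
}

/* Hypothetical helper: size in bytes of the setup part of the kernel image,
 * given the byte at header offset 0x1f1. A value of 0 is treated as 4, and
 * one extra 512-byte sector is added, as in the quoted code. */
static uint32_t setup_bytes(uint8_t setup_sects)
{
    uint32_t sects = setup_sects ? setup_sects : 4;
    return (sects + 1) * 512;
}
```

With the pre-2.03 limit of 0x37ffffff and a 1 MiB initrd, the first helper
rounds the start address down to a page boundary, so a few bytes below
initrd_max may stay unused.
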
> +
> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> +{
> +    char *filename;
> +    MemoryRegion *bios, *isa_bios;
> +    int bios_size, isa_bios_size;
> +    int ret;
> +
> +    /* BIOS load */
> +    if (bios_name == NULL) {
> +        bios_name = BIOS_FILENAME;
> +    }
> +    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> +    if (filename) {
> +        bios_size = get_image_size(filename);
> +    } else {
> +        bios_size = -1;
> +    }
> +    if (bios_size <= 0 ||
> +        (bios_size % 65536) != 0) {
> +        goto bios_error;
> +    }
> +    bios = g_malloc(sizeof(*bios));
> +    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
> +    if (!isapc_ram_fw) {
> +        memory_region_set_readonly(bios, true);
> +    }
> +    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
> +    if (ret != 0) {
> +    bios_error:
> +        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
> +        exit(1);
> +    }
> +    g_free(filename);
> +
> +    /* map the last 128KB of the BIOS in ISA space */
> +    isa_bios_size = MIN(bios_size, 128 * KiB);
> +    isa_bios = g_malloc(sizeof(*isa_bios));
> +    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
> +                             bios_size - isa_bios_size, isa_bios_size);
> +    memory_region_add_subregion_overlap(rom_memory,
> +                                        0x100000 - isa_bios_size,
> +                                        isa_bios,
> +                                        1);
> +    if (!isapc_ram_fw) {
> +        memory_region_set_readonly(isa_bios, true);
> +    }
> +
> +    /* map all the bios at the top of memory */
> +    memory_region_add_subregion(rom_memory,
> +                                (uint32_t)(-bios_size),
> +                                bios);
> +}
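
The address computations in x86_bios_rom_init() can be checked in isolation.
The sketch below is an illustration only: bios_base(), isa_bios_size() and
isa_bios_base() are hypothetical names, and unsigned arithmetic is used in
place of the int negation in the quoted code, which should give the same
32-bit result.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_KIB 1024u

/* Base address of a BIOS image that ends exactly at 4 GiB.
 * Mirrors "(uint32_t)(-bios_size)". */
static uint32_t bios_base(uint32_t bios_size)
{
    /* The quoted code rejects empty images and sizes that are not a
     * multiple of 64 KiB. */
    assert(bios_size > 0 && bios_size % 65536 == 0);
    return (uint32_t)(0u - bios_size);
}

/* Size of the alias mapped in ISA space: at most the last 128 KiB. */
static uint32_t isa_bios_size(uint32_t bios_size)
{
    return bios_size < 128 * SKETCH_KIB ? bios_size : 128 * SKETCH_KIB;
}

/* Base address of the ISA alias, which ends at 1 MiB. */
static uint32_t isa_bios_base(uint32_t bios_size)
{
    return 0x100000u - isa_bios_size(bios_size);
}
```

For a 256 KiB image this gives a base of 0xfffc0000 for the full mapping and
0xe0000 for the ISA alias.
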
> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
> index d3374e0831..7ed80a4853 100644
> --- a/hw/i386/Makefile.objs
> +++ b/hw/i386/Makefile.objs
> @@ -1,5 +1,6 @@
>   obj-$(CONFIG_KVM) += kvm/
>   obj-y += e820_memory_layout.o multiboot.o
> +obj-y += x86.o
>   obj-y += pc.o
>   obj-$(CONFIG_I440FX) += pc_piix.o
>   obj-$(CONFIG_Q35) += pc_q35.o
> 



* Re: [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it
  2019-10-04  9:37 ` [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it Sergio Lopez
@ 2019-10-04  9:49   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-04  9:49 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: ehabkost, mst, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

On 10/4/19 11:37 AM, Sergio Lopez wrote:
> Split up PCMachineState and PCMachineClass and derive X86MachineState
> and X86MachineClass from them. This allows sharing code with non-PC
> x86 machine types.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>   include/hw/i386/pc.h  |  27 +------
>   include/hw/i386/x86.h |  56 ++++++++++++-

Thanks for using the git.orderfile script :)

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>

>   hw/acpi/cpu_hotplug.c |  10 +--
>   hw/i386/acpi-build.c  |  29 ++++---
>   hw/i386/amd_iommu.c   |   3 +-
>   hw/i386/intel_iommu.c |   3 +-
>   hw/i386/pc.c          | 178 ++++++++++++++----------------------------
>   hw/i386/pc_piix.c     |  43 +++++-----
>   hw/i386/pc_q35.c      |  35 +++++----
>   hw/i386/x86.c         | 139 +++++++++++++++++++++++++++++----
>   hw/i386/xen/xen-hvm.c |  23 +++---
>   hw/intc/ioapic.c      |   2 +-
>   12 files changed, 320 insertions(+), 228 deletions(-)
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index 73e2847e87..d2a690d05e 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -8,6 +8,7 @@
>   #include "hw/block/flash.h"
>   #include "net/net.h"
>   #include "hw/i386/ioapic.h"
> +#include "hw/i386/x86.h"
>   
>   #include "qemu/range.h"
>   #include "qemu/bitmap.h"
> @@ -27,7 +28,7 @@
>    */
>   struct PCMachineState {
>       /*< private >*/
> -    MachineState parent_obj;
> +    X86MachineState parent_obj;
>   
>       /* <public> */
>   
> @@ -36,16 +37,11 @@ struct PCMachineState {
>   
>       /* Pointers to devices and objects: */
>       HotplugHandler *acpi_dev;
> -    ISADevice *rtc;
>       PCIBus *bus;
>       I2CBus *smbus;
> -    FWCfgState *fw_cfg;
> -    qemu_irq *gsi;
>       PFlashCFI01 *flash[2];
> -    GMappedFile *initrd_mapped_file;
>   
>       /* Configuration options: */
> -    uint64_t max_ram_below_4g;
>       OnOffAuto vmport;
>       OnOffAuto smm;
>   
> @@ -54,27 +50,13 @@ struct PCMachineState {
>       bool sata_enabled;
>       bool pit_enabled;
>   
> -    /* RAM information (sizes, addresses, configuration): */
> -    ram_addr_t below_4g_mem_size, above_4g_mem_size;
> -
> -    /* CPU and apic information: */
> -    bool apic_xrupt_override;
> -    unsigned apic_id_limit;
> -    uint16_t boot_cpus;
> -    unsigned smp_dies;
> -
>       /* NUMA information: */
>       uint64_t numa_nodes;
>       uint64_t *node_mem;
> -
> -    /* Address space used by IOAPIC device. All IOAPIC interrupts
> -     * will be translated to MSI messages in the address space. */
> -    AddressSpace *ioapic_as;
>   };
>   
>   #define PC_MACHINE_ACPI_DEVICE_PROP "acpi-device"
>   #define PC_MACHINE_DEVMEM_REGION_SIZE "device-memory-region-size"
> -#define PC_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
>   #define PC_MACHINE_VMPORT           "vmport"
>   #define PC_MACHINE_SMM              "smm"
>   #define PC_MACHINE_SMBUS            "smbus"
> @@ -99,7 +81,7 @@ struct PCMachineState {
>    */
>   typedef struct PCMachineClass {
>       /*< private >*/
> -    MachineClass parent_class;
> +    X86MachineClass parent_class;
>   
>       /*< public >*/
>   
> @@ -141,9 +123,6 @@ typedef struct PCMachineClass {
>   
>       /* use PVH to load kernels that support this feature */
>       bool pvh_enabled;
> -
> -    /* Enables contiguous-apic-ID mode */
> -    bool compat_apic_id_mode;
>   } PCMachineClass;
>   
>   #define TYPE_PC_MACHINE "generic-pc-machine"
> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> index 71e2b6985d..a930a7ad9d 100644
> --- a/include/hw/i386/x86.h
> +++ b/include/hw/i386/x86.h
> @@ -17,7 +17,61 @@
>   #ifndef HW_I386_X86_H
>   #define HW_I386_X86_H
>   
> +#include "qemu-common.h"
> +#include "exec/hwaddr.h"
> +#include "qemu/notify.h"
> +
>   #include "hw/boards.h"
> +#include "hw/nmi.h"
> +
> +typedef struct {
> +    /*< private >*/
> +    MachineClass parent;
> +
> +    /*< public >*/
> +
> +    /* Enables contiguous-apic-ID mode */
> +    bool compat_apic_id_mode;
> +} X86MachineClass;
> +
> +typedef struct {
> +    /*< private >*/
> +    MachineState parent;
> +
> +    /*< public >*/
> +
> +    /* Pointers to devices and objects: */
> +    ISADevice *rtc;
> +    FWCfgState *fw_cfg;
> +    qemu_irq *gsi;
> +    GMappedFile *initrd_mapped_file;
> +
> +    /* Configuration options: */
> +    uint64_t max_ram_below_4g;
> +
> +    /* RAM information (sizes, addresses, configuration): */
> +    ram_addr_t below_4g_mem_size, above_4g_mem_size;
> +
> +    /* CPU and apic information: */
> +    bool apic_xrupt_override;
> +    unsigned apic_id_limit;
> +    uint16_t boot_cpus;
> +    unsigned smp_dies;
> +
> +    /* Address space used by IOAPIC device. All IOAPIC interrupts
> +     * will be translated to MSI messages in the address space. */
> +    AddressSpace *ioapic_as;
> +} X86MachineState;
> +
> +#define X86_MACHINE_MAX_RAM_BELOW_4G "max-ram-below-4g"
> +
> +#define TYPE_X86_MACHINE   MACHINE_TYPE_NAME("x86")
> +#define X86_MACHINE(obj) \
> +    OBJECT_CHECK(X86MachineState, (obj), TYPE_X86_MACHINE)
> +#define X86_MACHINE_GET_CLASS(obj) \
> +    OBJECT_GET_CLASS(X86MachineClass, obj, TYPE_X86_MACHINE)
> +#define X86_MACHINE_CLASS(class) \
> +    OBJECT_CLASS_CHECK(X86MachineClass, class, TYPE_X86_MACHINE)
>   
>   uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>                                       unsigned int cpu_index);
> @@ -30,6 +84,6 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
>   
>   void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
>   
> -void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
> +void x86_load_linux(PCMachineState *pcms, FWCfgState *fw_cfg);
>   
>   #endif
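
As an illustration of why the split works: once X86MachineState is the first
member of PCMachineState, a pointer to the PC state can also be viewed as a
pointer to the x86 state, which is what X86_MACHINE(pcms) relies on (the real
macro goes through OBJECT_CHECK and adds a type check). The sketch below uses
simplified stand-in types, X86StateSketch and PCStateSketch, which are
hypothetical and carry only a few fields; it is not QOM code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for the shared x86 state. */
typedef struct {
    uint64_t below_4g_mem_size;
    uint64_t above_4g_mem_size;
} X86StateSketch;

/* Simplified stand-in for the PC state that derives from it. */
typedef struct {
    X86StateSketch parent_obj;  /* must stay the first member */
    int pit_enabled;            /* PC-only field */
} PCStateSketch;

static_assert(offsetof(PCStateSketch, parent_obj) == 0,
              "parent must be the first member");

/* Unchecked analogue of X86_MACHINE(pcms). */
static X86StateSketch *x86_state_of(PCStateSketch *pcms)
{
    return &pcms->parent_obj;
}

/* Code shared with non-PC machine types only needs the parent type. */
static uint64_t total_ram(const X86StateSketch *x86ms)
{
    return x86ms->below_4g_mem_size + x86ms->above_4g_mem_size;
}
```

A function such as total_ram() can then be called from both PC and non-PC
machine code, which is the sharing the commit message describes.
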
> diff --git a/hw/acpi/cpu_hotplug.c b/hw/acpi/cpu_hotplug.c
> index 6e8293aac9..3ac2045a95 100644
> --- a/hw/acpi/cpu_hotplug.c
> +++ b/hw/acpi/cpu_hotplug.c
> @@ -128,7 +128,7 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
>       Aml *one = aml_int(1);
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
>       const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
> -    PCMachineState *pcms = PC_MACHINE(machine);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>   
>       /*
>        * _MAT method - creates an madt apic buffer
> @@ -236,9 +236,9 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
>       /* The current AML generator can cover the APIC ID range [0..255],
>        * inclusive, for VCPU hotplug. */
>       QEMU_BUILD_BUG_ON(ACPI_CPU_HOTPLUG_ID_LIMIT > 256);
> -    if (pcms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
> +    if (x86ms->apic_id_limit > ACPI_CPU_HOTPLUG_ID_LIMIT) {
>           error_report("max_cpus is too large. APIC ID of last CPU is %u",
> -                     pcms->apic_id_limit - 1);
> +                     x86ms->apic_id_limit - 1);
>           exit(1);
>       }
>   
> @@ -315,8 +315,8 @@ void build_legacy_cpu_hotplug_aml(Aml *ctx, MachineState *machine,
>        * ith up to 255 elements. Windows guests up to win2k8 fail when
>        * VarPackageOp is used.
>        */
> -    pkg = pcms->apic_id_limit <= 255 ? aml_package(pcms->apic_id_limit) :
> -                                       aml_varpackage(pcms->apic_id_limit);
> +    pkg = x86ms->apic_id_limit <= 255 ? aml_package(x86ms->apic_id_limit) :
> +                                        aml_varpackage(x86ms->apic_id_limit);
>   
>       for (i = 0, apic_idx = 0; i < apic_ids->len; i++) {
>           int apic_id = apic_ids->cpus[i].arch_id;
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index 4e0f9f425a..fc7de46533 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -361,6 +361,7 @@ static void
>   build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
>   {
>       MachineClass *mc = MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(MACHINE(pcms));
>       int madt_start = table_data->len;
>       AcpiDeviceIfClass *adevc = ACPI_DEVICE_IF_GET_CLASS(pcms->acpi_dev);
> @@ -390,7 +391,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, PCMachineState *pcms)
>       io_apic->address = cpu_to_le32(IO_APIC_DEFAULT_ADDRESS);
>       io_apic->interrupt = cpu_to_le32(0);
>   
> -    if (pcms->apic_xrupt_override) {
> +    if (x86ms->apic_xrupt_override) {
>           intsrcovr = acpi_data_push(table_data, sizeof *intsrcovr);
>           intsrcovr->type   = ACPI_APIC_XRUPT_OVERRIDE;
>           intsrcovr->length = sizeof(*intsrcovr);
> @@ -1831,6 +1832,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>       CrsRangeSet crs_range_set;
>       PCMachineState *pcms = PC_MACHINE(machine);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(machine);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>       AcpiMcfgInfo mcfg;
>       uint32_t nr_mem = machine->ram_slots;
>       int root_bus_limit = 0xFF;
> @@ -2098,7 +2100,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>            * with half of the 16-bit control register. Hence, the total size
>            * of the i/o region used is FW_CFG_CTL_SIZE; when using DMA, the
>            * DMA control register is located at FW_CFG_DMA_IO_BASE + 4 */
> -        uint8_t io_size = object_property_get_bool(OBJECT(pcms->fw_cfg),
> +        uint8_t io_size = object_property_get_bool(OBJECT(x86ms->fw_cfg),
>                                                      "dma_enabled", NULL) ?
>                             ROUND_UP(FW_CFG_CTL_SIZE, 4) + sizeof(dma_addr_t) :
>                             FW_CFG_CTL_SIZE;
> @@ -2331,6 +2333,7 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
>       int srat_start, numa_start, slots;
>       uint64_t mem_len, mem_base, next_base;
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>       const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
>       PCMachineState *pcms = PC_MACHINE(machine);
>       ram_addr_t hotplugabble_address_space_size =
> @@ -2401,16 +2404,16 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
>           }
>   
>           /* Cut out the ACPI_PCI hole */
> -        if (mem_base <= pcms->below_4g_mem_size &&
> -            next_base > pcms->below_4g_mem_size) {
> -            mem_len -= next_base - pcms->below_4g_mem_size;
> +        if (mem_base <= x86ms->below_4g_mem_size &&
> +            next_base > x86ms->below_4g_mem_size) {
> +            mem_len -= next_base - x86ms->below_4g_mem_size;
>               if (mem_len > 0) {
>                   numamem = acpi_data_push(table_data, sizeof *numamem);
>                   build_srat_memory(numamem, mem_base, mem_len, i - 1,
>                                     MEM_AFFINITY_ENABLED);
>               }
>               mem_base = 1ULL << 32;
> -            mem_len = next_base - pcms->below_4g_mem_size;
> +            mem_len = next_base - x86ms->below_4g_mem_size;
>               next_base = mem_base + mem_len;
>           }
>   
> @@ -2629,6 +2632,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>   {
>       PCMachineState *pcms = PC_MACHINE(machine);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>       GArray *table_offsets;
>       unsigned facs, dsdt, rsdt, fadt;
>       AcpiPmInfo pm;
> @@ -2790,7 +2794,7 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
>            */
>           int legacy_aml_len =
>               pcmc->legacy_acpi_table_size +
> -            ACPI_BUILD_LEGACY_CPU_AML_SIZE * pcms->apic_id_limit;
> +            ACPI_BUILD_LEGACY_CPU_AML_SIZE * x86ms->apic_id_limit;
>           int legacy_table_size =
>               ROUND_UP(tables_blob->len - aml_len + legacy_aml_len,
>                        ACPI_BUILD_ALIGN_SIZE);
> @@ -2880,13 +2884,14 @@ void acpi_setup(void)
>   {
>       PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       AcpiBuildTables tables;
>       AcpiBuildState *build_state;
>       Object *vmgenid_dev;
>       TPMIf *tpm;
>       static FwCfgTPMConfig tpm_config;
>   
> -    if (!pcms->fw_cfg) {
> +    if (!x86ms->fw_cfg) {
>           ACPI_BUILD_DPRINTF("No fw cfg. Bailing out.\n");
>           return;
>       }
> @@ -2917,7 +2922,7 @@ void acpi_setup(void)
>           acpi_add_rom_blob(acpi_build_update, build_state,
>                             tables.linker->cmd_blob, "etc/table-loader", 0);
>   
> -    fw_cfg_add_file(pcms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
> +    fw_cfg_add_file(x86ms->fw_cfg, ACPI_BUILD_TPMLOG_FILE,
>                       tables.tcpalog->data, acpi_data_len(tables.tcpalog));
>   
>       tpm = tpm_find();
> @@ -2927,13 +2932,13 @@ void acpi_setup(void)
>               .tpm_version = tpm_get_version(tpm),
>               .tpmppi_version = TPM_PPI_VERSION_1_30
>           };
> -        fw_cfg_add_file(pcms->fw_cfg, "etc/tpm/config",
> +        fw_cfg_add_file(x86ms->fw_cfg, "etc/tpm/config",
>                           &tpm_config, sizeof tpm_config);
>       }
>   
>       vmgenid_dev = find_vmgenid_dev();
>       if (vmgenid_dev) {
> -        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), pcms->fw_cfg,
> +        vmgenid_add_fw_cfg(VMGENID(vmgenid_dev), x86ms->fw_cfg,
>                              tables.vmgenid);
>       }
>   
> @@ -2946,7 +2951,7 @@ void acpi_setup(void)
>           uint32_t rsdp_size = acpi_data_len(tables.rsdp);
>   
>           build_state->rsdp = g_memdup(tables.rsdp->data, rsdp_size);
> -        fw_cfg_add_file_callback(pcms->fw_cfg, ACPI_BUILD_RSDP_FILE,
> +        fw_cfg_add_file_callback(x86ms->fw_cfg, ACPI_BUILD_RSDP_FILE,
>                                    acpi_build_update, NULL, build_state,
>                                    build_state->rsdp, rsdp_size, true);
>           build_state->rsdp_mr = NULL;
> diff --git a/hw/i386/amd_iommu.c b/hw/i386/amd_iommu.c
> index 08884523e2..7b7e4a0bf7 100644
> --- a/hw/i386/amd_iommu.c
> +++ b/hw/i386/amd_iommu.c
> @@ -1537,6 +1537,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
>       X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
>       MachineState *ms = MACHINE(qdev_get_machine());
>       PCMachineState *pcms = PC_MACHINE(ms);
> +    X86MachineState *x86ms = X86_MACHINE(ms);
>       PCIBus *bus = pcms->bus;
>   
>       s->iotlb = g_hash_table_new_full(amdvi_uint64_hash,
> @@ -1565,7 +1566,7 @@ static void amdvi_realize(DeviceState *dev, Error **err)
>       }
>   
>       /* Pseudo address space under root PCI bus. */
> -    pcms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
> +    x86ms->ioapic_as = amdvi_host_dma_iommu(bus, s, AMDVI_IOAPIC_SB_DEVID);
>   
>       /* set up MMIO */
>       memory_region_init_io(&s->mmio, OBJECT(s), &mmio_mem_ops, s, "amdvi-mmio",
> diff --git a/hw/i386/intel_iommu.c b/hw/i386/intel_iommu.c
> index f1de8fdb75..9dc20c160e 100644
> --- a/hw/i386/intel_iommu.c
> +++ b/hw/i386/intel_iommu.c
> @@ -3731,6 +3731,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>   {
>       MachineState *ms = MACHINE(qdev_get_machine());
>       PCMachineState *pcms = PC_MACHINE(ms);
> +    X86MachineState *x86ms = X86_MACHINE(ms);
>       PCIBus *bus = pcms->bus;
>       IntelIOMMUState *s = INTEL_IOMMU_DEVICE(dev);
>       X86IOMMUState *x86_iommu = X86_IOMMU_DEVICE(dev);
> @@ -3771,7 +3772,7 @@ static void vtd_realize(DeviceState *dev, Error **errp)
>       sysbus_mmio_map(SYS_BUS_DEVICE(s), 0, Q35_HOST_BRIDGE_IOMMU_ADDR);
>       pci_setup_iommu(bus, vtd_host_dma_iommu, dev);
>       /* Pseudo address space under root PCI bus. */
> -    pcms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
> +    x86ms->ioapic_as = vtd_host_dma_iommu(bus, s, Q35_PSEUDO_DEVFN_IOAPIC);
>       qemu_add_machine_init_done_notifier(&vtd_machine_done_notify);
>   }
>   
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 094db79fb0..0dc1420a1f 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -78,7 +78,6 @@
>   #include "qapi/qapi-visit-common.h"
>   #include "qapi/visitor.h"
>   #include "hw/core/cpu.h"
> -#include "hw/nmi.h"
>   #include "hw/usb.h"
>   #include "hw/i386/intel_iommu.h"
>   #include "hw/net/ne2000-isa.h"
> @@ -679,17 +678,18 @@ void pc_cmos_init(PCMachineState *pcms,
>   {
>       int val;
>       static pc_cmos_init_late_arg arg;
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       /* various important CMOS locations needed by PC/Bochs bios */
>   
>       /* memory size */
>       /* base memory (first MiB) */
> -    val = MIN(pcms->below_4g_mem_size / KiB, 640);
> +    val = MIN(x86ms->below_4g_mem_size / KiB, 640);
>       rtc_set_memory(s, 0x15, val);
>       rtc_set_memory(s, 0x16, val >> 8);
>       /* extended memory (next 64MiB) */
> -    if (pcms->below_4g_mem_size > 1 * MiB) {
> -        val = (pcms->below_4g_mem_size - 1 * MiB) / KiB;
> +    if (x86ms->below_4g_mem_size > 1 * MiB) {
> +        val = (x86ms->below_4g_mem_size - 1 * MiB) / KiB;
>       } else {
>           val = 0;
>       }
> @@ -700,8 +700,8 @@ void pc_cmos_init(PCMachineState *pcms,
>       rtc_set_memory(s, 0x30, val);
>       rtc_set_memory(s, 0x31, val >> 8);
>       /* memory between 16MiB and 4GiB */
> -    if (pcms->below_4g_mem_size > 16 * MiB) {
> -        val = (pcms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
> +    if (x86ms->below_4g_mem_size > 16 * MiB) {
> +        val = (x86ms->below_4g_mem_size - 16 * MiB) / (64 * KiB);
>       } else {
>           val = 0;
>       }
> @@ -710,14 +710,14 @@ void pc_cmos_init(PCMachineState *pcms,
>       rtc_set_memory(s, 0x34, val);
>       rtc_set_memory(s, 0x35, val >> 8);
>       /* memory above 4GiB */
> -    val = pcms->above_4g_mem_size / 65536;
> +    val = x86ms->above_4g_mem_size / 65536;
>       rtc_set_memory(s, 0x5b, val);
>       rtc_set_memory(s, 0x5c, val >> 8);
>       rtc_set_memory(s, 0x5d, val >> 16);
>   
>       object_property_add_link(OBJECT(pcms), "rtc_state",
>                                TYPE_ISA_DEVICE,
> -                             (Object **)&pcms->rtc,
> +                             (Object **)&x86ms->rtc,
>                                object_property_allow_set_link,
>                                OBJ_PROP_LINK_STRONG, &error_abort);
>       object_property_set_link(OBJECT(pcms), OBJECT(s),
> @@ -906,7 +906,7 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>    */
>   void pc_smp_parse(MachineState *ms, QemuOpts *opts)
>   {
> -    PCMachineState *pcms = PC_MACHINE(ms);
> +    X86MachineState *x86ms = X86_MACHINE(ms);
>   
>       if (opts) {
>           unsigned cpus    = qemu_opt_get_number(opts, "cpus", 0);
> @@ -970,7 +970,7 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
>           ms->smp.cpus = cpus;
>           ms->smp.cores = cores;
>           ms->smp.threads = threads;
> -        pcms->smp_dies = dies;
> +        x86ms->smp_dies = dies;
>       }
>   
>       if (ms->smp.cpus > 1) {
> @@ -1023,10 +1023,11 @@ void pc_machine_done(Notifier *notifier, void *data)
>   {
>       PCMachineState *pcms = container_of(notifier,
>                                           PCMachineState, machine_done);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       PCIBus *bus = pcms->bus;
>   
>       /* set the number of CPUs */
> -    rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
> +    rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
>   
>       if (bus) {
>           int extra_hosts = 0;
> @@ -1037,23 +1038,23 @@ void pc_machine_done(Notifier *notifier, void *data)
>                   extra_hosts++;
>               }
>           }
> -        if (extra_hosts && pcms->fw_cfg) {
> +        if (extra_hosts && x86ms->fw_cfg) {
>               uint64_t *val = g_malloc(sizeof(*val));
>               *val = cpu_to_le64(extra_hosts);
> -            fw_cfg_add_file(pcms->fw_cfg,
> +            fw_cfg_add_file(x86ms->fw_cfg,
>                       "etc/extra-pci-roots", val, sizeof(*val));
>           }
>       }
>   
>       acpi_setup();
> -    if (pcms->fw_cfg) {
> -        fw_cfg_build_smbios(MACHINE(pcms), pcms->fw_cfg);
> -        fw_cfg_build_feature_control(MACHINE(pcms), pcms->fw_cfg);
> +    if (x86ms->fw_cfg) {
> +        fw_cfg_build_smbios(MACHINE(pcms), x86ms->fw_cfg);
> +        fw_cfg_build_feature_control(MACHINE(pcms), x86ms->fw_cfg);
>           /* update FW_CFG_NB_CPUS to account for -device added CPUs */
> -        fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
> +        fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
>       }
>   
> -    if (pcms->apic_id_limit > 255 && !xen_enabled()) {
> +    if (x86ms->apic_id_limit > 255 && !xen_enabled()) {
>           IntelIOMMUState *iommu = INTEL_IOMMU_DEVICE(x86_iommu_get_default());
>   
>           if (!iommu || !x86_iommu_ir_supported(X86_IOMMU_DEVICE(iommu)) ||
> @@ -1071,8 +1072,9 @@ void pc_guest_info_init(PCMachineState *pcms)
>   {
>       int i;
>       MachineState *ms = MACHINE(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
> -    pcms->apic_xrupt_override = kvm_allows_irq0_override();
> +    x86ms->apic_xrupt_override = kvm_allows_irq0_override();
>       pcms->numa_nodes = ms->numa_state->num_nodes;
>       pcms->node_mem = g_malloc0(pcms->numa_nodes *
>                                       sizeof *pcms->node_mem);
> @@ -1097,11 +1099,12 @@ void xen_load_linux(PCMachineState *pcms)
>   {
>       int i;
>       FWCfgState *fw_cfg;
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       assert(MACHINE(pcms)->kernel_filename != NULL);
>   
>       fw_cfg = fw_cfg_init_io(FW_CFG_IO_BASE);
> -    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
> +    fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
>       rom_set_fw(fw_cfg);
>   
>       x86_load_linux(pcms, fw_cfg);
> @@ -1112,7 +1115,7 @@ void xen_load_linux(PCMachineState *pcms)
>                  !strcmp(option_rom[i].name, "multiboot.bin"));
>           rom_add_option(option_rom[i].name, option_rom[i].bootindex);
>       }
> -    pcms->fw_cfg = fw_cfg;
> +    x86ms->fw_cfg = fw_cfg;
>   }
>   
>   void pc_memory_init(PCMachineState *pcms,
> @@ -1127,9 +1130,10 @@ void pc_memory_init(PCMachineState *pcms,
>       MachineState *machine = MACHINE(pcms);
>       MachineClass *mc = MACHINE_GET_CLASS(machine);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
> -    assert(machine->ram_size == pcms->below_4g_mem_size +
> -                                pcms->above_4g_mem_size);
> +    assert(machine->ram_size == x86ms->below_4g_mem_size +
> +                                x86ms->above_4g_mem_size);
>   
>       linux_boot = (machine->kernel_filename != NULL);
>   
> @@ -1143,17 +1147,17 @@ void pc_memory_init(PCMachineState *pcms,
>       *ram_memory = ram;
>       ram_below_4g = g_malloc(sizeof(*ram_below_4g));
>       memory_region_init_alias(ram_below_4g, NULL, "ram-below-4g", ram,
> -                             0, pcms->below_4g_mem_size);
> +                             0, x86ms->below_4g_mem_size);
>       memory_region_add_subregion(system_memory, 0, ram_below_4g);
> -    e820_add_entry(0, pcms->below_4g_mem_size, E820_RAM);
> -    if (pcms->above_4g_mem_size > 0) {
> +    e820_add_entry(0, x86ms->below_4g_mem_size, E820_RAM);
> +    if (x86ms->above_4g_mem_size > 0) {
>           ram_above_4g = g_malloc(sizeof(*ram_above_4g));
>           memory_region_init_alias(ram_above_4g, NULL, "ram-above-4g", ram,
> -                                 pcms->below_4g_mem_size,
> -                                 pcms->above_4g_mem_size);
> +                                 x86ms->below_4g_mem_size,
> +                                 x86ms->above_4g_mem_size);
>           memory_region_add_subregion(system_memory, 0x100000000ULL,
>                                       ram_above_4g);
> -        e820_add_entry(0x100000000ULL, pcms->above_4g_mem_size, E820_RAM);
> +        e820_add_entry(0x100000000ULL, x86ms->above_4g_mem_size, E820_RAM);
>       }
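
A small sketch of the RAM layout this hunk sets up, assuming only what is
visible in the quoted lines: one range starting at 0 for the memory below
4 GiB, and, if there is any memory above, a second range starting at
0x100000000. RamRange and ram_ranges() are hypothetical names used for
illustration; the real code registers the ranges with e820_add_entry().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_FOUR_GIB 0x100000000ULL

/* Hypothetical stand-in for an E820_RAM entry. */
typedef struct {
    uint64_t base;
    uint64_t length;
} RamRange;

/* Fills out[] with the guest RAM ranges and returns how many were written
 * (1 or 2). */
static size_t ram_ranges(uint64_t ram_size, uint64_t below_4g,
                         uint64_t above_4g, RamRange out[2])
{
    size_t n = 0;

    /* Same invariant as the assert() at the top of pc_memory_init(). */
    assert(ram_size == below_4g + above_4g);

    out[n++] = (RamRange){ 0, below_4g };
    if (above_4g > 0) {
        out[n++] = (RamRange){ SKETCH_FOUR_GIB, above_4g };
    }
    return n;
}
```
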
>   
>       if (!pcmc->has_reserved_memory &&
> @@ -1187,7 +1191,7 @@ void pc_memory_init(PCMachineState *pcms,
>           }
>   
>           machine->device_memory->base =
> -            ROUND_UP(0x100000000ULL + pcms->above_4g_mem_size, 1 * GiB);
> +            ROUND_UP(0x100000000ULL + x86ms->above_4g_mem_size, 1 * GiB);
>   
>           if (pcmc->enforce_aligned_dimm) {
>               /* size device region assuming 1G page max alignment per slot */
> @@ -1222,7 +1226,7 @@ void pc_memory_init(PCMachineState *pcms,
>                                           1);
>   
>       fw_cfg = fw_cfg_arch_create(machine,
> -                                pcms->boot_cpus, pcms->apic_id_limit);
> +                                x86ms->boot_cpus, x86ms->apic_id_limit);
>   
>       rom_set_fw(fw_cfg);
>   
> @@ -1245,10 +1249,10 @@ void pc_memory_init(PCMachineState *pcms,
>       for (i = 0; i < nb_option_roms; i++) {
>           rom_add_option(option_rom[i].name, option_rom[i].bootindex);
>       }
> -    pcms->fw_cfg = fw_cfg;
> +    x86ms->fw_cfg = fw_cfg;
>   
>       /* Init default IOAPIC address space */
> -    pcms->ioapic_as = &address_space_memory;
> +    x86ms->ioapic_as = &address_space_memory;
>   }
>   
>   /*
> @@ -1260,6 +1264,7 @@ uint64_t pc_pci_hole64_start(void)
>       PCMachineState *pcms = PC_MACHINE(qdev_get_machine());
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>       MachineState *ms = MACHINE(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       uint64_t hole64_start = 0;
>   
>       if (pcmc->has_reserved_memory && ms->device_memory->base) {
> @@ -1268,7 +1273,7 @@ uint64_t pc_pci_hole64_start(void)
>               hole64_start += memory_region_size(&ms->device_memory->mr);
>           }
>       } else {
> -        hole64_start = 0x100000000ULL + pcms->above_4g_mem_size;
> +        hole64_start = 0x100000000ULL + x86ms->above_4g_mem_size;
>       }
>   
>       return ROUND_UP(hole64_start, 1 * GiB);
> @@ -1607,6 +1612,7 @@ static void pc_cpu_plug(HotplugHandler *hotplug_dev,
>       Error *local_err = NULL;
>       X86CPU *cpu = X86_CPU(dev);
>       PCMachineState *pcms = PC_MACHINE(hotplug_dev);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       if (pcms->acpi_dev) {
>           hotplug_handler_plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);
> @@ -1616,12 +1622,12 @@ static void pc_cpu_plug(HotplugHandler *hotplug_dev,
>       }
>   
>       /* increment the number of CPUs */
> -    pcms->boot_cpus++;
> -    if (pcms->rtc) {
> -        rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
> +    x86ms->boot_cpus++;
> +    if (x86ms->rtc) {
> +        rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
>       }
> -    if (pcms->fw_cfg) {
> -        fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
> +    if (x86ms->fw_cfg) {
> +        fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
>       }
>   
>       found_cpu = pc_find_cpu_slot(MACHINE(pcms), cpu->apic_id, NULL);
> @@ -1667,6 +1673,7 @@ static void pc_cpu_unplug_cb(HotplugHandler *hotplug_dev,
>       Error *local_err = NULL;
>       X86CPU *cpu = X86_CPU(dev);
>       PCMachineState *pcms = PC_MACHINE(hotplug_dev);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       hotplug_handler_unplug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &local_err);
>       if (local_err) {
> @@ -1678,10 +1685,10 @@ static void pc_cpu_unplug_cb(HotplugHandler *hotplug_dev,
>       object_property_set_bool(OBJECT(dev), false, "realized", NULL);
>   
>       /* decrement the number of CPUs */
> -    pcms->boot_cpus--;
> +    x86ms->boot_cpus--;
>       /* Update the number of CPUs in CMOS */
> -    rtc_set_cpus_count(pcms->rtc, pcms->boot_cpus);
> -    fw_cfg_modify_i16(pcms->fw_cfg, FW_CFG_NB_CPUS, pcms->boot_cpus);
> +    rtc_set_cpus_count(x86ms->rtc, x86ms->boot_cpus);
> +    fw_cfg_modify_i16(x86ms->fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
>    out:
>       error_propagate(errp, local_err);
>   }
> @@ -1697,6 +1704,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>       CPUX86State *env = &cpu->env;
>       MachineState *ms = MACHINE(hotplug_dev);
>       PCMachineState *pcms = PC_MACHINE(hotplug_dev);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       unsigned int smp_cores = ms->smp.cores;
>       unsigned int smp_threads = ms->smp.threads;
>   
> @@ -1706,7 +1714,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>           return;
>       }
>   
> -    env->nr_dies = pcms->smp_dies;
> +    env->nr_dies = x86ms->smp_dies;
>   
>       /*
>        * If APIC ID is not set,
> @@ -1714,13 +1722,13 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>        */
>       if (cpu->apic_id == UNASSIGNED_APIC_ID) {
>           int max_socket = (ms->smp.max_cpus - 1) /
> -                                smp_threads / smp_cores / pcms->smp_dies;
> +                                smp_threads / smp_cores / x86ms->smp_dies;
>   
>           /*
>            * die-id was optional in QEMU 4.0 and older, so keep it optional
>            * if there's only one die per socket.
>            */
> -        if (cpu->die_id < 0 && pcms->smp_dies == 1) {
> +        if (cpu->die_id < 0 && x86ms->smp_dies == 1) {
>               cpu->die_id = 0;
>           }
>   
> @@ -1735,9 +1743,9 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>           if (cpu->die_id < 0) {
>               error_setg(errp, "CPU die-id is not set");
>               return;
> -        } else if (cpu->die_id > pcms->smp_dies - 1) {
> +        } else if (cpu->die_id > x86ms->smp_dies - 1) {
>               error_setg(errp, "Invalid CPU die-id: %u must be in range 0:%u",
> -                       cpu->die_id, pcms->smp_dies - 1);
> +                       cpu->die_id, x86ms->smp_dies - 1);
>               return;
>           }
>           if (cpu->core_id < 0) {
> @@ -1761,7 +1769,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>           topo.die_id = cpu->die_id;
>           topo.core_id = cpu->core_id;
>           topo.smt_id = cpu->thread_id;
> -        cpu->apic_id = apicid_from_topo_ids(pcms->smp_dies, smp_cores,
> +        cpu->apic_id = apicid_from_topo_ids(x86ms->smp_dies, smp_cores,
>                                               smp_threads, &topo);
>       }
>   
> @@ -1769,7 +1777,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>       if (!cpu_slot) {
>           MachineState *ms = MACHINE(pcms);
>   
> -        x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> +        x86_topo_ids_from_apicid(cpu->apic_id, x86ms->smp_dies,
>                                    smp_cores, smp_threads, &topo);
>           error_setg(errp,
>               "Invalid CPU [socket: %u, die: %u, core: %u, thread: %u] with"
> @@ -1791,7 +1799,7 @@ static void pc_cpu_pre_plug(HotplugHandler *hotplug_dev,
>       /* TODO: move socket_id/core_id/thread_id checks into x86_cpu_realizefn()
>        * once -smp refactoring is complete and there will be CPU private
>        * CPUState::nr_cores and CPUState::nr_threads fields instead of globals */
> -    x86_topo_ids_from_apicid(cpu->apic_id, pcms->smp_dies,
> +    x86_topo_ids_from_apicid(cpu->apic_id, x86ms->smp_dies,
>                                smp_cores, smp_threads, &topo);
>       if (cpu->socket_id != -1 && cpu->socket_id != topo.pkg_id) {
>           error_setg(errp, "property socket-id: %u doesn't match set apic-id:"
> @@ -1973,45 +1981,6 @@ pc_machine_get_device_memory_region_size(Object *obj, Visitor *v,
>       visit_type_int(v, name, &value, errp);
>   }
>   
> -static void pc_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
> -                                            const char *name, void *opaque,
> -                                            Error **errp)
> -{
> -    PCMachineState *pcms = PC_MACHINE(obj);
> -    uint64_t value = pcms->max_ram_below_4g;
> -
> -    visit_type_size(v, name, &value, errp);
> -}
> -
> -static void pc_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
> -                                            const char *name, void *opaque,
> -                                            Error **errp)
> -{
> -    PCMachineState *pcms = PC_MACHINE(obj);
> -    Error *error = NULL;
> -    uint64_t value;
> -
> -    visit_type_size(v, name, &value, &error);
> -    if (error) {
> -        error_propagate(errp, error);
> -        return;
> -    }
> -    if (value > 4 * GiB) {
> -        error_setg(&error,
> -                   "Machine option 'max-ram-below-4g=%"PRIu64
> -                   "' expects size less than or equal to 4G", value);
> -        error_propagate(errp, error);
> -        return;
> -    }
> -
> -    if (value < 1 * MiB) {
> -        warn_report("Only %" PRIu64 " bytes of RAM below the 4GiB boundary,"
> -                    "BIOS may not work with less than 1MiB", value);
> -    }
> -
> -    pcms->max_ram_below_4g = value;
> -}
> -
>   static void pc_machine_get_vmport(Object *obj, Visitor *v, const char *name,
>                                     void *opaque, Error **errp)
>   {
> @@ -2117,7 +2086,6 @@ static void pc_machine_initfn(Object *obj)
>   {
>       PCMachineState *pcms = PC_MACHINE(obj);
>   
> -    pcms->max_ram_below_4g = 0; /* use default */
>       pcms->smm = ON_OFF_AUTO_AUTO;
>   #ifdef CONFIG_VMPORT
>       pcms->vmport = ON_OFF_AUTO_AUTO;
> @@ -2129,7 +2097,6 @@ static void pc_machine_initfn(Object *obj)
>       pcms->smbus_enabled = true;
>       pcms->sata_enabled = true;
>       pcms->pit_enabled = true;
> -    pcms->smp_dies = 1;
>   
>       pc_system_flash_create(pcms);
>   }
> @@ -2160,23 +2127,6 @@ static void pc_machine_wakeup(MachineState *machine)
>       cpu_synchronize_all_post_reset();
>   }
>   
> -static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
> -{
> -    /* cpu index isn't used */
> -    CPUState *cs;
> -
> -    CPU_FOREACH(cs) {
> -        X86CPU *cpu = X86_CPU(cs);
> -
> -        if (!cpu->apic_state) {
> -            cpu_interrupt(cs, CPU_INTERRUPT_NMI);
> -        } else {
> -            apic_deliver_nmi(cpu->apic_state);
> -        }
> -    }
> -}
> -
> -
>   static bool pc_hotplug_allowed(MachineState *ms, DeviceState *dev, Error **errp)
>   {
>       X86IOMMUState *iommu = x86_iommu_get_default();
> @@ -2201,7 +2151,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>       MachineClass *mc = MACHINE_CLASS(oc);
>       PCMachineClass *pcmc = PC_MACHINE_CLASS(oc);
>       HotplugHandlerClass *hc = HOTPLUG_HANDLER_CLASS(oc);
> -    NMIClass *nc = NMI_CLASS(oc);
>   
>       pcmc->pci_enabled = true;
>       pcmc->has_acpi_build = true;
> @@ -2237,7 +2186,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>       hc->plug = pc_machine_device_plug_cb;
>       hc->unplug_request = pc_machine_device_unplug_request_cb;
>       hc->unplug = pc_machine_device_unplug_cb;
> -    nc->nmi_monitor_handler = x86_nmi;
>       mc->default_cpu_type = TARGET_DEFAULT_CPU_TYPE;
>       mc->nvdimm_supported = true;
>       mc->numa_mem_supported = true;
> @@ -2246,13 +2194,6 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>           pc_machine_get_device_memory_region_size, NULL,
>           NULL, NULL, &error_abort);
>   
> -    object_class_property_add(oc, PC_MACHINE_MAX_RAM_BELOW_4G, "size",
> -        pc_machine_get_max_ram_below_4g, pc_machine_set_max_ram_below_4g,
> -        NULL, NULL, &error_abort);
> -
> -    object_class_property_set_description(oc, PC_MACHINE_MAX_RAM_BELOW_4G,
> -        "Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
> -
>       object_class_property_add(oc, PC_MACHINE_SMM, "OnOffAuto",
>           pc_machine_get_smm, pc_machine_set_smm,
>           NULL, NULL, &error_abort);
> @@ -2277,7 +2218,7 @@ static void pc_machine_class_init(ObjectClass *oc, void *data)
>   
>   static const TypeInfo pc_machine_info = {
>       .name = TYPE_PC_MACHINE,
> -    .parent = TYPE_MACHINE,
> +    .parent = TYPE_X86_MACHINE,
>       .abstract = true,
>       .instance_size = sizeof(PCMachineState),
>       .instance_init = pc_machine_initfn,
> @@ -2285,7 +2226,6 @@ static const TypeInfo pc_machine_info = {
>       .class_init = pc_machine_class_init,
>       .interfaces = (InterfaceInfo[]) {
>            { TYPE_HOTPLUG_HANDLER },
> -         { TYPE_NMI },
>            { }
>       },
>   };
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 1396451abf..0afa8fe6ea 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -74,6 +74,7 @@ static void pc_init1(MachineState *machine,
>   {
>       PCMachineState *pcms = PC_MACHINE(machine);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>       MemoryRegion *system_memory = get_system_memory();
>       MemoryRegion *system_io = get_system_io();
>       int i;
> @@ -126,11 +127,11 @@ static void pc_init1(MachineState *machine,
>       if (xen_enabled()) {
>           xen_hvm_init(pcms, &ram_memory);
>       } else {
> -        if (!pcms->max_ram_below_4g) {
> -            pcms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
> +        if (!x86ms->max_ram_below_4g) {
> +            x86ms->max_ram_below_4g = 0xe0000000; /* default: 3.5G */
>           }
> -        lowmem = pcms->max_ram_below_4g;
> -        if (machine->ram_size >= pcms->max_ram_below_4g) {
> +        lowmem = x86ms->max_ram_below_4g;
> +        if (machine->ram_size >= x86ms->max_ram_below_4g) {
>               if (pcmc->gigabyte_align) {
>                   if (lowmem > 0xc0000000) {
>                       lowmem = 0xc0000000;
> @@ -139,17 +140,17 @@ static void pc_init1(MachineState *machine,
>                       warn_report("Large machine and max_ram_below_4g "
>                                   "(%" PRIu64 ") not a multiple of 1G; "
>                                   "possible bad performance.",
> -                                pcms->max_ram_below_4g);
> +                                x86ms->max_ram_below_4g);
>                   }
>               }
>           }
>   
>           if (machine->ram_size >= lowmem) {
> -            pcms->above_4g_mem_size = machine->ram_size - lowmem;
> -            pcms->below_4g_mem_size = lowmem;
> +            x86ms->above_4g_mem_size = machine->ram_size - lowmem;
> +            x86ms->below_4g_mem_size = lowmem;
>           } else {
> -            pcms->above_4g_mem_size = 0;
> -            pcms->below_4g_mem_size = machine->ram_size;
> +            x86ms->above_4g_mem_size = 0;
> +            x86ms->below_4g_mem_size = machine->ram_size;
>           }
>       }
>   
> @@ -191,19 +192,19 @@ static void pc_init1(MachineState *machine,
>       gsi_state = g_malloc0(sizeof(*gsi_state));
>       if (kvm_ioapic_in_kernel()) {
>           kvm_pc_setup_irq_routing(pcmc->pci_enabled);
> -        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
> +        x86ms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
>                                          GSI_NUM_PINS);
>       } else {
> -        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
> +        x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
>       }
>   
>       if (pcmc->pci_enabled) {
>           pci_bus = i440fx_init(host_type,
>                                 pci_type,
> -                              &i440fx_state, &piix3_devfn, &isa_bus, pcms->gsi,
> +                              &i440fx_state, &piix3_devfn, &isa_bus, x86ms->gsi,
>                                 system_memory, system_io, machine->ram_size,
> -                              pcms->below_4g_mem_size,
> -                              pcms->above_4g_mem_size,
> +                              x86ms->below_4g_mem_size,
> +                              x86ms->above_4g_mem_size,
>                                 pci_memory, ram_memory);
>           pcms->bus = pci_bus;
>       } else {
> @@ -213,7 +214,7 @@ static void pc_init1(MachineState *machine,
>                                 &error_abort);
>           no_hpet = 1;
>       }
> -    isa_bus_irqs(isa_bus, pcms->gsi);
> +    isa_bus_irqs(isa_bus, x86ms->gsi);
>   
>       if (kvm_pic_in_kernel()) {
>           i8259 = kvm_i8259_init(isa_bus);
> @@ -231,7 +232,7 @@ static void pc_init1(MachineState *machine,
>           ioapic_init_gsi(gsi_state, "i440fx");
>       }
>   
> -    pc_register_ferr_irq(pcms->gsi[13]);
> +    pc_register_ferr_irq(x86ms->gsi[13]);
>   
>       pc_vga_init(isa_bus, pcmc->pci_enabled ? pci_bus : NULL);
>   
> @@ -241,7 +242,7 @@ static void pc_init1(MachineState *machine,
>       }
>   
>       /* init basic PC hardware */
> -    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, true,
> +    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, true,
>                            (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
>                            0x4);
>   
> @@ -288,7 +289,7 @@ else {
>           smi_irq = qemu_allocate_irq(pc_acpi_smi_interrupt, first_cpu, 0);
>           /* TODO: Populate SPD eeprom data.  */
>           pcms->smbus = piix4_pm_init(pci_bus, piix3_devfn + 3, 0xb100,
> -                                    pcms->gsi[9], smi_irq,
> +                                    x86ms->gsi[9], smi_irq,
>                                       pc_machine_is_smm_enabled(pcms),
>                                       &piix4_pm);
>           smbus_eeprom_init(pcms->smbus, 8, NULL, 0);
> @@ -304,7 +305,7 @@ else {
>   
>       if (machine->nvdimms_state->is_enabled) {
>           nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
> -                               pcms->fw_cfg, OBJECT(pcms));
> +                               x86ms->fw_cfg, OBJECT(pcms));
>       }
>   }
>   
> @@ -729,7 +730,7 @@ DEFINE_I440FX_MACHINE(v1_4, "pc-i440fx-1.4", pc_compat_1_4_fn,
>   
>   static void pc_i440fx_1_3_machine_options(MachineClass *m)
>   {
> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(m);
> +    X86MachineClass *x86mc = X86_MACHINE_CLASS(m);
>       static GlobalProperty compat[] = {
>           PC_CPU_MODEL_IDS("1.3.0")
>           { "usb-tablet", "usb_version", "1" },
> @@ -740,7 +741,7 @@ static void pc_i440fx_1_3_machine_options(MachineClass *m)
>   
>       pc_i440fx_1_4_machine_options(m);
>       m->hw_version = "1.3.0";
> -    pcmc->compat_apic_id_mode = true;
> +    x86mc->compat_apic_id_mode = true;
>       compat_props_add(m->compat_props, compat, G_N_ELEMENTS(compat));
>   }
>   
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 8920bd8978..8e7beb9415 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -116,6 +116,7 @@ static void pc_q35_init(MachineState *machine)
>   {
>       PCMachineState *pcms = PC_MACHINE(machine);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(machine);
>       Q35PCIHost *q35_host;
>       PCIHostState *phb;
>       PCIBus *host_bus;
> @@ -153,27 +154,27 @@ static void pc_q35_init(MachineState *machine)
>       /* Handle the machine opt max-ram-below-4g.  It is basically doing
>        * min(qemu limit, user limit).
>        */
> -    if (!pcms->max_ram_below_4g) {
> -        pcms->max_ram_below_4g = 1ULL << 32; /* default: 4G */;
> +    if (!x86ms->max_ram_below_4g) {
> +        x86ms->max_ram_below_4g = 1ULL << 32; /* default: 4G */;
>       }
> -    if (lowmem > pcms->max_ram_below_4g) {
> -        lowmem = pcms->max_ram_below_4g;
> +    if (lowmem > x86ms->max_ram_below_4g) {
> +        lowmem = x86ms->max_ram_below_4g;
>           if (machine->ram_size - lowmem > lowmem &&
>               lowmem & (1 * GiB - 1)) {
>               warn_report("There is possibly poor performance as the ram size "
>                           " (0x%" PRIx64 ") is more then twice the size of"
>                           " max-ram-below-4g (%"PRIu64") and"
>                           " max-ram-below-4g is not a multiple of 1G.",
> -                        (uint64_t)machine->ram_size, pcms->max_ram_below_4g);
> +                        (uint64_t)machine->ram_size, x86ms->max_ram_below_4g);
>           }
>       }
>   
>       if (machine->ram_size >= lowmem) {
> -        pcms->above_4g_mem_size = machine->ram_size - lowmem;
> -        pcms->below_4g_mem_size = lowmem;
> +        x86ms->above_4g_mem_size = machine->ram_size - lowmem;
> +        x86ms->below_4g_mem_size = lowmem;
>       } else {
> -        pcms->above_4g_mem_size = 0;
> -        pcms->below_4g_mem_size = machine->ram_size;
> +        x86ms->above_4g_mem_size = 0;
> +        x86ms->below_4g_mem_size = machine->ram_size;
>       }
>   
>       if (xen_enabled()) {
> @@ -214,10 +215,10 @@ static void pc_q35_init(MachineState *machine)
>       gsi_state = g_malloc0(sizeof(*gsi_state));
>       if (kvm_ioapic_in_kernel()) {
>           kvm_pc_setup_irq_routing(pcmc->pci_enabled);
> -        pcms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
> +        x86ms->gsi = qemu_allocate_irqs(kvm_pc_gsi_handler, gsi_state,
>                                          GSI_NUM_PINS);
>       } else {
> -        pcms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
> +        x86ms->gsi = qemu_allocate_irqs(gsi_handler, gsi_state, GSI_NUM_PINS);
>       }
>   
>       /* create pci host bus */
> @@ -232,9 +233,9 @@ static void pc_q35_init(MachineState *machine)
>                                MCH_HOST_PROP_SYSTEM_MEM, NULL);
>       object_property_set_link(OBJECT(q35_host), OBJECT(system_io),
>                                MCH_HOST_PROP_IO_MEM, NULL);
> -    object_property_set_int(OBJECT(q35_host), pcms->below_4g_mem_size,
> +    object_property_set_int(OBJECT(q35_host), x86ms->below_4g_mem_size,
>                               PCI_HOST_BELOW_4G_MEM_SIZE, NULL);
> -    object_property_set_int(OBJECT(q35_host), pcms->above_4g_mem_size,
> +    object_property_set_int(OBJECT(q35_host), x86ms->above_4g_mem_size,
>                               PCI_HOST_ABOVE_4G_MEM_SIZE, NULL);
>       /* pci */
>       qdev_init_nofail(DEVICE(q35_host));
> @@ -256,7 +257,7 @@ static void pc_q35_init(MachineState *machine)
>       ich9_lpc = ICH9_LPC_DEVICE(lpc);
>       lpc_dev = DEVICE(lpc);
>       for (i = 0; i < GSI_NUM_PINS; i++) {
> -        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, pcms->gsi[i]);
> +        qdev_connect_gpio_out_named(lpc_dev, ICH9_GPIO_GSI, i, x86ms->gsi[i]);
>       }
>       pci_bus_irqs(host_bus, ich9_lpc_set_irq, ich9_lpc_map_irq, ich9_lpc,
>                    ICH9_LPC_NB_PIRQS);
> @@ -280,7 +281,7 @@ static void pc_q35_init(MachineState *machine)
>           ioapic_init_gsi(gsi_state, "q35");
>       }
>   
> -    pc_register_ferr_irq(pcms->gsi[13]);
> +    pc_register_ferr_irq(x86ms->gsi[13]);
>   
>       assert(pcms->vmport != ON_OFF_AUTO__MAX);
>       if (pcms->vmport == ON_OFF_AUTO_AUTO) {
> @@ -288,7 +289,7 @@ static void pc_q35_init(MachineState *machine)
>       }
>   
>       /* init basic PC hardware */
> -    pc_basic_device_init(isa_bus, pcms->gsi, &rtc_state, !mc->no_floppy,
> +    pc_basic_device_init(isa_bus, x86ms->gsi, &rtc_state, !mc->no_floppy,
>                            (pcms->vmport != ON_OFF_AUTO_ON), pcms->pit_enabled,
>                            0xff0104);
>   
> @@ -331,7 +332,7 @@ static void pc_q35_init(MachineState *machine)
>   
>       if (machine->nvdimms_state->is_enabled) {
>           nvdimm_init_acpi_state(machine->nvdimms_state, system_io,
> -                               pcms->fw_cfg, OBJECT(pcms));
> +                               x86ms->fw_cfg, OBJECT(pcms));
>       }
>   }
>   
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 6807bb8a22..4a8e254d69 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -64,13 +64,14 @@ uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>                                       unsigned int cpu_index)
>   {
>       MachineState *ms = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
> +    X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
>       uint32_t correct_id;
>       static bool warned;
>   
> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> +    correct_id = x86_apicid_from_cpu_idx(x86ms->smp_dies, ms->smp.cores,
>                                            ms->smp.threads, cpu_index);
> -    if (pcmc->compat_apic_id_mode) {
> +    if (x86mc->compat_apic_id_mode) {
>           if (cpu_index != correct_id && !warned && !qtest_enabled()) {
>               error_report("APIC IDs set in compatibility mode, "
>                            "CPU topology won't match the configuration");
> @@ -87,11 +88,12 @@ void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
>       Object *cpu = NULL;
>       Error *local_err = NULL;
>       CPUX86State *env = NULL;
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       cpu = object_new(MACHINE(pcms)->cpu_type);
>   
>       env = &X86_CPU(cpu)->env;
> -    env->nr_dies = pcms->smp_dies;
> +    env->nr_dies = x86ms->smp_dies;
>   
>       object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>       object_property_set_bool(cpu, true, "realized", &local_err);
> @@ -107,6 +109,7 @@ void x86_cpus_init(PCMachineState *pcms)
>       MachineState *ms = MACHINE(pcms);
>       MachineClass *mc = MACHINE_GET_CLASS(pcms);
>       PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       x86_cpu_set_default_version(pcmc->default_cpu_version);
>   
> @@ -117,8 +120,8 @@ void x86_cpus_init(PCMachineState *pcms)
>        *
>        * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
>        */
> -    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> -                                                     ms->smp.max_cpus - 1) + 1;
> +    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> +                                                      ms->smp.max_cpus - 1) + 1;
>       possible_cpus = mc->possible_cpu_arch_ids(ms);
>       for (i = 0; i < ms->smp.cpus; i++) {
>           x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> @@ -138,11 +141,11 @@ x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>   int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
>   {
>      X86CPUTopoInfo topo;
> -   PCMachineState *pcms = PC_MACHINE(ms);
> +   X86MachineState *x86ms = X86_MACHINE(ms);
>   
>      assert(idx < ms->possible_cpus->len);
>      x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> -                            pcms->smp_dies, ms->smp.cores,
> +                            x86ms->smp_dies, ms->smp.cores,
>                               ms->smp.threads, &topo);
>      return topo.pkg_id % ms->numa_state->num_nodes;
>   }
> @@ -150,6 +153,7 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
>   const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>   {
>       PCMachineState *pcms = PC_MACHINE(ms);
> +    X86MachineState *x86ms = X86_MACHINE(ms);
>       int i;
>       unsigned int max_cpus = ms->smp.max_cpus;
>   
> @@ -172,11 +176,11 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>           ms->possible_cpus->cpus[i].vcpus_count = 1;
>           ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
>           x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> -                                 pcms->smp_dies, ms->smp.cores,
> +                                 x86ms->smp_dies, ms->smp.cores,
>                                    ms->smp.threads, &topo);
>           ms->possible_cpus->cpus[i].props.has_socket_id = true;
>           ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> -        if (pcms->smp_dies > 1) {
> +        if (x86ms->smp_dies > 1) {
>               ms->possible_cpus->cpus[i].props.has_die_id = true;
>               ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
>           }
> @@ -188,6 +192,22 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>       return ms->possible_cpus;
>   }
>   
> +static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
> +{
> +    /* cpu index isn't used */
> +    CPUState *cs;
> +
> +    CPU_FOREACH(cs) {
> +        X86CPU *cpu = X86_CPU(cs);
> +
> +        if (!cpu->apic_state) {
> +            cpu_interrupt(cs, CPU_INTERRUPT_NMI);
> +        } else {
> +            apic_deliver_nmi(cpu->apic_state);
> +        }
> +    }
> +}
> +
>   static long get_file_size(FILE *f)
>   {
>       long where, size;
> @@ -324,6 +344,7 @@ void x86_load_linux(PCMachineState *pcms,
>       char *vmode;
>       MachineState *machine = MACHINE(pcms);
>       PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       struct setup_data *setup_data;
>       const char *kernel_filename = machine->kernel_filename;
>       const char *initrd_filename = machine->initrd_filename;
> @@ -392,11 +413,11 @@ void x86_load_linux(PCMachineState *pcms,
>                               initrd_filename, gerr->message);
>                       exit(1);
>                   }
> -                pcms->initrd_mapped_file = mapped_file;
> +                x86ms->initrd_mapped_file = mapped_file;
>   
>                   initrd_data = g_mapped_file_get_contents(mapped_file);
>                   initrd_size = g_mapped_file_get_length(mapped_file);
> -                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>                   if (initrd_size >= initrd_max) {
>                       fprintf(stderr, "qemu: initrd is too large, cannot support."
>                               "(max: %"PRIu32", need %"PRId64")\n",
> @@ -474,8 +495,8 @@ void x86_load_linux(PCMachineState *pcms,
>           initrd_max = 0x37ffffff;
>       }
>   
> -    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> -        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +    if (initrd_max >= x86ms->below_4g_mem_size - pcmc->acpi_data_size) {
> +        initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>       }
>   
>       fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> @@ -538,7 +559,7 @@ void x86_load_linux(PCMachineState *pcms,
>                       initrd_filename, gerr->message);
>               exit(1);
>           }
> -        pcms->initrd_mapped_file = mapped_file;
> +        x86ms->initrd_mapped_file = mapped_file;
>   
>           initrd_data = g_mapped_file_get_contents(mapped_file);
>           initrd_size = g_mapped_file_get_length(mapped_file);
> @@ -682,3 +703,91 @@ void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
>                                   (uint32_t)(-bios_size),
>                                   bios);
>   }
> +
> +static void x86_machine_get_max_ram_below_4g(Object *obj, Visitor *v,
> +                                             const char *name, void *opaque,
> +                                             Error **errp)
> +{
> +    X86MachineState *x86ms = X86_MACHINE(obj);
> +    uint64_t value = x86ms->max_ram_below_4g;
> +
> +    visit_type_size(v, name, &value, errp);
> +}
> +
> +static void x86_machine_set_max_ram_below_4g(Object *obj, Visitor *v,
> +                                             const char *name, void *opaque,
> +                                             Error **errp)
> +{
> +    X86MachineState *x86ms = X86_MACHINE(obj);
> +    Error *error = NULL;
> +    uint64_t value;
> +
> +    visit_type_size(v, name, &value, &error);
> +    if (error) {
> +        error_propagate(errp, error);
> +        return;
> +    }
> +    if (value > 4 * GiB) {
> +        error_setg(&error,
> +                   "Machine option 'max-ram-below-4g=%"PRIu64
> +                   "' expects size less than or equal to 4G", value);
> +        error_propagate(errp, error);
> +        return;
> +    }
> +
> +    if (value < 1 * MiB) {
> +        warn_report("Only %" PRIu64 " bytes of RAM below the 4GiB boundary,"
> +                    "BIOS may not work with less than 1MiB", value);
> +    }
> +
> +    x86ms->max_ram_below_4g = value;
> +}
> +
> +static void x86_machine_initfn(Object *obj)
> +{
> +    X86MachineState *x86ms = X86_MACHINE(obj);
> +
> +    x86ms->max_ram_below_4g = 0; /* use default */
> +    x86ms->smp_dies = 1;
> +}
> +
> +static void x86_machine_class_init(ObjectClass *oc, void *data)
> +{
> +    MachineClass *mc = MACHINE_CLASS(oc);
> +    X86MachineClass *x86mc = X86_MACHINE_CLASS(oc);
> +    NMIClass *nc = NMI_CLASS(oc);
> +
> +    mc->cpu_index_to_instance_props = x86_cpu_index_to_props;
> +    mc->get_default_cpu_node_id = x86_get_default_cpu_node_id;
> +    mc->possible_cpu_arch_ids = x86_possible_cpu_arch_ids;
> +    x86mc->compat_apic_id_mode = false;
> +    nc->nmi_monitor_handler = x86_nmi;
> +
> +    object_class_property_add(oc, X86_MACHINE_MAX_RAM_BELOW_4G, "size",
> +        x86_machine_get_max_ram_below_4g, x86_machine_set_max_ram_below_4g,
> +        NULL, NULL, &error_abort);
> +
> +    object_class_property_set_description(oc, X86_MACHINE_MAX_RAM_BELOW_4G,
> +        "Maximum ram below the 4G boundary (32bit boundary)", &error_abort);
> +}
> +
> +static const TypeInfo x86_machine_info = {
> +    .name = TYPE_X86_MACHINE,
> +    .parent = TYPE_MACHINE,
> +    .abstract = true,
> +    .instance_size = sizeof(X86MachineState),
> +    .instance_init = x86_machine_initfn,
> +    .class_size = sizeof(X86MachineClass),
> +    .class_init = x86_machine_class_init,
> +    .interfaces = (InterfaceInfo[]) {
> +         { TYPE_NMI },
> +         { }
> +    },
> +};
> +
> +static void x86_machine_register_types(void)
> +{
> +    type_register_static(&x86_machine_info);
> +}
> +
> +type_init(x86_machine_register_types)
> diff --git a/hw/i386/xen/xen-hvm.c b/hw/i386/xen/xen-hvm.c
> index 6b5e5bb7f5..f14d7bba4b 100644
> --- a/hw/i386/xen/xen-hvm.c
> +++ b/hw/i386/xen/xen-hvm.c
> @@ -197,10 +197,11 @@ qemu_irq *xen_interrupt_controller_init(void)
>   static void xen_ram_init(PCMachineState *pcms,
>                            ram_addr_t ram_size, MemoryRegion **ram_memory_p)
>   {
> +    X86MachineState *x86ms = X86_MACHINE(pcms);
>       MemoryRegion *sysmem = get_system_memory();
>       ram_addr_t block_len;
>       uint64_t user_lowmem = object_property_get_uint(qdev_get_machine(),
> -                                                    PC_MACHINE_MAX_RAM_BELOW_4G,
> +                                                    X86_MACHINE_MAX_RAM_BELOW_4G,
>                                                       &error_abort);
>   
>       /* Handle the machine opt max-ram-below-4g.  It is basically doing
> @@ -214,20 +215,20 @@ static void xen_ram_init(PCMachineState *pcms,
>       }
>   
>       if (ram_size >= user_lowmem) {
> -        pcms->above_4g_mem_size = ram_size - user_lowmem;
> -        pcms->below_4g_mem_size = user_lowmem;
> +        x86ms->above_4g_mem_size = ram_size - user_lowmem;
> +        x86ms->below_4g_mem_size = user_lowmem;
>       } else {
> -        pcms->above_4g_mem_size = 0;
> -        pcms->below_4g_mem_size = ram_size;
> +        x86ms->above_4g_mem_size = 0;
> +        x86ms->below_4g_mem_size = ram_size;
>       }
> -    if (!pcms->above_4g_mem_size) {
> +    if (!x86ms->above_4g_mem_size) {
>           block_len = ram_size;
>       } else {
>           /*
>            * Xen does not allocate the memory continuously, it keeps a
>            * hole of the size computed above or passed in.
>            */
> -        block_len = (1ULL << 32) + pcms->above_4g_mem_size;
> +        block_len = (1ULL << 32) + x86ms->above_4g_mem_size;
>       }
>       memory_region_init_ram(&ram_memory, NULL, "xen.ram", block_len,
>                              &error_fatal);
> @@ -244,12 +245,12 @@ static void xen_ram_init(PCMachineState *pcms,
>        */
>       memory_region_init_alias(&ram_lo, NULL, "xen.ram.lo",
>                                &ram_memory, 0xc0000,
> -                             pcms->below_4g_mem_size - 0xc0000);
> +                             x86ms->below_4g_mem_size - 0xc0000);
>       memory_region_add_subregion(sysmem, 0xc0000, &ram_lo);
> -    if (pcms->above_4g_mem_size > 0) {
> +    if (x86ms->above_4g_mem_size > 0) {
>           memory_region_init_alias(&ram_hi, NULL, "xen.ram.hi",
>                                    &ram_memory, 0x100000000ULL,
> -                                 pcms->above_4g_mem_size);
> +                                 x86ms->above_4g_mem_size);
>           memory_region_add_subregion(sysmem, 0x100000000ULL, &ram_hi);
>       }
>   }
> @@ -265,7 +266,7 @@ void xen_ram_alloc(ram_addr_t ram_addr, ram_addr_t size, MemoryRegion *mr,
>           /* RAM already populated in Xen */
>           fprintf(stderr, "%s: do not alloc "RAM_ADDR_FMT
>                   " bytes of ram at "RAM_ADDR_FMT" when runstate is INMIGRATE\n",
> -                __func__, size, ram_addr);
> +                __func__, size, ram_addr);
>           return;
>       }
>   
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index 1ede055387..ead14e1888 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -89,7 +89,7 @@ static void ioapic_entry_parse(uint64_t entry, struct ioapic_entry_info *info)
>   
>   static void ioapic_service(IOAPICCommonState *s)
>   {
> -    AddressSpace *ioapic_as = PC_MACHINE(qdev_get_machine())->ioapic_as;
> +    AddressSpace *ioapic_as = X86_MACHINE(qdev_get_machine())->ioapic_as;
>       struct ioapic_entry_info info;
>       uint8_t i;
>       uint32_t mask;
> 
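
For readers skimming the quoted setter: the range check on max-ram-below-4g can be sketched standalone, outside QOM. This is only an illustration under assumptions. DemoMachine and demo_set_max_ram_below_4g() are hypothetical names that do not exist in QEMU, the MiB/GiB macros are redefined locally, and the visitor/Error plumbing is replaced by a plain bool return and stderr messages.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Local stand-ins for the unit macros used in the quoted patch. */
#define MiB (1024ULL * 1024ULL)
#define GiB (1024ULL * MiB)

/* Hypothetical stand-in for the one X86MachineState field involved. */
typedef struct {
    uint64_t max_ram_below_4g; /* 0 means "use the machine default" */
} DemoMachine;

/*
 * Mirrors the checks in the quoted setter: values above 4 GiB are
 * rejected and leave the stored value untouched; values below 1 MiB
 * are accepted with a warning. Returns false on rejection.
 */
static bool demo_set_max_ram_below_4g(DemoMachine *m, uint64_t value)
{
    assert(m != NULL);

    if (value > 4 * GiB) {
        fprintf(stderr, "max-ram-below-4g=%" PRIu64
                " expects size less than or equal to 4G\n", value);
        return false;
    }

    if (value < 1 * MiB) {
        fprintf(stderr, "warning: only %" PRIu64
                " bytes of RAM below the 4GiB boundary\n", value);
    }

    m->max_ram_below_4g = value;
    return true;
}
```

Exactly 4 GiB is accepted because the comparison is strictly greater-than, which matches the "less than or equal to 4G" wording of the error message.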



* Re: [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState
  2019-10-04  9:37 ` [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState Sergio Lopez
@ 2019-10-04  9:51   ` Philippe Mathieu-Daudé
  0 siblings, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-04  9:51 UTC (permalink / raw)
  To: Sergio Lopez, qemu-devel
  Cc: ehabkost, mst, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

On 10/4/19 11:37 AM, Sergio Lopez wrote:
> As a last step into splitting PCMachineState and deriving
> X86MachineState from it, make the functions previously extracted from
> pc.c to x86.c independent from PCMachineState, using X86MachineState
> instead.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>

Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>

> ---
>   include/hw/i386/x86.h | 13 ++++++++----
>   hw/i386/pc.c          | 14 ++++++++-----
>   hw/i386/pc_piix.c     |  2 +-
>   hw/i386/pc_q35.c      |  2 +-
>   hw/i386/x86.c         | 49 ++++++++++++++++++++-----------------------
>   5 files changed, 43 insertions(+), 37 deletions(-)
> 
> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> index a930a7ad9d..f44359e9e9 100644
> --- a/include/hw/i386/x86.h
> +++ b/include/hw/i386/x86.h
> @@ -73,10 +73,11 @@ typedef struct {
>   #define X86_MACHINE_CLASS(class) \
>       OBJECT_CLASS_CHECK(X86MachineClass, class, TYPE_X86_MACHINE)
>   
> -uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +uint32_t x86_cpu_apic_id_from_index(X86MachineState *pcms,
>                                       unsigned int cpu_index);
> -void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
> -void x86_cpus_init(PCMachineState *pcms);
> +
> +void x86_cpu_new(X86MachineState *pcms, int64_t apic_id, Error **errp);
> +void x86_cpus_init(X86MachineState *pcms, int default_cpu_version);
>   CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
>                                                unsigned cpu_index);
>   int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
> @@ -84,6 +85,10 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
>   
>   void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
>   
> -void x86_load_linux(PCMachineState *pcms, FWCfgState *fw_cfg);
> +void x86_load_linux(X86MachineState *x86ms,
> +                    FWCfgState *fw_cfg,
> +                    int acpi_data_size,
> +                    bool pvh_enabled,
> +                    bool linuxboot_dma_enabled);
>   
>   #endif
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 0dc1420a1f..0bf93d489c 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -982,8 +982,8 @@ void pc_smp_parse(MachineState *ms, QemuOpts *opts)
>   
>   void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>   {
> -    PCMachineState *pcms = PC_MACHINE(ms);
> -    int64_t apic_id = x86_cpu_apic_id_from_index(pcms, id);
> +    X86MachineState *x86ms = X86_MACHINE(ms);
> +    int64_t apic_id = x86_cpu_apic_id_from_index(x86ms, id);
>       Error *local_err = NULL;
>   
>       if (id < 0) {
> @@ -998,7 +998,8 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>           return;
>       }
>   
> -    x86_cpu_new(PC_MACHINE(ms), apic_id, &local_err);
> +
> +    x86_cpu_new(X86_MACHINE(ms), apic_id, &local_err);
>       if (local_err) {
>           error_propagate(errp, local_err);
>           return;
> @@ -1099,6 +1100,7 @@ void xen_load_linux(PCMachineState *pcms)
>   {
>       int i;
>       FWCfgState *fw_cfg;
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>       X86MachineState *x86ms = X86_MACHINE(pcms);
>   
>       assert(MACHINE(pcms)->kernel_filename != NULL);
> @@ -1107,7 +1109,8 @@ void xen_load_linux(PCMachineState *pcms)
>       fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, x86ms->boot_cpus);
>       rom_set_fw(fw_cfg);
>   
> -    x86_load_linux(pcms, fw_cfg);
> +    x86_load_linux(x86ms, fw_cfg, pcmc->acpi_data_size,
> +                   pcmc->pvh_enabled, pcmc->linuxboot_dma_enabled);
>       for (i = 0; i < nb_option_roms; i++) {
>           assert(!strcmp(option_rom[i].name, "linuxboot.bin") ||
>                  !strcmp(option_rom[i].name, "linuxboot_dma.bin") ||
> @@ -1243,7 +1246,8 @@ void pc_memory_init(PCMachineState *pcms,
>       }
>   
>       if (linux_boot) {
> -        x86_load_linux(pcms, fw_cfg);
> +        x86_load_linux(x86ms, fw_cfg, pcmc->acpi_data_size,
> +                       pcmc->pvh_enabled, pcmc->linuxboot_dma_enabled);
>       }
>   
>       for (i = 0; i < nb_option_roms; i++) {
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index 0afa8fe6ea..a86317cdff 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -154,7 +154,7 @@ static void pc_init1(MachineState *machine,
>           }
>       }
>   
> -    x86_cpus_init(pcms);
> +    x86_cpus_init(x86ms, pcmc->default_cpu_version);
>   
>       if (kvm_enabled() && pcmc->kvmclock_enabled) {
>           kvmclock_create();
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 8e7beb9415..8bdca373d6 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -181,7 +181,7 @@ static void pc_q35_init(MachineState *machine)
>           xen_hvm_init(pcms, &ram_memory);
>       }
>   
> -    x86_cpus_init(pcms);
> +    x86_cpus_init(x86ms, pcmc->default_cpu_version);
>   
>       kvmclock_create();
>   
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> index 4a8e254d69..55944a9a02 100644
> --- a/hw/i386/x86.c
> +++ b/hw/i386/x86.c
> @@ -60,11 +60,10 @@ static size_t pvh_start_addr;
>    * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
>    * all CPUs up to max_cpus.
>    */
> -uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
>                                       unsigned int cpu_index)
>   {
> -    MachineState *ms = MACHINE(pcms);
> -    X86MachineState *x86ms = X86_MACHINE(pcms);
> +    MachineState *ms = MACHINE(x86ms);
>       X86MachineClass *x86mc = X86_MACHINE_GET_CLASS(x86ms);
>       uint32_t correct_id;
>       static bool warned;
> @@ -83,14 +82,14 @@ uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>       }
>   }
>   
> -void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> +
> +void x86_cpu_new(X86MachineState *x86ms, int64_t apic_id, Error **errp)
>   {
>       Object *cpu = NULL;
>       Error *local_err = NULL;
>       CPUX86State *env = NULL;
> -    X86MachineState *x86ms = X86_MACHINE(pcms);
>   
> -    cpu = object_new(MACHINE(pcms)->cpu_type);
> +    cpu = object_new(MACHINE(x86ms)->cpu_type);
>   
>       env = &X86_CPU(cpu)->env;
>       env->nr_dies = x86ms->smp_dies;
> @@ -102,16 +101,14 @@ void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
>       error_propagate(errp, local_err);
>   }
>   
> -void x86_cpus_init(PCMachineState *pcms)
> +void x86_cpus_init(X86MachineState *x86ms, int default_cpu_version)
>   {
>       int i;
>       const CPUArchIdList *possible_cpus;
> -    MachineState *ms = MACHINE(pcms);
> -    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> -    X86MachineState *x86ms = X86_MACHINE(pcms);
> +    MachineState *ms = MACHINE(x86ms);
> +    MachineClass *mc = MACHINE_GET_CLASS(x86ms);
>   
> -    x86_cpu_set_default_version(pcmc->default_cpu_version);
> +    x86_cpu_set_default_version(default_cpu_version);
>   
>       /* Calculates the limit to CPU APIC ID values
>        *
> @@ -120,11 +117,11 @@ void x86_cpus_init(PCMachineState *pcms)
>        *
>        * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
>        */
> -    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> +    x86ms->apic_id_limit = x86_cpu_apic_id_from_index(x86ms,
>                                                         ms->smp.max_cpus - 1) + 1;
>       possible_cpus = mc->possible_cpu_arch_ids(ms);
>       for (i = 0; i < ms->smp.cpus; i++) {
> -        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> +        x86_cpu_new(x86ms, possible_cpus->cpus[i].arch_id, &error_fatal);
>       }
>   }
>   
> @@ -152,7 +149,6 @@ int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
>   
>   const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>   {
> -    PCMachineState *pcms = PC_MACHINE(ms);
>       X86MachineState *x86ms = X86_MACHINE(ms);
>       int i;
>       unsigned int max_cpus = ms->smp.max_cpus;
> @@ -174,7 +170,7 @@ const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>   
>           ms->possible_cpus->cpus[i].type = ms->cpu_type;
>           ms->possible_cpus->cpus[i].vcpus_count = 1;
> -        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> +        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(x86ms, i);
>           x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
>                                    x86ms->smp_dies, ms->smp.cores,
>                                    ms->smp.threads, &topo);
> @@ -331,8 +327,11 @@ static bool load_elfboot(const char *kernel_filename,
>       return true;
>   }
>   
> -void x86_load_linux(PCMachineState *pcms,
> -                    FWCfgState *fw_cfg)
> +void x86_load_linux(X86MachineState *x86ms,
> +                    FWCfgState *fw_cfg,
> +                    int acpi_data_size,
> +                    bool pvh_enabled,
> +                    bool linuxboot_dma_enabled)
>   {
>       uint16_t protocol;
>       int setup_size, kernel_size, cmdline_size;
> @@ -342,9 +341,7 @@ void x86_load_linux(PCMachineState *pcms,
>       hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
>       FILE *f;
>       char *vmode;
> -    MachineState *machine = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    X86MachineState *x86ms = X86_MACHINE(pcms);
> +    MachineState *machine = MACHINE(x86ms);
>       struct setup_data *setup_data;
>       const char *kernel_filename = machine->kernel_filename;
>       const char *initrd_filename = machine->initrd_filename;
> @@ -387,7 +384,7 @@ void x86_load_linux(PCMachineState *pcms,
>            * saving the PVH entry point used by the x86/HVM direct boot ABI.
>            * If load_elfboot() is successful, populate the fw_cfg info.
>            */
> -        if (pcmc->pvh_enabled &&
> +        if (pvh_enabled &&
>               load_elfboot(kernel_filename, kernel_size,
>                            header, pvh_start_addr, fw_cfg)) {
>               fclose(f);
> @@ -417,7 +414,7 @@ void x86_load_linux(PCMachineState *pcms,
>   
>                   initrd_data = g_mapped_file_get_contents(mapped_file);
>                   initrd_size = g_mapped_file_get_length(mapped_file);
> -                initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
>                   if (initrd_size >= initrd_max) {
>                       fprintf(stderr, "qemu: initrd is too large, cannot support."
>                               "(max: %"PRIu32", need %"PRId64")\n",
> @@ -495,8 +492,8 @@ void x86_load_linux(PCMachineState *pcms,
>           initrd_max = 0x37ffffff;
>       }
>   
> -    if (initrd_max >= x86ms->below_4g_mem_size - pcmc->acpi_data_size) {
> -        initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +    if (initrd_max >= x86ms->below_4g_mem_size - acpi_data_size) {
> +        initrd_max = x86ms->below_4g_mem_size - acpi_data_size - 1;
>       }
>   
>       fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> @@ -645,7 +642,7 @@ void x86_load_linux(PCMachineState *pcms,
>   
>       option_rom[nb_option_roms].bootindex = 0;
>       option_rom[nb_option_roms].name = "linuxboot.bin";
> -    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> +    if (linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
>           option_rom[nb_option_roms].name = "linuxboot_dma.bin";
>       }
>       nb_option_roms++;
> 



* Re: [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them
  2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
  2019-10-04  9:46   ` Philippe Mathieu-Daudé
@ 2019-10-04 11:23   ` Stefano Garzarella
  2019-10-04 11:36   ` Stefano Garzarella
  2 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2019-10-04 11:23 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, mst, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	philmd, rth

On Fri, Oct 04, 2019 at 11:37:45AM +0200, Sergio Lopez wrote:
> Move x86 functions that will be shared between PC and non-PC machine
> types to x86.c, along with their helpers.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  include/hw/i386/pc.h  |   1 -
>  include/hw/i386/x86.h |  35 +++
>  hw/i386/pc.c          | 582 +----------------------------------
>  hw/i386/pc_piix.c     |   1 +
>  hw/i386/pc_q35.c      |   1 +
>  hw/i386/pc_sysfw.c    |  54 +---
>  hw/i386/x86.c         | 684 ++++++++++++++++++++++++++++++++++++++++++
>  hw/i386/Makefile.objs |   1 +
>  8 files changed, 724 insertions(+), 635 deletions(-)
>  create mode 100644 include/hw/i386/x86.h
>  create mode 100644 hw/i386/x86.c
> 

As we discussed, PVH functions in x86.c make sense to me:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Stefano


> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index d12f42e9e5..73e2847e87 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -195,7 +195,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
>  void pc_register_ferr_irq(qemu_irq irq);
>  void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
>  
> -void x86_cpus_init(PCMachineState *pcms);
>  void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
>  void pc_smp_parse(MachineState *ms, QemuOpts *opts);
>  
> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> new file mode 100644
> index 0000000000..71e2b6985d
> --- /dev/null
> +++ b/include/hw/i386/x86.h
> @@ -0,0 +1,35 @@
> +/*
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_I386_X86_H
> +#define HW_I386_X86_H
> +
> +#include "hw/boards.h"
> +
> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +                                    unsigned int cpu_index);
> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
> +void x86_cpus_init(PCMachineState *pcms);
> +CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
> +                                             unsigned cpu_index);
> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
> +
> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
> +
> +void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
> +
> +#endif
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index fd08c6704b..094db79fb0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -24,6 +24,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/char/serial.h"
>  #include "hw/char/parallel.h"
> @@ -102,9 +103,6 @@
>  
>  struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
>  
> -/* Physical Address of PVH entry point read from kernel ELF NOTE */
> -static size_t pvh_start_addr;
> -
>  GlobalProperty pc_compat_4_1[] = {};
>  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
>  
> @@ -866,478 +864,6 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>      x86_cpu_set_a20(cpu, level);
>  }
>  
> -/* Calculates initial APIC ID for a specific CPU index
> - *
> - * Currently we need to be able to calculate the APIC ID from the CPU index
> - * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
> - * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
> - * all CPUs up to max_cpus.
> - */
> -static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> -                                           unsigned int cpu_index)
> -{
> -    MachineState *ms = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    uint32_t correct_id;
> -    static bool warned;
> -
> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> -                                         ms->smp.threads, cpu_index);
> -    if (pcmc->compat_apic_id_mode) {
> -        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> -            error_report("APIC IDs set in compatibility mode, "
> -                         "CPU topology won't match the configuration");
> -            warned = true;
> -        }
> -        return cpu_index;
> -    } else {
> -        return correct_id;
> -    }
> -}
> -
> -static long get_file_size(FILE *f)
> -{
> -    long where, size;
> -
> -    /* XXX: on Unix systems, using fstat() probably makes more sense */
> -
> -    where = ftell(f);
> -    fseek(f, 0, SEEK_END);
> -    size = ftell(f);
> -    fseek(f, where, SEEK_SET);
> -
> -    return size;
> -}
> -
> -struct setup_data {
> -    uint64_t next;
> -    uint32_t type;
> -    uint32_t len;
> -    uint8_t data[0];
> -} __attribute__((packed));
> -
> -
> -/*
> - * The entry point into the kernel for PVH boot is different from
> - * the native entry point.  The PVH entry is defined by the x86/HVM
> - * direct boot ABI and is available in an ELFNOTE in the kernel binary.
> - *
> - * This function is passed to load_elf() when it is called from
> - * load_elfboot() which then additionally checks for an ELF Note of
> - * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
> - * parse the PVH entry address from the ELF Note.
> - *
> - * Due to trickery in elf_opts.h, load_elf() is actually available as
> - * load_elf32() or load_elf64() and this routine needs to be able
> - * to deal with being called as 32 or 64 bit.
> - *
> - * The address of the PVH entry point is saved to the 'pvh_start_addr'
> - * global variable.  (although the entry point is 32-bit, the kernel
> - * binary can be either 32-bit or 64-bit).
> - */
> -static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
> -{
> -    size_t *elf_note_data_addr;
> -
> -    /* Check if ELF Note header passed in is valid */
> -    if (arg1 == NULL) {
> -        return 0;
> -    }
> -
> -    if (is64) {
> -        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
> -        uint64_t nhdr_size64 = sizeof(struct elf64_note);
> -        uint64_t phdr_align = *(uint64_t *)arg2;
> -        uint64_t nhdr_namesz = nhdr64->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr64) + nhdr_size64 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    } else {
> -        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
> -        uint32_t nhdr_size32 = sizeof(struct elf32_note);
> -        uint32_t phdr_align = *(uint32_t *)arg2;
> -        uint32_t nhdr_namesz = nhdr32->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr32) + nhdr_size32 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    }
> -
> -    pvh_start_addr = *elf_note_data_addr;
> -
> -    return pvh_start_addr;
> -}
> -
> -static bool load_elfboot(const char *kernel_filename,
> -                   int kernel_file_size,
> -                   uint8_t *header,
> -                   size_t pvh_xen_start_addr,
> -                   FWCfgState *fw_cfg)
> -{
> -    uint32_t flags = 0;
> -    uint32_t mh_load_addr = 0;
> -    uint32_t elf_kernel_size = 0;
> -    uint64_t elf_entry;
> -    uint64_t elf_low, elf_high;
> -    int kernel_size;
> -
> -    if (ldl_p(header) != 0x464c457f) {
> -        return false; /* no elfboot */
> -    }
> -
> -    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
> -    flags = elf_is64 ?
> -        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
> -
> -    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
> -        error_report("elfboot unsupported flags = %x", flags);
> -        exit(1);
> -    }
> -
> -    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
> -    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
> -                           NULL, &elf_note_type, &elf_entry,
> -                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
> -                           0, 0);
> -
> -    if (kernel_size < 0) {
> -        error_report("Error while loading elf kernel");
> -        exit(1);
> -    }
> -    mh_load_addr = elf_low;
> -    elf_kernel_size = elf_high - elf_low;
> -
> -    if (pvh_start_addr == 0) {
> -        error_report("Error loading uncompressed kernel without PVH ELF Note");
> -        exit(1);
> -    }
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
> -
> -    return true;
> -}
> -
> -static void x86_load_linux(PCMachineState *pcms,
> -                           FWCfgState *fw_cfg)
> -{
> -    uint16_t protocol;
> -    int setup_size, kernel_size, cmdline_size;
> -    int dtb_size, setup_data_offset;
> -    uint32_t initrd_max;
> -    uint8_t header[8192], *setup, *kernel;
> -    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
> -    FILE *f;
> -    char *vmode;
> -    MachineState *machine = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    struct setup_data *setup_data;
> -    const char *kernel_filename = machine->kernel_filename;
> -    const char *initrd_filename = machine->initrd_filename;
> -    const char *dtb_filename = machine->dtb;
> -    const char *kernel_cmdline = machine->kernel_cmdline;
> -
> -    /* Align to 16 bytes as a paranoia measure */
> -    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
> -
> -    /* load the kernel header */
> -    f = fopen(kernel_filename, "rb");
> -    if (!f || !(kernel_size = get_file_size(f)) ||
> -        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
> -        MIN(ARRAY_SIZE(header), kernel_size)) {
> -        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
> -                kernel_filename, strerror(errno));
> -        exit(1);
> -    }
> -
> -    /* kernel protocol version */
> -#if 0
> -    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
> -#endif
> -    if (ldl_p(header+0x202) == 0x53726448) {
> -        protocol = lduw_p(header+0x206);
> -    } else {
> -        /*
> -         * This could be a multiboot kernel. If it is, let's stop treating it
> -         * like a Linux kernel.
> -         * Note: some multiboot images could be in the ELF format (the same of
> -         * PVH), so we try multiboot first since we check the multiboot magic
> -         * header before to load it.
> -         */
> -        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
> -                           kernel_cmdline, kernel_size, header)) {
> -            return;
> -        }
> -        /*
> -         * Check if the file is an uncompressed kernel file (ELF) and load it,
> -         * saving the PVH entry point used by the x86/HVM direct boot ABI.
> -         * If load_elfboot() is successful, populate the fw_cfg info.
> -         */
> -        if (pcmc->pvh_enabled &&
> -            load_elfboot(kernel_filename, kernel_size,
> -                         header, pvh_start_addr, fw_cfg)) {
> -            fclose(f);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
> -                strlen(kernel_cmdline) + 1);
> -            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
> -                             header, sizeof(header));
> -
> -            /* load initrd */
> -            if (initrd_filename) {
> -                GMappedFile *mapped_file;
> -                gsize initrd_size;
> -                gchar *initrd_data;
> -                GError *gerr = NULL;
> -
> -                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -                if (!mapped_file) {
> -                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                            initrd_filename, gerr->message);
> -                    exit(1);
> -                }
> -                pcms->initrd_mapped_file = mapped_file;
> -
> -                initrd_data = g_mapped_file_get_contents(mapped_file);
> -                initrd_size = g_mapped_file_get_length(mapped_file);
> -                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -                if (initrd_size >= initrd_max) {
> -                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                            "(max: %"PRIu32", need %"PRId64")\n",
> -                            initrd_max, (uint64_t)initrd_size);
> -                    exit(1);
> -                }
> -
> -                initrd_addr = (initrd_max - initrd_size) & ~4095;
> -
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> -                                 initrd_size);
> -            }
> -
> -            option_rom[nb_option_roms].bootindex = 0;
> -            option_rom[nb_option_roms].name = "pvh.bin";
> -            nb_option_roms++;
> -
> -            return;
> -        }
> -        protocol = 0;
> -    }
> -
> -    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
> -        /* Low kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x10000;
> -    } else if (protocol < 0x202) {
> -        /* High but ancient kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x100000;
> -    } else {
> -        /* High and recent kernel */
> -        real_addr    = 0x10000;
> -        cmdline_addr = 0x20000;
> -        prot_addr    = 0x100000;
> -    }
> -
> -#if 0
> -    fprintf(stderr,
> -            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
> -            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
> -            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
> -            real_addr,
> -            cmdline_addr,
> -            prot_addr);
> -#endif
> -
> -    /* highest address for loading the initrd */
> -    if (protocol >= 0x20c &&
> -        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> -        /*
> -         * Linux has supported initrd up to 4 GB for a very long time (2007,
> -         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
> -         * though it only sets initrd_max to 2 GB to "work around bootloader
> -         * bugs". Luckily, QEMU firmware(which does something like bootloader)
> -         * has supported this.
> -         *
> -         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
> -         * be loaded into any address.
> -         *
> -         * In addition, initrd_max is uint32_t simply because QEMU doesn't
> -         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
> -         * field).
> -         *
> -         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
> -         */
> -        initrd_max = UINT32_MAX;
> -    } else if (protocol >= 0x203) {
> -        initrd_max = ldl_p(header+0x22c);
> -    } else {
> -        initrd_max = 0x37ffffff;
> -    }
> -
> -    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> -        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -    }
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
> -    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -    if (protocol >= 0x202) {
> -        stl_p(header+0x228, cmdline_addr);
> -    } else {
> -        stw_p(header+0x20, 0xA33F);
> -        stw_p(header+0x22, cmdline_addr-real_addr);
> -    }
> -
> -    /* handle vga= parameter */
> -    vmode = strstr(kernel_cmdline, "vga=");
> -    if (vmode) {
> -        unsigned int video_mode;
> -        /* skip "vga=" */
> -        vmode += 4;
> -        if (!strncmp(vmode, "normal", 6)) {
> -            video_mode = 0xffff;
> -        } else if (!strncmp(vmode, "ext", 3)) {
> -            video_mode = 0xfffe;
> -        } else if (!strncmp(vmode, "ask", 3)) {
> -            video_mode = 0xfffd;
> -        } else {
> -            video_mode = strtol(vmode, NULL, 0);
> -        }
> -        stw_p(header+0x1fa, video_mode);
> -    }
> -
> -    /* loader type */
> -    /* High nybble = B reserved for QEMU; low nybble is revision number.
> -       If this code is substantially changed, you may want to consider
> -       incrementing the revision. */
> -    if (protocol >= 0x200) {
> -        header[0x210] = 0xB0;
> -    }
> -    /* heap */
> -    if (protocol >= 0x201) {
> -        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
> -        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
> -    }
> -
> -    /* load initrd */
> -    if (initrd_filename) {
> -        GMappedFile *mapped_file;
> -        gsize initrd_size;
> -        gchar *initrd_data;
> -        GError *gerr = NULL;
> -
> -        if (protocol < 0x200) {
> -            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
> -            exit(1);
> -        }
> -
> -        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -        if (!mapped_file) {
> -            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                    initrd_filename, gerr->message);
> -            exit(1);
> -        }
> -        pcms->initrd_mapped_file = mapped_file;
> -
> -        initrd_data = g_mapped_file_get_contents(mapped_file);
> -        initrd_size = g_mapped_file_get_length(mapped_file);
> -        if (initrd_size >= initrd_max) {
> -            fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                    "(max: %"PRIu32", need %"PRId64")\n",
> -                    initrd_max, (uint64_t)initrd_size);
> -            exit(1);
> -        }
> -
> -        initrd_addr = (initrd_max-initrd_size) & ~4095;
> -
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
> -
> -        stl_p(header+0x218, initrd_addr);
> -        stl_p(header+0x21c, initrd_size);
> -    }
> -
> -    /* load kernel and setup */
> -    setup_size = header[0x1f1];
> -    if (setup_size == 0) {
> -        setup_size = 4;
> -    }
> -    setup_size = (setup_size+1)*512;
> -    if (setup_size > kernel_size) {
> -        fprintf(stderr, "qemu: invalid kernel header\n");
> -        exit(1);
> -    }
> -    kernel_size -= setup_size;
> -
> -    setup  = g_malloc(setup_size);
> -    kernel = g_malloc(kernel_size);
> -    fseek(f, 0, SEEK_SET);
> -    if (fread(setup, 1, setup_size, f) != setup_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    fclose(f);
> -
> -    /* append dtb to kernel */
> -    if (dtb_filename) {
> -        if (protocol < 0x209) {
> -            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
> -            exit(1);
> -        }
> -
> -        dtb_size = get_image_size(dtb_filename);
> -        if (dtb_size <= 0) {
> -            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
> -                    dtb_filename, strerror(errno));
> -            exit(1);
> -        }
> -
> -        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
> -        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
> -        kernel = g_realloc(kernel, kernel_size);
> -
> -        stq_p(header+0x250, prot_addr + setup_data_offset);
> -
> -        setup_data = (struct setup_data *)(kernel + setup_data_offset);
> -        setup_data->next = 0;
> -        setup_data->type = cpu_to_le32(SETUP_DTB);
> -        setup_data->len = cpu_to_le32(dtb_size);
> -
> -        load_image_size(dtb_filename, setup_data->data, dtb_size);
> -    }
> -
> -    memcpy(setup, header, MIN(sizeof(header), setup_size));
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> -
> -    option_rom[nb_option_roms].bootindex = 0;
> -    option_rom[nb_option_roms].name = "linuxboot.bin";
> -    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> -        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> -    }
> -    nb_option_roms++;
> -}
> -
>  #define NE2000_NB_MAX 6
>  
>  static const int ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360,
> @@ -1374,24 +900,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>      }
>  }
>  
> -static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> -{
> -    Object *cpu = NULL;
> -    Error *local_err = NULL;
> -    CPUX86State *env = NULL;
> -
> -    cpu = object_new(MACHINE(pcms)->cpu_type);
> -
> -    env = &X86_CPU(cpu)->env;
> -    env->nr_dies = pcms->smp_dies;
> -
> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> -    object_property_set_bool(cpu, true, "realized", &local_err);
> -
> -    object_unref(cpu);
> -    error_propagate(errp, local_err);
> -}
> -
>  /*
>   * This function is very similar to smp_parse()
>   * in hw/core/machine.c but includes CPU die support.
> @@ -1497,31 +1005,6 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>      }
>  }
>  
> -void x86_cpus_init(PCMachineState *pcms)
> -{
> -    int i;
> -    const CPUArchIdList *possible_cpus;
> -    MachineState *ms = MACHINE(pcms);
> -    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> -
> -    x86_cpu_set_default_version(pcmc->default_cpu_version);
> -
> -    /* Calculates the limit to CPU APIC ID values
> -     *
> -     * Limit for the APIC ID value, so that all
> -     * CPU APIC IDs are < pcms->apic_id_limit.
> -     *
> -     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
> -     */
> -    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> -                                                     ms->smp.max_cpus - 1) + 1;
> -    possible_cpus = mc->possible_cpu_arch_ids(ms);
> -    for (i = 0; i < ms->smp.cpus; i++) {
> -        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> -    }
> -}
> -
>  static void rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count)
>  {
>      if (cpus_count > 0xff) {
> @@ -2677,69 +2160,6 @@ static void pc_machine_wakeup(MachineState *machine)
>      cpu_synchronize_all_post_reset();
>  }
>  
> -static CpuInstanceProperties
> -x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> -{
> -    MachineClass *mc = MACHINE_GET_CLASS(ms);
> -    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> -
> -    assert(cpu_index < possible_cpus->len);
> -    return possible_cpus->cpus[cpu_index].props;
> -}
> -
> -static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
> -{
> -   X86CPUTopoInfo topo;
> -   PCMachineState *pcms = PC_MACHINE(ms);
> -
> -   assert(idx < ms->possible_cpus->len);
> -   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> -                            pcms->smp_dies, ms->smp.cores,
> -                            ms->smp.threads, &topo);
> -   return topo.pkg_id % ms->numa_state->num_nodes;
> -}
> -
> -static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
> -{
> -    PCMachineState *pcms = PC_MACHINE(ms);
> -    int i;
> -    unsigned int max_cpus = ms->smp.max_cpus;
> -
> -    if (ms->possible_cpus) {
> -        /*
> -         * make sure that max_cpus hasn't changed since the first use, i.e.
> -         * -smp hasn't been parsed after it
> -        */
> -        assert(ms->possible_cpus->len == max_cpus);
> -        return ms->possible_cpus;
> -    }
> -
> -    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> -                                  sizeof(CPUArchId) * max_cpus);
> -    ms->possible_cpus->len = max_cpus;
> -    for (i = 0; i < ms->possible_cpus->len; i++) {
> -        X86CPUTopoInfo topo;
> -
> -        ms->possible_cpus->cpus[i].type = ms->cpu_type;
> -        ms->possible_cpus->cpus[i].vcpus_count = 1;
> -        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> -        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> -                                 pcms->smp_dies, ms->smp.cores,
> -                                 ms->smp.threads, &topo);
> -        ms->possible_cpus->cpus[i].props.has_socket_id = true;
> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> -        if (pcms->smp_dies > 1) {
> -            ms->possible_cpus->cpus[i].props.has_die_id = true;
> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> -        }
> -        ms->possible_cpus->cpus[i].props.has_core_id = true;
> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> -        ms->possible_cpus->cpus[i].props.has_thread_id = true;
> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> -    }
> -    return ms->possible_cpus;
> -}
> -
>  static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
>  {
>      /* cpu index isn't used */
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index de09e076cd..1396451abf 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -27,6 +27,7 @@
>  
>  #include "qemu/units.h"
>  #include "hw/loader.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/i386/apic.h"
>  #include "hw/display/ramfb.h"
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 894989b64e..8920bd8978 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -41,6 +41,7 @@
>  #include "hw/pci-host/q35.h"
>  #include "hw/qdev-properties.h"
>  #include "exec/address-spaces.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/i386/ich9.h"
>  #include "hw/i386/amd_iommu.h"
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 28cb1f63c9..69b79851be 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -31,6 +31,7 @@
>  #include "qemu/option.h"
>  #include "qemu/units.h"
>  #include "hw/sysbus.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/loader.h"
>  #include "hw/qdev-properties.h"
> @@ -211,59 +212,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
>      }
>  }
>  
> -static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> -{
> -    char *filename;
> -    MemoryRegion *bios, *isa_bios;
> -    int bios_size, isa_bios_size;
> -    int ret;
> -
> -    /* BIOS load */
> -    if (bios_name == NULL) {
> -        bios_name = BIOS_FILENAME;
> -    }
> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> -    if (filename) {
> -        bios_size = get_image_size(filename);
> -    } else {
> -        bios_size = -1;
> -    }
> -    if (bios_size <= 0 ||
> -        (bios_size % 65536) != 0) {
> -        goto bios_error;
> -    }
> -    bios = g_malloc(sizeof(*bios));
> -    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(bios, true);
> -    }
> -    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
> -    if (ret != 0) {
> -    bios_error:
> -        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
> -        exit(1);
> -    }
> -    g_free(filename);
> -
> -    /* map the last 128KB of the BIOS in ISA space */
> -    isa_bios_size = MIN(bios_size, 128 * KiB);
> -    isa_bios = g_malloc(sizeof(*isa_bios));
> -    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
> -                             bios_size - isa_bios_size, isa_bios_size);
> -    memory_region_add_subregion_overlap(rom_memory,
> -                                        0x100000 - isa_bios_size,
> -                                        isa_bios,
> -                                        1);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(isa_bios, true);
> -    }
> -
> -    /* map all the bios at the top of memory */
> -    memory_region_add_subregion(rom_memory,
> -                                (uint32_t)(-bios_size),
> -                                bios);
> -}
> -
>  void pc_system_firmware_init(PCMachineState *pcms,
>                               MemoryRegion *rom_memory)
>  {
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> new file mode 100644
> index 0000000000..6807bb8a22
> --- /dev/null
> +++ b/hw/i386/x86.c
> @@ -0,0 +1,684 @@
> +/*
> + * Copyright (c) 2003-2004 Fabrice Bellard
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/option.h"
> +#include "qemu/cutils.h"
> +#include "qemu/units.h"
> +#include "qemu-common.h"
> +#include "qapi/error.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi/qapi-visit-common.h"
> +#include "qapi/visitor.h"
> +#include "sysemu/qtest.h"
> +#include "sysemu/numa.h"
> +#include "sysemu/replay.h"
> +#include "sysemu/sysemu.h"
> +
> +#include "hw/i386/x86.h"
> +#include "hw/i386/pc.h"
> +#include "target/i386/cpu.h"
> +#include "hw/i386/topology.h"
> +#include "hw/i386/fw_cfg.h"
> +
> +#include "hw/acpi/cpu_hotplug.h"
> +#include "hw/nmi.h"
> +#include "hw/loader.h"
> +#include "multiboot.h"
> +#include "elf.h"
> +#include "standard-headers/asm-x86/bootparam.h"
> +
> +#define BIOS_FILENAME "bios.bin"
> +
> +/* Physical Address of PVH entry point read from kernel ELF NOTE */
> +static size_t pvh_start_addr;
> +
> +/* Calculates initial APIC ID for a specific CPU index
> + *
> + * Currently we need to be able to calculate the APIC ID from the CPU index
> + * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
> + * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
> + * all CPUs up to max_cpus.
> + */
> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +                                    unsigned int cpu_index)
> +{
> +    MachineState *ms = MACHINE(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    uint32_t correct_id;
> +    static bool warned;
> +
> +    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> +                                         ms->smp.threads, cpu_index);
> +    if (pcmc->compat_apic_id_mode) {
> +        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> +            error_report("APIC IDs set in compatibility mode, "
> +                         "CPU topology won't match the configuration");
> +            warned = true;
> +        }
> +        return cpu_index;
> +    } else {
> +        return correct_id;
> +    }
> +}
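
For readers following the move: the reason the APIC ID can differ from the CPU index (and hence why compat_apic_id_mode warns) is that each topology level gets its own bit field, so non-power-of-two counts leave holes. Below is a minimal sketch of that packing, assuming the usual x86 layout of package/die/core/thread fields. The names id_bit_width() and sketch_apicid_from_cpu_idx() are made up for illustration; the real helper is x86_apicid_from_cpu_idx() in hw/i386/topology.h and may differ in detail.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bits needed to encode IDs in the range [0, count). */
static unsigned id_bit_width(unsigned count)
{
    unsigned width = 0;

    assert(count >= 1);
    while ((1u << width) < count) {
        width++;
    }
    return width;
}

/*
 * Hypothetical stand-in for x86_apicid_from_cpu_idx(): split the linear
 * CPU index into package/die/core/thread IDs, then pack each ID into
 * its own bit field.
 */
static uint32_t sketch_apicid_from_cpu_idx(unsigned nr_dies,
                                           unsigned nr_cores,
                                           unsigned nr_threads,
                                           unsigned cpu_index)
{
    unsigned smt_id = cpu_index % nr_threads;
    unsigned core_id = (cpu_index / nr_threads) % nr_cores;
    unsigned die_id = (cpu_index / (nr_threads * nr_cores)) % nr_dies;
    unsigned pkg_id = cpu_index / (nr_threads * nr_cores * nr_dies);

    /* Each field starts where the previous (lower) one ends. */
    unsigned core_shift = id_bit_width(nr_threads);
    unsigned die_shift = core_shift + id_bit_width(nr_cores);
    unsigned pkg_shift = die_shift + id_bit_width(nr_dies);

    return (pkg_id << pkg_shift) | (die_id << die_shift) |
           (core_id << core_shift) | smt_id;
}
```

With 1 die, 3 cores and 2 threads, the core field needs 2 bits, so CPU index 6 (the first thread of the second package) gets APIC ID 8 rather than 6 under this scheme. In compatibility mode the function above would return 6 instead.
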
> +
> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> +{
> +    Object *cpu = NULL;
> +    Error *local_err = NULL;
> +    CPUX86State *env = NULL;
> +
> +    cpu = object_new(MACHINE(pcms)->cpu_type);
> +
> +    env = &X86_CPU(cpu)->env;
> +    env->nr_dies = pcms->smp_dies;
> +
> +    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> +    object_property_set_bool(cpu, true, "realized", &local_err);
> +
> +    object_unref(cpu);
> +    error_propagate(errp, local_err);
> +}
> +
> +void x86_cpus_init(PCMachineState *pcms)
> +{
> +    int i;
> +    const CPUArchIdList *possible_cpus;
> +    MachineState *ms = MACHINE(pcms);
> +    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> +
> +    x86_cpu_set_default_version(pcmc->default_cpu_version);
> +
> +    /* Calculates the limit to CPU APIC ID values
> +     *
> +     * Limit for the APIC ID value, so that all
> +     * CPU APIC IDs are < pcms->apic_id_limit.
> +     *
> +     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
> +     */
> +    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> +                                                     ms->smp.max_cpus - 1) + 1;
> +    possible_cpus = mc->possible_cpu_arch_ids(ms);
> +    for (i = 0; i < ms->smp.cpus; i++) {
> +        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> +    }
> +}
> +
> +CpuInstanceProperties
> +x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> +{
> +    MachineClass *mc = MACHINE_GET_CLASS(ms);
> +    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> +
> +    assert(cpu_index < possible_cpus->len);
> +    return possible_cpus->cpus[cpu_index].props;
> +}
> +
> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
> +{
> +   X86CPUTopoInfo topo;
> +   PCMachineState *pcms = PC_MACHINE(ms);
> +
> +   assert(idx < ms->possible_cpus->len);
> +   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> +                            pcms->smp_dies, ms->smp.cores,
> +                            ms->smp.threads, &topo);
> +   return topo.pkg_id % ms->numa_state->num_nodes;
> +}
> +
> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
> +{
> +    PCMachineState *pcms = PC_MACHINE(ms);
> +    int i;
> +    unsigned int max_cpus = ms->smp.max_cpus;
> +
> +    if (ms->possible_cpus) {
> +        /*
> +         * make sure that max_cpus hasn't changed since the first use, i.e.
> +         * -smp hasn't been parsed after it
> +        */
> +        assert(ms->possible_cpus->len == max_cpus);
> +        return ms->possible_cpus;
> +    }
> +
> +    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> +                                  sizeof(CPUArchId) * max_cpus);
> +    ms->possible_cpus->len = max_cpus;
> +    for (i = 0; i < ms->possible_cpus->len; i++) {
> +        X86CPUTopoInfo topo;
> +
> +        ms->possible_cpus->cpus[i].type = ms->cpu_type;
> +        ms->possible_cpus->cpus[i].vcpus_count = 1;
> +        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> +        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> +                                 pcms->smp_dies, ms->smp.cores,
> +                                 ms->smp.threads, &topo);
> +        ms->possible_cpus->cpus[i].props.has_socket_id = true;
> +        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> +        if (pcms->smp_dies > 1) {
> +            ms->possible_cpus->cpus[i].props.has_die_id = true;
> +            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> +        }
> +        ms->possible_cpus->cpus[i].props.has_core_id = true;
> +        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> +        ms->possible_cpus->cpus[i].props.has_thread_id = true;
> +        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> +    }
> +    return ms->possible_cpus;
> +}
> +
> +static long get_file_size(FILE *f)
> +{
> +    long where, size;
> +
> +    /* XXX: on Unix systems, using fstat() probably makes more sense */
> +
> +    where = ftell(f);
> +    fseek(f, 0, SEEK_END);
> +    size = ftell(f);
> +    fseek(f, where, SEEK_SET);
> +
> +    return size;
> +}
> +
> +struct setup_data {
> +    uint64_t next;
> +    uint32_t type;
> +    uint32_t len;
> +    uint8_t data[0];
> +} __attribute__((packed));
> +
> +/*
> + * The entry point into the kernel for PVH boot is different from
> + * the native entry point.  The PVH entry is defined by the x86/HVM
> + * direct boot ABI and is available in an ELFNOTE in the kernel binary.
> + *
> + * This function is passed to load_elf() when it is called from
> + * load_elfboot() which then additionally checks for an ELF Note of
> + * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
> + * parse the PVH entry address from the ELF Note.
> + *
> + * Due to trickery in elf_opts.h, load_elf() is actually available as
> + * load_elf32() or load_elf64() and this routine needs to be able
> + * to deal with being called as 32 or 64 bit.
> + *
> + * The address of the PVH entry point is saved to the 'pvh_start_addr'
> + * global variable.  (although the entry point is 32-bit, the kernel
> + * binary can be either 32-bit or 64-bit).
> + */
> +static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
> +{
> +    size_t *elf_note_data_addr;
> +
> +    /* Check if ELF Note header passed in is valid */
> +    if (arg1 == NULL) {
> +        return 0;
> +    }
> +
> +    if (is64) {
> +        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
> +        uint64_t nhdr_size64 = sizeof(struct elf64_note);
> +        uint64_t phdr_align = *(uint64_t *)arg2;
> +        uint64_t nhdr_namesz = nhdr64->n_namesz;
> +
> +        elf_note_data_addr =
> +            ((void *)nhdr64) + nhdr_size64 +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> +    } else {
> +        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
> +        uint32_t nhdr_size32 = sizeof(struct elf32_note);
> +        uint32_t phdr_align = *(uint32_t *)arg2;
> +        uint32_t nhdr_namesz = nhdr32->n_namesz;
> +
> +        elf_note_data_addr =
> +            ((void *)nhdr32) + nhdr_size32 +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> +    }
> +
> +    pvh_start_addr = *elf_note_data_addr;
> +
> +    return pvh_start_addr;
> +}
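
To make the pointer arithmetic above easier to follow: an ELF note is a three-word header, followed by the name padded to the alignment, followed by the descriptor, and the PVH entry address is the descriptor payload. Here is a self-contained sketch of the same offset computation. It assumes the note header has the same three 32-bit fields for both ELF classes, that the entry is stored as a 32-bit value, and that the buffer is in host byte order; struct sketch_elf_note and both function names are hypothetical, not QEMU API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed note header layout: three 32-bit words. */
struct sketch_elf_note {
    uint32_t n_namesz;  /* length of the name, including its NUL */
    uint32_t n_descsz;  /* length of the descriptor payload */
    uint32_t n_type;    /* note type */
};

/* Round n up to the next multiple of align. */
static size_t sketch_align_up(size_t n, size_t align)
{
    assert(align != 0);
    return (n + align - 1) / align * align;
}

/*
 * Return the 32-bit value stored in the note descriptor, or 0 if the
 * note pointer is NULL or the descriptor is too short to hold it.
 */
static uint32_t sketch_read_pvh_entry(const uint8_t *note, size_t align)
{
    struct sketch_elf_note hdr;
    uint32_t entry;

    if (note == NULL) {
        return 0;
    }
    memcpy(&hdr, note, sizeof(hdr));
    if (hdr.n_descsz < sizeof(entry)) {
        return 0;
    }
    /* Descriptor starts after the header and the padded name. */
    memcpy(&entry,
           note + sizeof(hdr) + sketch_align_up(hdr.n_namesz, align),
           sizeof(entry));
    return entry;
}
```

The sketch copies with memcpy() rather than dereferencing a cast pointer, which sidesteps alignment concerns and reads exactly four bytes; that is a choice made for the illustration, not a description of the patch.
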
> +
> +static bool load_elfboot(const char *kernel_filename,
> +                   int kernel_file_size,
> +                   uint8_t *header,
> +                   size_t pvh_xen_start_addr,
> +                   FWCfgState *fw_cfg)
> +{
> +    uint32_t flags = 0;
> +    uint32_t mh_load_addr = 0;
> +    uint32_t elf_kernel_size = 0;
> +    uint64_t elf_entry;
> +    uint64_t elf_low, elf_high;
> +    int kernel_size;
> +
> +    if (ldl_p(header) != 0x464c457f) {
> +        return false; /* no elfboot */
> +    }
> +
> +    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
> +    flags = elf_is64 ?
> +        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
> +
> +    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
> +        error_report("elfboot unsupported flags = %x", flags);
> +        exit(1);
> +    }
> +
> +    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
> +    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
> +                           NULL, &elf_note_type, &elf_entry,
> +                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
> +                           0, 0);
> +
> +    if (kernel_size < 0) {
> +        error_report("Error while loading elf kernel");
> +        exit(1);
> +    }
> +    mh_load_addr = elf_low;
> +    elf_kernel_size = elf_high - elf_low;
> +
> +    if (pvh_start_addr == 0) {
> +        error_report("Error loading uncompressed kernel without PVH ELF Note");
> +        exit(1);
> +    }
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
> +
> +    return true;
> +}
> +
> +void x86_load_linux(PCMachineState *pcms,
> +                    FWCfgState *fw_cfg)
> +{
> +    uint16_t protocol;
> +    int setup_size, kernel_size, cmdline_size;
> +    int dtb_size, setup_data_offset;
> +    uint32_t initrd_max;
> +    uint8_t header[8192], *setup, *kernel;
> +    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
> +    FILE *f;
> +    char *vmode;
> +    MachineState *machine = MACHINE(pcms);
> +    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> +    struct setup_data *setup_data;
> +    const char *kernel_filename = machine->kernel_filename;
> +    const char *initrd_filename = machine->initrd_filename;
> +    const char *dtb_filename = machine->dtb;
> +    const char *kernel_cmdline = machine->kernel_cmdline;
> +
> +    /* Align to 16 bytes as a paranoia measure */
> +    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
> +
> +    /* load the kernel header */
> +    f = fopen(kernel_filename, "rb");
> +    if (!f || !(kernel_size = get_file_size(f)) ||
> +        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
> +        MIN(ARRAY_SIZE(header), kernel_size)) {
> +        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
> +                kernel_filename, strerror(errno));
> +        exit(1);
> +    }
> +
> +    /* kernel protocol version */
> +#if 0
> +    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
> +#endif
> +    if (ldl_p(header+0x202) == 0x53726448) {
> +        protocol = lduw_p(header+0x206);
> +    } else {
> +        /*
> +         * This could be a multiboot kernel. If it is, let's stop treating it
> +         * like a Linux kernel.
> +         * Note: some multiboot images could be in the ELF format (the same of
> +         * PVH), so we try multiboot first since we check the multiboot magic
> +         * header before to load it.
> +         */
> +        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
> +                           kernel_cmdline, kernel_size, header)) {
> +            return;
> +        }
> +        /*
> +         * Check if the file is an uncompressed kernel file (ELF) and load it,
> +         * saving the PVH entry point used by the x86/HVM direct boot ABI.
> +         * If load_elfboot() is successful, populate the fw_cfg info.
> +         */
> +        if (pcmc->pvh_enabled &&
> +            load_elfboot(kernel_filename, kernel_size,
> +                         header, pvh_start_addr, fw_cfg)) {
> +            fclose(f);
> +
> +            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
> +                strlen(kernel_cmdline) + 1);
> +            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> +
> +            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
> +            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
> +                             header, sizeof(header));
> +
> +            /* load initrd */
> +            if (initrd_filename) {
> +                GMappedFile *mapped_file;
> +                gsize initrd_size;
> +                gchar *initrd_data;
> +                GError *gerr = NULL;
> +
> +                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> +                if (!mapped_file) {
> +                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                            initrd_filename, gerr->message);
> +                    exit(1);
> +                }
> +                pcms->initrd_mapped_file = mapped_file;
> +
> +                initrd_data = g_mapped_file_get_contents(mapped_file);
> +                initrd_size = g_mapped_file_get_length(mapped_file);
> +                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                if (initrd_size >= initrd_max) {
> +                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                            "(max: %"PRIu32", need %"PRId64")\n",
> +                            initrd_max, (uint64_t)initrd_size);
> +                    exit(1);
> +                }
> +
> +                initrd_addr = (initrd_max - initrd_size) & ~4095;
> +
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> +                                 initrd_size);
> +            }
> +
> +            option_rom[nb_option_roms].bootindex = 0;
> +            option_rom[nb_option_roms].name = "pvh.bin";
> +            nb_option_roms++;
> +
> +            return;
> +        }
> +        protocol = 0;
> +    }
> +
> +    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
> +        /* Low kernel */
> +        real_addr    = 0x90000;
> +        cmdline_addr = 0x9a000 - cmdline_size;
> +        prot_addr    = 0x10000;
> +    } else if (protocol < 0x202) {
> +        /* High but ancient kernel */
> +        real_addr    = 0x90000;
> +        cmdline_addr = 0x9a000 - cmdline_size;
> +        prot_addr    = 0x100000;
> +    } else {
> +        /* High and recent kernel */
> +        real_addr    = 0x10000;
> +        cmdline_addr = 0x20000;
> +        prot_addr    = 0x100000;
> +    }
> +
> +#if 0
> +    fprintf(stderr,
> +            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
> +            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
> +            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
> +            real_addr,
> +            cmdline_addr,
> +            prot_addr);
> +#endif
> +
> +    /* highest address for loading the initrd */
> +    if (protocol >= 0x20c &&
> +        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> +        /*
> +         * Linux has supported initrd up to 4 GB for a very long time (2007,
> +         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
> +         * though it only sets initrd_max to 2 GB to "work around bootloader
> +         * bugs". Luckily, QEMU firmware(which does something like bootloader)
> +         * has supported this.
> +         *
> +         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
> +         * be loaded into any address.
> +         *
> +         * In addition, initrd_max is uint32_t simply because QEMU doesn't
> +         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
> +         * field).
> +         *
> +         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
> +         */
> +        initrd_max = UINT32_MAX;
> +    } else if (protocol >= 0x203) {
> +        initrd_max = ldl_p(header+0x22c);
> +    } else {
> +        initrd_max = 0x37ffffff;
> +    }
> +
> +    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> +        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +    }
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
> +    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> +
> +    if (protocol >= 0x202) {
> +        stl_p(header+0x228, cmdline_addr);
> +    } else {
> +        stw_p(header+0x20, 0xA33F);
> +        stw_p(header+0x22, cmdline_addr-real_addr);
> +    }
> +
> +    /* handle vga= parameter */
> +    vmode = strstr(kernel_cmdline, "vga=");
> +    if (vmode) {
> +        unsigned int video_mode;
> +        /* skip "vga=" */
> +        vmode += 4;
> +        if (!strncmp(vmode, "normal", 6)) {
> +            video_mode = 0xffff;
> +        } else if (!strncmp(vmode, "ext", 3)) {
> +            video_mode = 0xfffe;
> +        } else if (!strncmp(vmode, "ask", 3)) {
> +            video_mode = 0xfffd;
> +        } else {
> +            video_mode = strtol(vmode, NULL, 0);
> +        }
> +        stw_p(header+0x1fa, video_mode);
> +    }
> +
> +    /* loader type */
> +    /* High nybble = B reserved for QEMU; low nybble is revision number.
> +       If this code is substantially changed, you may want to consider
> +       incrementing the revision. */
> +    if (protocol >= 0x200) {
> +        header[0x210] = 0xB0;
> +    }
> +    /* heap */
> +    if (protocol >= 0x201) {
> +        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
> +        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
> +    }
> +
> +    /* load initrd */
> +    if (initrd_filename) {
> +        GMappedFile *mapped_file;
> +        gsize initrd_size;
> +        gchar *initrd_data;
> +        GError *gerr = NULL;
> +
> +        if (protocol < 0x200) {
> +            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
> +            exit(1);
> +        }
> +
> +        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> +        if (!mapped_file) {
> +            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                    initrd_filename, gerr->message);
> +            exit(1);
> +        }
> +        pcms->initrd_mapped_file = mapped_file;
> +
> +        initrd_data = g_mapped_file_get_contents(mapped_file);
> +        initrd_size = g_mapped_file_get_length(mapped_file);
> +        if (initrd_size >= initrd_max) {
> +            fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                    "(max: %"PRIu32", need %"PRId64")\n",
> +                    initrd_max, (uint64_t)initrd_size);
> +            exit(1);
> +        }
> +
> +        initrd_addr = (initrd_max-initrd_size) & ~4095;
> +
> +        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
> +
> +        stl_p(header+0x218, initrd_addr);
> +        stl_p(header+0x21c, initrd_size);
> +    }
> +
> +    /* load kernel and setup */
> +    setup_size = header[0x1f1];
> +    if (setup_size == 0) {
> +        setup_size = 4;
> +    }
> +    setup_size = (setup_size+1)*512;
> +    if (setup_size > kernel_size) {
> +        fprintf(stderr, "qemu: invalid kernel header\n");
> +        exit(1);
> +    }
> +    kernel_size -= setup_size;
> +
> +    setup  = g_malloc(setup_size);
> +    kernel = g_malloc(kernel_size);
> +    fseek(f, 0, SEEK_SET);
> +    if (fread(setup, 1, setup_size, f) != setup_size) {
> +        fprintf(stderr, "fread() failed\n");
> +        exit(1);
> +    }
> +    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
> +        fprintf(stderr, "fread() failed\n");
> +        exit(1);
> +    }
> +    fclose(f);
> +
> +    /* append dtb to kernel */
> +    if (dtb_filename) {
> +        if (protocol < 0x209) {
> +            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
> +            exit(1);
> +        }
> +
> +        dtb_size = get_image_size(dtb_filename);
> +        if (dtb_size <= 0) {
> +            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
> +                    dtb_filename, strerror(errno));
> +            exit(1);
> +        }
> +
> +        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
> +        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
> +        kernel = g_realloc(kernel, kernel_size);
> +
> +        stq_p(header+0x250, prot_addr + setup_data_offset);
> +
> +        setup_data = (struct setup_data *)(kernel + setup_data_offset);
> +        setup_data->next = 0;
> +        setup_data->type = cpu_to_le32(SETUP_DTB);
> +        setup_data->len = cpu_to_le32(dtb_size);
> +
> +        load_image_size(dtb_filename, setup_data->data, dtb_size);
> +    }
> +
> +    memcpy(setup, header, MIN(sizeof(header), setup_size));
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
> +    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
> +
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
> +    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> +    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> +
> +    option_rom[nb_option_roms].bootindex = 0;
> +    option_rom[nb_option_roms].name = "linuxboot.bin";
> +    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> +        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> +    }
> +    nb_option_roms++;
> +}
> +
> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> +{
> +    char *filename;
> +    MemoryRegion *bios, *isa_bios;
> +    int bios_size, isa_bios_size;
> +    int ret;
> +
> +    /* BIOS load */
> +    if (bios_name == NULL) {
> +        bios_name = BIOS_FILENAME;
> +    }
> +    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> +    if (filename) {
> +        bios_size = get_image_size(filename);
> +    } else {
> +        bios_size = -1;
> +    }
> +    if (bios_size <= 0 ||
> +        (bios_size % 65536) != 0) {
> +        goto bios_error;
> +    }
> +    bios = g_malloc(sizeof(*bios));
> +    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
> +    if (!isapc_ram_fw) {
> +        memory_region_set_readonly(bios, true);
> +    }
> +    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
> +    if (ret != 0) {
> +    bios_error:
> +        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
> +        exit(1);
> +    }
> +    g_free(filename);
> +
> +    /* map the last 128KB of the BIOS in ISA space */
> +    isa_bios_size = MIN(bios_size, 128 * KiB);
> +    isa_bios = g_malloc(sizeof(*isa_bios));
> +    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
> +                             bios_size - isa_bios_size, isa_bios_size);
> +    memory_region_add_subregion_overlap(rom_memory,
> +                                        0x100000 - isa_bios_size,
> +                                        isa_bios,
> +                                        1);
> +    if (!isapc_ram_fw) {
> +        memory_region_set_readonly(isa_bios, true);
> +    }
> +
> +    /* map all the bios at the top of memory */
> +    memory_region_add_subregion(rom_memory,
> +                                (uint32_t)(-bios_size),
> +                                bios);
> +}
> diff --git a/hw/i386/Makefile.objs b/hw/i386/Makefile.objs
> index d3374e0831..7ed80a4853 100644
> --- a/hw/i386/Makefile.objs
> +++ b/hw/i386/Makefile.objs
> @@ -1,5 +1,6 @@
>  obj-$(CONFIG_KVM) += kvm/
>  obj-y += e820_memory_layout.o multiboot.o
> +obj-y += x86.o
>  obj-y += pc.o
>  obj-$(CONFIG_I440FX) += pc_piix.o
>  obj-$(CONFIG_Q35) += pc_q35.o
> -- 
> 2.21.0
> 
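
For anyone checking the address arithmetic in the quoted x86_load_linux()
and x86_bios_rom_init(), here is a minimal standalone sketch of three of the
computations. The helper names (setup_bytes, initrd_load_addr, bios_base,
check_examples) and the sample values are hypothetical and are not part of
the patch; only the expressions themselves are taken from the quoted code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of the real-mode part of a bzImage,
 * from the setup_sects byte at header[0x1f1]. A value of 0 means 4, and
 * one extra 512-byte sector accounts for the boot sector itself.
 */
static inline uint32_t setup_bytes(uint8_t setup_sects)
{
    if (setup_sects == 0) {
        setup_sects = 4;
    }
    return ((uint32_t)setup_sects + 1) * 512;
}

/*
 * Hypothetical helper: place the initrd as high as possible below
 * initrd_max, rounded down to a 4 KiB boundary. The caller must already
 * have rejected initrd_size >= initrd_max, as the quoted code does.
 */
static inline uint32_t initrd_load_addr(uint32_t initrd_max,
                                        uint32_t initrd_size)
{
    return (initrd_max - initrd_size) & ~UINT32_C(4095);
}

/*
 * Hypothetical helper: base address that makes a BIOS image of
 * bios_size bytes end exactly at 4 GiB; this mirrors the
 * (uint32_t)(-bios_size) expression in the quoted code.
 */
static inline uint32_t bios_base(uint32_t bios_size)
{
    return UINT32_C(0) - bios_size;
}

/* Sample values chosen for illustration only. */
static inline void check_examples(void)
{
    assert(setup_bytes(0) == 5 * 512);
    assert(initrd_load_addr(0x0fffffff, 0x1234) == 0x0fffe000);
    assert(bios_base(256 * 1024) == 0xfffc0000);
}
```

With these sample values the initrd ends at 0x0ffff234, below the assumed
0x0fffffff limit, and a 256 KiB BIOS image is mapped at 0xfffc0000.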


* Re: [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines
  2019-10-04  9:37 ` [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines Sergio Lopez
  2019-10-04  9:46   ` Philippe Mathieu-Daudé
@ 2019-10-04 11:24   ` Stefano Garzarella
  1 sibling, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2019-10-04 11:24 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, mst, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	philmd, rth

On Fri, Oct 04, 2019 at 11:37:44AM +0200, Sergio Lopez wrote:
> The following functions are named *pc* but are not specific to the PC
> machine; they are generic to the x86 architecture. Rename them:
> 
>   load_linux                 -> x86_load_linux
>   pc_new_cpu                 -> x86_cpu_new
>   pc_cpus_init               -> x86_cpus_init
>   pc_cpu_index_to_props      -> x86_cpu_index_to_props
>   pc_get_default_cpu_node_id -> x86_get_default_cpu_node_id
>   pc_possible_cpu_arch_ids   -> x86_possible_cpu_arch_ids
>   old_pc_system_rom_init     -> x86_bios_rom_init
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  include/hw/i386/pc.h |  2 +-
>  hw/i386/pc.c         | 28 ++++++++++++++--------------
>  hw/i386/pc_piix.c    |  2 +-
>  hw/i386/pc_q35.c     |  2 +-
>  hw/i386/pc_sysfw.c   |  6 +++---
>  5 files changed, 20 insertions(+), 20 deletions(-)
> 

LGTM!

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>

Thanks,
Stefano



* Re: [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them
  2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
  2019-10-04  9:46   ` Philippe Mathieu-Daudé
  2019-10-04 11:23   ` Stefano Garzarella
@ 2019-10-04 11:36   ` Stefano Garzarella
  2019-10-07 13:46     ` Sergio Lopez
  2 siblings, 1 reply; 31+ messages in thread
From: Stefano Garzarella @ 2019-10-04 11:36 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, mst, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	philmd, rth

On Fri, Oct 04, 2019 at 11:37:45AM +0200, Sergio Lopez wrote:
> Move x86 functions that will be shared between PC and non-PC machine
> types to x86.c, along with their helpers.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  include/hw/i386/pc.h  |   1 -
>  include/hw/i386/x86.h |  35 +++
>  hw/i386/pc.c          | 582 +----------------------------------
>  hw/i386/pc_piix.c     |   1 +
>  hw/i386/pc_q35.c      |   1 +
>  hw/i386/pc_sysfw.c    |  54 +---
>  hw/i386/x86.c         | 684 ++++++++++++++++++++++++++++++++++++++++++
>  hw/i386/Makefile.objs |   1 +
>  8 files changed, 724 insertions(+), 635 deletions(-)
>  create mode 100644 include/hw/i386/x86.h
>  create mode 100644 hw/i386/x86.c
> 
> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
> index d12f42e9e5..73e2847e87 100644
> --- a/include/hw/i386/pc.h
> +++ b/include/hw/i386/pc.h
> @@ -195,7 +195,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
>  void pc_register_ferr_irq(qemu_irq irq);
>  void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
>  
> -void x86_cpus_init(PCMachineState *pcms);
>  void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
>  void pc_smp_parse(MachineState *ms, QemuOpts *opts);
>  
> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
> new file mode 100644
> index 0000000000..71e2b6985d
> --- /dev/null
> +++ b/include/hw/i386/x86.h
> @@ -0,0 +1,35 @@
> +/*
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms and conditions of the GNU General Public License,
> + * version 2 or later, as published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope it will be useful, but WITHOUT
> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
> + * more details.
> + *
> + * You should have received a copy of the GNU General Public License along with
> + * this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#ifndef HW_I386_X86_H
> +#define HW_I386_X86_H
> +
> +#include "hw/boards.h"
> +
> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> +                                    unsigned int cpu_index);
> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
> +void x86_cpus_init(PCMachineState *pcms);
> +CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
> +                                             unsigned cpu_index);
> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
> +
> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
> +
> +void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
> +
> +#endif
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index fd08c6704b..094db79fb0 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -24,6 +24,7 @@
>  
>  #include "qemu/osdep.h"
>  #include "qemu/units.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/char/serial.h"
>  #include "hw/char/parallel.h"
> @@ -102,9 +103,6 @@
>  
>  struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
>  
> -/* Physical Address of PVH entry point read from kernel ELF NOTE */
> -static size_t pvh_start_addr;
> -
>  GlobalProperty pc_compat_4_1[] = {};
>  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
>  
> @@ -866,478 +864,6 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>      x86_cpu_set_a20(cpu, level);
>  }
>  
> -/* Calculates initial APIC ID for a specific CPU index
> - *
> - * Currently we need to be able to calculate the APIC ID from the CPU index
> - * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
> - * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
> - * all CPUs up to max_cpus.
> - */
> -static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
> -                                           unsigned int cpu_index)
> -{
> -    MachineState *ms = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    uint32_t correct_id;
> -    static bool warned;
> -
> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
> -                                         ms->smp.threads, cpu_index);
> -    if (pcmc->compat_apic_id_mode) {
> -        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
> -            error_report("APIC IDs set in compatibility mode, "
> -                         "CPU topology won't match the configuration");
> -            warned = true;
> -        }
> -        return cpu_index;
> -    } else {
> -        return correct_id;
> -    }
> -}
> -
> -static long get_file_size(FILE *f)
> -{
> -    long where, size;
> -
> -    /* XXX: on Unix systems, using fstat() probably makes more sense */
> -
> -    where = ftell(f);
> -    fseek(f, 0, SEEK_END);
> -    size = ftell(f);
> -    fseek(f, where, SEEK_SET);
> -
> -    return size;
> -}
> -
> -struct setup_data {
> -    uint64_t next;
> -    uint32_t type;
> -    uint32_t len;
> -    uint8_t data[0];
> -} __attribute__((packed));
> -
> -
> -/*
> - * The entry point into the kernel for PVH boot is different from
> - * the native entry point.  The PVH entry is defined by the x86/HVM
> - * direct boot ABI and is available in an ELFNOTE in the kernel binary.
> - *
> - * This function is passed to load_elf() when it is called from
> - * load_elfboot() which then additionally checks for an ELF Note of
> - * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
> - * parse the PVH entry address from the ELF Note.
> - *
> - * Due to trickery in elf_opts.h, load_elf() is actually available as
> - * load_elf32() or load_elf64() and this routine needs to be able
> - * to deal with being called as 32 or 64 bit.
> - *
> - * The address of the PVH entry point is saved to the 'pvh_start_addr'
> - * global variable.  (although the entry point is 32-bit, the kernel
> - * binary can be either 32-bit or 64-bit).
> - */
> -static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
> -{
> -    size_t *elf_note_data_addr;
> -
> -    /* Check if ELF Note header passed in is valid */
> -    if (arg1 == NULL) {
> -        return 0;
> -    }
> -
> -    if (is64) {
> -        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
> -        uint64_t nhdr_size64 = sizeof(struct elf64_note);
> -        uint64_t phdr_align = *(uint64_t *)arg2;
> -        uint64_t nhdr_namesz = nhdr64->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr64) + nhdr_size64 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    } else {
> -        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
> -        uint32_t nhdr_size32 = sizeof(struct elf32_note);
> -        uint32_t phdr_align = *(uint32_t *)arg2;
> -        uint32_t nhdr_namesz = nhdr32->n_namesz;
> -
> -        elf_note_data_addr =
> -            ((void *)nhdr32) + nhdr_size32 +
> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
> -    }
> -
> -    pvh_start_addr = *elf_note_data_addr;
> -
> -    return pvh_start_addr;
> -}
> -
> -static bool load_elfboot(const char *kernel_filename,
> -                   int kernel_file_size,
> -                   uint8_t *header,
> -                   size_t pvh_xen_start_addr,
> -                   FWCfgState *fw_cfg)
> -{
> -    uint32_t flags = 0;
> -    uint32_t mh_load_addr = 0;
> -    uint32_t elf_kernel_size = 0;
> -    uint64_t elf_entry;
> -    uint64_t elf_low, elf_high;
> -    int kernel_size;
> -
> -    if (ldl_p(header) != 0x464c457f) {
> -        return false; /* no elfboot */
> -    }
> -
> -    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
> -    flags = elf_is64 ?
> -        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
> -
> -    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
> -        error_report("elfboot unsupported flags = %x", flags);
> -        exit(1);
> -    }
> -
> -    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
> -    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
> -                           NULL, &elf_note_type, &elf_entry,
> -                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
> -                           0, 0);
> -
> -    if (kernel_size < 0) {
> -        error_report("Error while loading elf kernel");
> -        exit(1);
> -    }
> -    mh_load_addr = elf_low;
> -    elf_kernel_size = elf_high - elf_low;
> -
> -    if (pvh_start_addr == 0) {
> -        error_report("Error loading uncompressed kernel without PVH ELF Note");
> -        exit(1);
> -    }
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
> -
> -    return true;
> -}
> -
> -static void x86_load_linux(PCMachineState *pcms,
> -                           FWCfgState *fw_cfg)
> -{
> -    uint16_t protocol;
> -    int setup_size, kernel_size, cmdline_size;
> -    int dtb_size, setup_data_offset;
> -    uint32_t initrd_max;
> -    uint8_t header[8192], *setup, *kernel;
> -    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
> -    FILE *f;
> -    char *vmode;
> -    MachineState *machine = MACHINE(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
> -    struct setup_data *setup_data;
> -    const char *kernel_filename = machine->kernel_filename;
> -    const char *initrd_filename = machine->initrd_filename;
> -    const char *dtb_filename = machine->dtb;
> -    const char *kernel_cmdline = machine->kernel_cmdline;
> -
> -    /* Align to 16 bytes as a paranoia measure */
> -    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
> -
> -    /* load the kernel header */
> -    f = fopen(kernel_filename, "rb");
> -    if (!f || !(kernel_size = get_file_size(f)) ||
> -        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
> -        MIN(ARRAY_SIZE(header), kernel_size)) {
> -        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
> -                kernel_filename, strerror(errno));
> -        exit(1);
> -    }
> -
> -    /* kernel protocol version */
> -#if 0
> -    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
> -#endif
> -    if (ldl_p(header+0x202) == 0x53726448) {
> -        protocol = lduw_p(header+0x206);
> -    } else {
> -        /*
> -         * This could be a multiboot kernel. If it is, let's stop treating it
> -         * like a Linux kernel.
> -         * Note: some multiboot images could be in the ELF format (the same of
> -         * PVH), so we try multiboot first since we check the multiboot magic
> -         * header before to load it.
> -         */
> -        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
> -                           kernel_cmdline, kernel_size, header)) {
> -            return;
> -        }
> -        /*
> -         * Check if the file is an uncompressed kernel file (ELF) and load it,
> -         * saving the PVH entry point used by the x86/HVM direct boot ABI.
> -         * If load_elfboot() is successful, populate the fw_cfg info.
> -         */
> -        if (pcmc->pvh_enabled &&
> -            load_elfboot(kernel_filename, kernel_size,
> -                         header, pvh_start_addr, fw_cfg)) {
> -            fclose(f);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
> -                strlen(kernel_cmdline) + 1);
> -            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
> -                             header, sizeof(header));
> -
> -            /* load initrd */
> -            if (initrd_filename) {
> -                GMappedFile *mapped_file;
> -                gsize initrd_size;
> -                gchar *initrd_data;
> -                GError *gerr = NULL;
> -
> -                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -                if (!mapped_file) {
> -                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                            initrd_filename, gerr->message);
> -                    exit(1);
> -                }
> -                pcms->initrd_mapped_file = mapped_file;
> -
> -                initrd_data = g_mapped_file_get_contents(mapped_file);
> -                initrd_size = g_mapped_file_get_length(mapped_file);
> -                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -                if (initrd_size >= initrd_max) {
> -                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                            "(max: %"PRIu32", need %"PRId64")\n",
> -                            initrd_max, (uint64_t)initrd_size);
> -                    exit(1);
> -                }
> -
> -                initrd_addr = (initrd_max - initrd_size) & ~4095;
> -
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> -                                 initrd_size);
> -            }
> -
> -            option_rom[nb_option_roms].bootindex = 0;
> -            option_rom[nb_option_roms].name = "pvh.bin";
> -            nb_option_roms++;
> -
> -            return;
> -        }
> -        protocol = 0;
> -    }
> -
> -    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
> -        /* Low kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x10000;
> -    } else if (protocol < 0x202) {
> -        /* High but ancient kernel */
> -        real_addr    = 0x90000;
> -        cmdline_addr = 0x9a000 - cmdline_size;
> -        prot_addr    = 0x100000;
> -    } else {
> -        /* High and recent kernel */
> -        real_addr    = 0x10000;
> -        cmdline_addr = 0x20000;
> -        prot_addr    = 0x100000;
> -    }
> -
> -#if 0
> -    fprintf(stderr,
> -            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
> -            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
> -            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
> -            real_addr,
> -            cmdline_addr,
> -            prot_addr);
> -#endif
> -
> -    /* highest address for loading the initrd */
> -    if (protocol >= 0x20c &&
> -        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
> -        /*
> -         * Linux has supported initrd up to 4 GB for a very long time (2007,
> -         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
> -         * though it only sets initrd_max to 2 GB to "work around bootloader
> -         * bugs". Luckily, QEMU firmware(which does something like bootloader)
> -         * has supported this.
> -         *
> -         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
> -         * be loaded into any address.
> -         *
> -         * In addition, initrd_max is uint32_t simply because QEMU doesn't
> -         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
> -         * field).
> -         *
> -         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
> -         */
> -        initrd_max = UINT32_MAX;
> -    } else if (protocol >= 0x203) {
> -        initrd_max = ldl_p(header+0x22c);
> -    } else {
> -        initrd_max = 0x37ffffff;
> -    }
> -
> -    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
> -        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> -    }
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
> -    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
> -
> -    if (protocol >= 0x202) {
> -        stl_p(header+0x228, cmdline_addr);
> -    } else {
> -        stw_p(header+0x20, 0xA33F);
> -        stw_p(header+0x22, cmdline_addr-real_addr);
> -    }
> -
> -    /* handle vga= parameter */
> -    vmode = strstr(kernel_cmdline, "vga=");
> -    if (vmode) {
> -        unsigned int video_mode;
> -        /* skip "vga=" */
> -        vmode += 4;
> -        if (!strncmp(vmode, "normal", 6)) {
> -            video_mode = 0xffff;
> -        } else if (!strncmp(vmode, "ext", 3)) {
> -            video_mode = 0xfffe;
> -        } else if (!strncmp(vmode, "ask", 3)) {
> -            video_mode = 0xfffd;
> -        } else {
> -            video_mode = strtol(vmode, NULL, 0);
> -        }
> -        stw_p(header+0x1fa, video_mode);
> -    }
> -
> -    /* loader type */
> -    /* High nybble = B reserved for QEMU; low nybble is revision number.
> -       If this code is substantially changed, you may want to consider
> -       incrementing the revision. */
> -    if (protocol >= 0x200) {
> -        header[0x210] = 0xB0;
> -    }
> -    /* heap */
> -    if (protocol >= 0x201) {
> -        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
> -        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
> -    }
> -
> -    /* load initrd */
> -    if (initrd_filename) {
> -        GMappedFile *mapped_file;
> -        gsize initrd_size;
> -        gchar *initrd_data;
> -        GError *gerr = NULL;
> -
> -        if (protocol < 0x200) {
> -            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
> -            exit(1);
> -        }
> -
> -        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
> -        if (!mapped_file) {
> -            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> -                    initrd_filename, gerr->message);
> -            exit(1);
> -        }
> -        pcms->initrd_mapped_file = mapped_file;
> -
> -        initrd_data = g_mapped_file_get_contents(mapped_file);
> -        initrd_size = g_mapped_file_get_length(mapped_file);
> -        if (initrd_size >= initrd_max) {
> -            fprintf(stderr, "qemu: initrd is too large, cannot support."
> -                    "(max: %"PRIu32", need %"PRId64")\n",
> -                    initrd_max, (uint64_t)initrd_size);
> -            exit(1);
> -        }
> -
> -        initrd_addr = (initrd_max-initrd_size) & ~4095;
> -
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> -        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
> -
> -        stl_p(header+0x218, initrd_addr);
> -        stl_p(header+0x21c, initrd_size);
> -    }
> -
> -    /* load kernel and setup */
> -    setup_size = header[0x1f1];
> -    if (setup_size == 0) {
> -        setup_size = 4;
> -    }
> -    setup_size = (setup_size+1)*512;
> -    if (setup_size > kernel_size) {
> -        fprintf(stderr, "qemu: invalid kernel header\n");
> -        exit(1);
> -    }
> -    kernel_size -= setup_size;
> -
> -    setup  = g_malloc(setup_size);
> -    kernel = g_malloc(kernel_size);
> -    fseek(f, 0, SEEK_SET);
> -    if (fread(setup, 1, setup_size, f) != setup_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
> -        fprintf(stderr, "fread() failed\n");
> -        exit(1);
> -    }
> -    fclose(f);
> -
> -    /* append dtb to kernel */
> -    if (dtb_filename) {
> -        if (protocol < 0x209) {
> -            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
> -            exit(1);
> -        }
> -
> -        dtb_size = get_image_size(dtb_filename);
> -        if (dtb_size <= 0) {
> -            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
> -                    dtb_filename, strerror(errno));
> -            exit(1);
> -        }
> -
> -        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
> -        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
> -        kernel = g_realloc(kernel, kernel_size);
> -
> -        stq_p(header+0x250, prot_addr + setup_data_offset);
> -
> -        setup_data = (struct setup_data *)(kernel + setup_data_offset);
> -        setup_data->next = 0;
> -        setup_data->type = cpu_to_le32(SETUP_DTB);
> -        setup_data->len = cpu_to_le32(dtb_size);
> -
> -        load_image_size(dtb_filename, setup_data->data, dtb_size);
> -    }
> -
> -    memcpy(setup, header, MIN(sizeof(header), setup_size));
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
> -
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
> -
> -    option_rom[nb_option_roms].bootindex = 0;
> -    option_rom[nb_option_roms].name = "linuxboot.bin";
> -    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
> -        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
> -    }
> -    nb_option_roms++;
> -}
> -
>  #define NE2000_NB_MAX 6
>  
>  static const int ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360,
> @@ -1374,24 +900,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>      }
>  }
>  
> -static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
> -{
> -    Object *cpu = NULL;
> -    Error *local_err = NULL;
> -    CPUX86State *env = NULL;
> -
> -    cpu = object_new(MACHINE(pcms)->cpu_type);
> -
> -    env = &X86_CPU(cpu)->env;
> -    env->nr_dies = pcms->smp_dies;
> -
> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
> -    object_property_set_bool(cpu, true, "realized", &local_err);
> -
> -    object_unref(cpu);
> -    error_propagate(errp, local_err);
> -}
> -
>  /*
>   * This function is very similar to smp_parse()
>   * in hw/core/machine.c but includes CPU die support.
> @@ -1497,31 +1005,6 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>      }
>  }
>  
> -void x86_cpus_init(PCMachineState *pcms)
> -{
> -    int i;
> -    const CPUArchIdList *possible_cpus;
> -    MachineState *ms = MACHINE(pcms);
> -    MachineClass *mc = MACHINE_GET_CLASS(pcms);
> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
> -
> -    x86_cpu_set_default_version(pcmc->default_cpu_version);
> -
> -    /* Calculates the limit to CPU APIC ID values
> -     *
> -     * Limit for the APIC ID value, so that all
> -     * CPU APIC IDs are < pcms->apic_id_limit.
> -     *
> -     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
> -     */
> -    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
> -                                                     ms->smp.max_cpus - 1) + 1;
> -    possible_cpus = mc->possible_cpu_arch_ids(ms);
> -    for (i = 0; i < ms->smp.cpus; i++) {
> -        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
> -    }
> -}
> -
>  static void rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count)
>  {
>      if (cpus_count > 0xff) {
> @@ -2677,69 +2160,6 @@ static void pc_machine_wakeup(MachineState *machine)
>      cpu_synchronize_all_post_reset();
>  }
>  
> -static CpuInstanceProperties
> -x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
> -{
> -    MachineClass *mc = MACHINE_GET_CLASS(ms);
> -    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
> -
> -    assert(cpu_index < possible_cpus->len);
> -    return possible_cpus->cpus[cpu_index].props;
> -}
> -
> -static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
> -{
> -   X86CPUTopoInfo topo;
> -   PCMachineState *pcms = PC_MACHINE(ms);
> -
> -   assert(idx < ms->possible_cpus->len);
> -   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
> -                            pcms->smp_dies, ms->smp.cores,
> -                            ms->smp.threads, &topo);
> -   return topo.pkg_id % ms->numa_state->num_nodes;
> -}
> -
> -static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
> -{
> -    PCMachineState *pcms = PC_MACHINE(ms);
> -    int i;
> -    unsigned int max_cpus = ms->smp.max_cpus;
> -
> -    if (ms->possible_cpus) {
> -        /*
> -         * make sure that max_cpus hasn't changed since the first use, i.e.
> -         * -smp hasn't been parsed after it
> -        */
> -        assert(ms->possible_cpus->len == max_cpus);
> -        return ms->possible_cpus;
> -    }
> -
> -    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
> -                                  sizeof(CPUArchId) * max_cpus);
> -    ms->possible_cpus->len = max_cpus;
> -    for (i = 0; i < ms->possible_cpus->len; i++) {
> -        X86CPUTopoInfo topo;
> -
> -        ms->possible_cpus->cpus[i].type = ms->cpu_type;
> -        ms->possible_cpus->cpus[i].vcpus_count = 1;
> -        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
> -        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
> -                                 pcms->smp_dies, ms->smp.cores,
> -                                 ms->smp.threads, &topo);
> -        ms->possible_cpus->cpus[i].props.has_socket_id = true;
> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
> -        if (pcms->smp_dies > 1) {
> -            ms->possible_cpus->cpus[i].props.has_die_id = true;
> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
> -        }
> -        ms->possible_cpus->cpus[i].props.has_core_id = true;
> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
> -        ms->possible_cpus->cpus[i].props.has_thread_id = true;
> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
> -    }
> -    return ms->possible_cpus;
> -}
> -
>  static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
>  {
>      /* cpu index isn't used */
> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
> index de09e076cd..1396451abf 100644
> --- a/hw/i386/pc_piix.c
> +++ b/hw/i386/pc_piix.c
> @@ -27,6 +27,7 @@
>  
>  #include "qemu/units.h"
>  #include "hw/loader.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/i386/apic.h"
>  #include "hw/display/ramfb.h"
> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
> index 894989b64e..8920bd8978 100644
> --- a/hw/i386/pc_q35.c
> +++ b/hw/i386/pc_q35.c
> @@ -41,6 +41,7 @@
>  #include "hw/pci-host/q35.h"
>  #include "hw/qdev-properties.h"
>  #include "exec/address-spaces.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/i386/ich9.h"
>  #include "hw/i386/amd_iommu.h"
> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
> index 28cb1f63c9..69b79851be 100644
> --- a/hw/i386/pc_sysfw.c
> +++ b/hw/i386/pc_sysfw.c
> @@ -31,6 +31,7 @@
>  #include "qemu/option.h"
>  #include "qemu/units.h"
>  #include "hw/sysbus.h"
> +#include "hw/i386/x86.h"
>  #include "hw/i386/pc.h"
>  #include "hw/loader.h"
>  #include "hw/qdev-properties.h"
> @@ -211,59 +212,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
>      }
>  }
>  
> -static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
> -{
> -    char *filename;
> -    MemoryRegion *bios, *isa_bios;
> -    int bios_size, isa_bios_size;
> -    int ret;
> -
> -    /* BIOS load */
> -    if (bios_name == NULL) {
> -        bios_name = BIOS_FILENAME;
> -    }
> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
> -    if (filename) {
> -        bios_size = get_image_size(filename);
> -    } else {
> -        bios_size = -1;
> -    }
> -    if (bios_size <= 0 ||
> -        (bios_size % 65536) != 0) {
> -        goto bios_error;
> -    }
> -    bios = g_malloc(sizeof(*bios));
> -    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(bios, true);
> -    }
> -    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
> -    if (ret != 0) {
> -    bios_error:
> -        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
> -        exit(1);
> -    }
> -    g_free(filename);
> -
> -    /* map the last 128KB of the BIOS in ISA space */
> -    isa_bios_size = MIN(bios_size, 128 * KiB);
> -    isa_bios = g_malloc(sizeof(*isa_bios));
> -    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
> -                             bios_size - isa_bios_size, isa_bios_size);
> -    memory_region_add_subregion_overlap(rom_memory,
> -                                        0x100000 - isa_bios_size,
> -                                        isa_bios,
> -                                        1);
> -    if (!isapc_ram_fw) {
> -        memory_region_set_readonly(isa_bios, true);
> -    }
> -
> -    /* map all the bios at the top of memory */
> -    memory_region_add_subregion(rom_memory,
> -                                (uint32_t)(-bios_size),
> -                                bios);
> -}
> -
>  void pc_system_firmware_init(PCMachineState *pcms,
>                               MemoryRegion *rom_memory)
>  {
> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
> new file mode 100644
> index 0000000000..6807bb8a22
> --- /dev/null
> +++ b/hw/i386/x86.c
> @@ -0,0 +1,684 @@
> +/*
> + * Copyright (c) 2003-2004 Fabrice Bellard
> + * Copyright (c) 2019 Red Hat, Inc.
> + *
> + * Permission is hereby granted, free of charge, to any person obtaining a copy
> + * of this software and associated documentation files (the "Software"), to deal
> + * in the Software without restriction, including without limitation the rights
> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
> + * copies of the Software, and to permit persons to whom the Software is
> + * furnished to do so, subject to the following conditions:
> + *
> + * The above copyright notice and this permission notice shall be included in
> + * all copies or substantial portions of the Software.
> + *
> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
> + * THE SOFTWARE.
> + */
> +#include "qemu/osdep.h"
> +#include "qemu/error-report.h"
> +#include "qemu/option.h"
> +#include "qemu/cutils.h"
> +#include "qemu/units.h"
> +#include "qemu-common.h"
> +#include "qapi/error.h"
> +#include "qapi/qmp/qerror.h"
> +#include "qapi/qapi-visit-common.h"
> +#include "qapi/visitor.h"
> +#include "sysemu/qtest.h"
> +#include "sysemu/numa.h"
> +#include "sysemu/replay.h"
> +#include "sysemu/sysemu.h"
> +
> +#include "hw/i386/x86.h"
> +#include "hw/i386/pc.h"

Just a note, could we remove the pc.h inclusion here?




* Re: [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule
  2019-10-04  9:37 ` [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule Sergio Lopez
@ 2019-10-04 11:50   ` Stefano Garzarella
  0 siblings, 0 replies; 31+ messages in thread
From: Stefano Garzarella @ 2019-10-04 11:50 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, mst, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	philmd, rth

On Fri, Oct 04, 2019 at 11:37:50AM +0200, Sergio Lopez wrote:
> qboot is a minimalist x86 firmware for booting Linux kernels. It does
> the minimum amount of work required for the task, and it's able to
> boot both PVH images and bzImages without relying on option roms.
> 
> These characteristics make it an ideal companion for the microvm
> machine type.
> 
> Signed-off-by: Sergio Lopez <slp@redhat.com>
> ---
>  .gitmodules              |   3 +++
>  pc-bios/bios-microvm.bin | Bin 0 -> 65536 bytes
>  roms/Makefile            |   6 ++++++
>  roms/qboot               |   1 +
>  4 files changed, 10 insertions(+)
>  create mode 100755 pc-bios/bios-microvm.bin
>  create mode 160000 roms/qboot
> 
> diff --git a/.gitmodules b/.gitmodules
> index c5c474169d..19792c9a11 100644
> --- a/.gitmodules
> +++ b/.gitmodules
> @@ -58,3 +58,6 @@
>  [submodule "roms/opensbi"]
>  	path = roms/opensbi
>  	url = 	https://git.qemu.org/git/opensbi.git
> +[submodule "roms/qboot"]
> +	path = roms/qboot
> +	url = https://github.com/bonzini/qboot
> diff --git a/pc-bios/bios-microvm.bin b/pc-bios/bios-microvm.bin
> new file mode 100755

Other ROMs have 644 permissions; should we align with them, or doesn't it matter?

Anyway this patch LGTM:

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (9 preceding siblings ...)
  2019-10-04  9:37 ` [PATCH v6 10/10] hw/i386: Introduce the " Sergio Lopez
@ 2019-10-04 13:57 ` Michael S. Tsirkin
  2019-10-04 17:21 ` no-reply
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 31+ messages in thread
From: Michael S. Tsirkin @ 2019-10-04 13:57 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
> Microvm is a machine type inspired by Firecracker and constructed
> after its machine model.
> 
> It's a minimalist machine type without PCI nor ACPI support, designed
> for short-lived guests. Microvm also establishes a baseline for
> benchmarking and optimizing both QEMU and guest operating systems,
> since it is optimized for both boot time and footprint.
> 
> ---
> 
> Changelog
> v6:
>  - Some style fixes (Philippe Mathieu-Daudé)
>  - Fix a documentation bug stating that LAPIC was in userspace (Paolo
>    Bonzini)
>  - Update Xen HVM code after X86MachineState introduction (Philippe
>    Mathieu-Daudé)
>  - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
>    (Philippe Mathieu-Daudé)
> 
> v5:
>  - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
>    functions" (Philippe Mathieu-Daudé)
>  - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
>    functions" (Stefano Garzarella)
>  - Split X86MachineState introduction into smaller patches (Philippe
>    Mathieu-Daudé)
>  - Change option-roms to x-option-roms and kernel-cmdline to
>    auto-kernel-cmdline (Paolo Bonzini)
>  - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
>  - Some fixes to the documentation (Paolo Bonzini)
>  - Switch documentation format from txt to rst (Peter Maydell)
>  - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
>    Bonzini)
> 
> v4:
>  - This is a complete rewrite of the whole patchset, with a focus on
>    reusing as much existing code as possible to ease the maintenance burden
>    and making the machine type as compatible as possible by default. As
>    a result, the number of lines dedicated specifically to microvm is
>    383 (code lines measured by "cloc") and, with the default
>    configuration, it's now able to boot both PVH ELF images and
>    bzImages with either SeaBIOS or qboot.
> 
> v3:
>   - Add initrd support (thanks Stefano).
> 
> v2:
>   - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
>   - Simplify machine definition (thanks Eduardo).
>   - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
>   - Add a patch to factorize PVH-related functions.
>   - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).
> 
> ---
> Sergio Lopez (10):
>   hw/virtio: Factorize virtio-mmio headers
>   hw/i386/pc: rename functions shared with non-PC machines
>   hw/i386/pc: move shared x86 functions to x86.c and export them
>   hw/i386: split PCMachineState deriving X86MachineState from it
>   hw/i386: make x86.c independent from PCMachineState
>   fw_cfg: add "modify" functions for all types
>   hw/intc/apic: reject pic ints if isa_pic == NULL
>   roms: add microvm-bios (qboot) as binary and git submodule
>   docs/microvm.rst: document the new microvm machine type
>   hw/i386: Introduce the microvm machine type
> 
>  docs/microvm.rst                 |  98 ++++
>  default-configs/i386-softmmu.mak |   1 +
>  include/hw/i386/microvm.h        |  83 ++++
>  include/hw/i386/pc.h             |  28 +-
>  include/hw/i386/x86.h            |  94 ++++
>  include/hw/nvram/fw_cfg.h        |  42 ++
>  include/hw/virtio/virtio-mmio.h  |  73 +++
>  hw/acpi/cpu_hotplug.c            |  10 +-
>  hw/i386/acpi-build.c             |  29 +-
>  hw/i386/amd_iommu.c              |   3 +-
>  hw/i386/intel_iommu.c            |   3 +-
>  hw/i386/microvm.c                | 574 ++++++++++++++++++++++
>  hw/i386/pc.c                     | 780 +++---------------------------
>  hw/i386/pc_piix.c                |  46 +-
>  hw/i386/pc_q35.c                 |  38 +-
>  hw/i386/pc_sysfw.c               |  58 +--
>  hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
>  hw/i386/xen/xen-hvm.c            |  23 +-
>  hw/intc/apic.c                   |   2 +-
>  hw/intc/ioapic.c                 |   2 +-
>  hw/nvram/fw_cfg.c                |  29 ++
>  hw/virtio/virtio-mmio.c          |  48 +-
>  .gitmodules                      |   3 +
>  hw/i386/Kconfig                  |   4 +
>  hw/i386/Makefile.objs            |   2 +
>  pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
>  roms/Makefile                    |   6 +
>  roms/qboot                       |   1 +

So I guess we want to add the new files to the MAINTAINERS entry for x86 machines.

Otherwise looks good.
Which tree is this going to be merged through? Mine?


>  28 files changed, 1963 insertions(+), 907 deletions(-)
>  create mode 100644 docs/microvm.rst
>  create mode 100644 include/hw/i386/microvm.h
>  create mode 100644 include/hw/i386/x86.h
>  create mode 100644 include/hw/virtio/virtio-mmio.h
>  create mode 100644 hw/i386/microvm.c
>  create mode 100644 hw/i386/x86.c
>  create mode 100755 pc-bios/bios-microvm.bin
>  create mode 160000 roms/qboot
> 
> -- 
> 2.21.0



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (10 preceding siblings ...)
  2019-10-04 13:57 ` [PATCH v6 00/10] " Michael S. Tsirkin
@ 2019-10-04 17:21 ` no-reply
  2019-10-08 12:37   ` Paolo Bonzini
  2019-10-04 17:22 ` no-reply
  2019-10-05 22:08 ` Michael S. Tsirkin
  13 siblings, 1 reply; 31+ messages in thread
From: no-reply @ 2019-10-04 17:21 UTC (permalink / raw)
  To: slp
  Cc: ehabkost, slp, mst, philmd, qemu-devel, kraxel, imammedo,
	pbonzini, rth, lersek, sgarzare

Patchew URL: https://patchew.org/QEMU/20191004093752.16564-1-slp@redhat.com/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

qemu-system-x86_64: missing kernel image file name, required by microvm
Broken pipe
/tmp/qemu-test/src/tests/libqtest.c:140: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
ERROR - too few tests run (expected 7, got 4)
make: *** [check-qtest-x86_64] Error 1
make: *** Waiting for unfinished jobs....
  TEST    iotest-qcow2: 159
  TEST    iotest-qcow2: 161
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=479a42ee8d764327bfb3924069c8e2dc', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-0dfb_dst/src/docker-src.2019-10-04-13.10.50.4894:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=479a42ee8d764327bfb3924069c8e2dc
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-0dfb_dst/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    10m42.831s
user    0m8.500s


The full log is available at
http://patchew.org/logs/20191004093752.16564-1-slp@redhat.com/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com


* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (11 preceding siblings ...)
  2019-10-04 17:21 ` no-reply
@ 2019-10-04 17:22 ` no-reply
  2019-10-05 22:08 ` Michael S. Tsirkin
  13 siblings, 0 replies; 31+ messages in thread
From: no-reply @ 2019-10-04 17:22 UTC (permalink / raw)
  To: slp
  Cc: ehabkost, slp, mst, philmd, qemu-devel, kraxel, imammedo,
	pbonzini, rth, lersek, sgarzare

Patchew URL: https://patchew.org/QEMU/20191004093752.16564-1-slp@redhat.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Type: series
Message-id: 20191004093752.16564-1-slp@redhat.com
Subject: [PATCH v6 00/10] Introduce the microvm machine type

=== TEST SCRIPT BEGIN ===
#!/bin/bash
git rev-parse base > /dev/null || exit 0
git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram
./scripts/checkpatch.pl --mailback base..
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
From https://github.com/patchew-project/qemu
 * [new tag]         patchew/20191004171204.21040-1-eric.auger@redhat.com -> patchew/20191004171204.21040-1-eric.auger@redhat.com
Switched to a new branch 'test'
82de93f hw/i386: Introduce the microvm machine type
fda0032 docs/microvm.rst: document the new microvm machine type
8dc483d roms: add microvm-bios (qboot) as binary and git submodule
16f12bc hw/intc/apic: reject pic ints if isa_pic == NULL
22f8cab fw_cfg: add "modify" functions for all types
7cdaa3f hw/i386: make x86.c independent from PCMachineState
052084d hw/i386: split PCMachineState deriving X86MachineState from it
afc0d5a hw/i386/pc: move shared x86 functions to x86.c and export them
9c1dc683 hw/i386/pc: rename functions shared with non-PC machines
bd6947a hw/virtio: Factorize virtio-mmio headers

=== OUTPUT BEGIN ===
1/10 Checking commit bd6947a2e366 (hw/virtio: Factorize virtio-mmio headers)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#77: 
new file mode 100644

total: 0 errors, 1 warnings, 131 lines checked

Patch 1/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
2/10 Checking commit 9c1dc683f829 (hw/i386/pc: rename functions shared with non-PC machines)
3/10 Checking commit afc0d5a54977 (hw/i386/pc: move shared x86 functions to x86.c and export them)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#749: 
new file mode 100644

WARNING: Block comments use a leading /* on a separate line
#809: FILE: hw/i386/x86.c:56:
+/* Calculates initial APIC ID for a specific CPU index

WARNING: Block comments use a leading /* on a separate line
#866: FILE: hw/i386/x86.c:113:
+    /* Calculates the limit to CPU APIC ID values

WARNING: Block comments should align the * on each line
#913: FILE: hw/i386/x86.c:160:
+         * -smp hasn't been parsed after it
+        */

WARNING: line over 80 characters
#926: FILE: hw/i386/x86.c:173:
+        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);

ERROR: spaces required around that '+' (ctx:VxV)
#1087: FILE: hw/i386/x86.c:334:
+    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
                                           ^

ERROR: do not use assignment in if condition
#1091: FILE: hw/i386/x86.c:338:
+    if (!f || !(kernel_size = get_file_size(f)) ||

ERROR: if this code is redundant consider removing it
#1100: FILE: hw/i386/x86.c:347:
+#if 0

ERROR: spaces required around that '+' (ctx:VxV)
#1101: FILE: hw/i386/x86.c:348:
+    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
                                                        ^

ERROR: spaces required around that '+' (ctx:VxV)
#1103: FILE: hw/i386/x86.c:350:
+    if (ldl_p(header+0x202) == 0x53726448) {
                     ^

ERROR: spaces required around that '+' (ctx:VxV)
#1104: FILE: hw/i386/x86.c:351:
+        protocol = lduw_p(header+0x206);
                                 ^

ERROR: if this code is redundant consider removing it
#1194: FILE: hw/i386/x86.c:441:
+#if 0

ERROR: spaces required around that '+' (ctx:VxV)
#1206: FILE: hw/i386/x86.c:453:
+        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
                      ^

ERROR: spaces required around that '+' (ctx:VxV)
#1225: FILE: hw/i386/x86.c:472:
+        initrd_max = ldl_p(header+0x22c);
                                  ^

ERROR: spaces required around that '+' (ctx:VxV)
#1235: FILE: hw/i386/x86.c:482:
+    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
                                                                       ^

ERROR: spaces required around that '+' (ctx:VxV)
#1239: FILE: hw/i386/x86.c:486:
+        stl_p(header+0x228, cmdline_addr);
                     ^

ERROR: spaces required around that '+' (ctx:VxV)
#1241: FILE: hw/i386/x86.c:488:
+        stw_p(header+0x20, 0xA33F);
                     ^

ERROR: spaces required around that '+' (ctx:VxV)
#1242: FILE: hw/i386/x86.c:489:
+        stw_p(header+0x22, cmdline_addr-real_addr);
                     ^

ERROR: spaces required around that '-' (ctx:VxV)
#1242: FILE: hw/i386/x86.c:489:
+        stw_p(header+0x22, cmdline_addr-real_addr);
                                        ^

ERROR: consider using qemu_strtol in preference to strtol
#1258: FILE: hw/i386/x86.c:505:
+            video_mode = strtol(vmode, NULL, 0);

ERROR: spaces required around that '+' (ctx:VxV)
#1260: FILE: hw/i386/x86.c:507:
+        stw_p(header+0x1fa, video_mode);
                     ^

WARNING: Block comments use a leading /* on a separate line
#1264: FILE: hw/i386/x86.c:511:
+    /* High nybble = B reserved for QEMU; low nybble is revision number.

WARNING: Block comments use * on subsequent lines
#1265: FILE: hw/i386/x86.c:512:
+    /* High nybble = B reserved for QEMU; low nybble is revision number.
+       If this code is substantially changed, you may want to consider

WARNING: Block comments use a trailing */ on a separate line
#1266: FILE: hw/i386/x86.c:513:
+       incrementing the revision. */

ERROR: code indent should never use tabs
#1272: FILE: hw/i386/x86.c:519:
+        header[0x211] |= 0x80;^I/* CAN_USE_HEAP */$

ERROR: spaces required around that '+' (ctx:VxV)
#1273: FILE: hw/i386/x86.c:520:
+        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
                     ^

ERROR: spaces required around that '-' (ctx:VxV)
#1273: FILE: hw/i386/x86.c:520:
+        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
                                         ^

ERROR: spaces required around that '-' (ctx:VxV)
#1273: FILE: hw/i386/x86.c:520:
+        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
                                                   ^

ERROR: spaces required around that '-' (ctx:VxV)
#1305: FILE: hw/i386/x86.c:552:
+        initrd_addr = (initrd_max-initrd_size) & ~4095;
                                  ^

ERROR: spaces required around that '+' (ctx:VxV)
#1311: FILE: hw/i386/x86.c:558:
+        stl_p(header+0x218, initrd_addr);
                     ^

ERROR: spaces required around that '+' (ctx:VxV)
#1312: FILE: hw/i386/x86.c:559:
+        stl_p(header+0x21c, initrd_size);
                     ^

ERROR: spaces required around that '+' (ctx:VxV)
#1320: FILE: hw/i386/x86.c:567:
+    setup_size = (setup_size+1)*512;
                             ^

ERROR: spaces required around that '*' (ctx:VxV)
#1320: FILE: hw/i386/x86.c:567:
+    setup_size = (setup_size+1)*512;
                                ^

ERROR: spaces required around that '+' (ctx:VxV)
#1358: FILE: hw/i386/x86.c:605:
+        stq_p(header+0x250, prot_addr + setup_data_offset);
                     ^

total: 26 errors, 8 warnings, 1430 lines checked

Patch 3/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

4/10 Checking commit 052084dca6ed (hw/i386: split PCMachineState deriving X86MachineState from it)
WARNING: Block comments use a leading /* on a separate line
#880: FILE: hw/i386/pc_q35.c:158:
+        x86ms->max_ram_below_4g = 1ULL << 32; /* default: 4G */;

WARNING: line over 80 characters
#1103: FILE: hw/i386/x86.c:420:
+                initrd_max = x86ms->below_4g_mem_size - pcmc->acpi_data_size - 1;

WARNING: line over 80 characters
#1232: FILE: hw/i386/xen/xen-hvm.c:204:
+                                                    X86_MACHINE_MAX_RAM_BELOW_4G,

WARNING: Block comments use a leading /* on a separate line
#1435: FILE: include/hw/i386/x86.h:61:
+    /* Address space used by IOAPIC device. All IOAPIC interrupts

WARNING: Block comments use a trailing */ on a separate line
#1436: FILE: include/hw/i386/x86.h:62:
+     * will be translated to MSI messages in the address space. */

total: 0 errors, 5 warnings, 1296 lines checked

Patch 4/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
5/10 Checking commit 7cdaa3f2e445 (hw/i386: make x86.c independent from PCMachineState)
WARNING: line over 80 characters
#178: FILE: hw/i386/x86.c:173:
+        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(x86ms, i);

total: 0 errors, 1 warnings, 220 lines checked

Patch 5/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
6/10 Checking commit 22f8cab11248 (fw_cfg: add "modify" functions for all types)
7/10 Checking commit 16f12bca2dce (hw/intc/apic: reject pic ints if isa_pic == NULL)
8/10 Checking commit 8dc483dafc50 (roms: add microvm-bios (qboot) as binary and git submodule)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#29: 
new file mode 100755

total: 0 errors, 1 warnings, 28 lines checked

Patch 8/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
9/10 Checking commit fda00320f641 (docs/microvm.rst: document the new microvm machine type)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#13: 
new file mode 100644

total: 0 errors, 1 warnings, 98 lines checked

Patch 9/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
10/10 Checking commit 82de93f5898c (hw/i386: Introduce the microvm machine type)
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#55: 
new file mode 100644

total: 0 errors, 1 warnings, 678 lines checked

Patch 10/10 has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
=== OUTPUT END ===

Test command exited with code: 1
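
Most of the errors reported for patch 3/10 are operator-spacing complaints on
arithmetic moved verbatim from pc.c. As a minimal sketch, and not the code
from the series, the flagged computations could be written with the spacing
checkpatch asks for as shown below. The helper names (setup_size_from_sects,
initrd_load_addr, cmdline_buf_size) are hypothetical and exist only for this
illustration; the arithmetic itself is taken from the flagged lines.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Size in bytes of the real-mode setup code, from the sector count
 * stored at header[0x1f1]. A count of 0 is treated as 4, as in the
 * quoted code. (Hypothetical helper, for illustration only.)
 */
static uint32_t setup_size_from_sects(uint8_t setup_sects)
{
    uint32_t sects = setup_sects == 0 ? 4 : setup_sects;

    return (sects + 1) * 512;
}

/*
 * Load address for an initrd of initrd_size bytes placed as high as
 * possible below initrd_max, rounded down to a 4 KiB boundary.
 * (Hypothetical helper, for illustration only.)
 */
static uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    return (initrd_max - initrd_size) & ~4095u;
}

/*
 * Command line buffer size using the same rounding as the quoted code:
 * (strlen + 16) & ~15. (Hypothetical helper, for illustration only.)
 */
static size_t cmdline_buf_size(const char *cmdline)
{
    return (strlen(cmdline) + 16) & ~(size_t)15;
}
```

With these helpers, a sector count of 0 yields 2560 bytes (5 sectors), and a
13-character command line gets a 16-byte buffer. Only whitespace and naming
differ from the flagged expressions; the results are the same.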


The full log is available at
http://patchew.org/logs/20191004093752.16564-1-slp@redhat.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
                   ` (12 preceding siblings ...)
  2019-10-04 17:22 ` no-reply
@ 2019-10-05 22:08 ` Michael S. Tsirkin
  2019-10-07 13:44   ` Sergio Lopez
  13 siblings, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2019-10-05 22:08 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
> Microvm is a machine type inspired by Firecracker and constructed
> after its machine model.
> 
> It's a minimalist machine type without PCI or ACPI support, designed
> for short-lived guests. Microvm also establishes a baseline for
> benchmarking and optimizing both QEMU and guest operating systems,
> since it is optimized for both boot time and footprint.

Please take a look at the patchew warnings and errors.
Both the coding style issues and the test failures need to be
addressed somehow, I think.

> ---
> 
> Changelog
> v6:
>  - Some style fixes (Philippe Mathieu-Daudé)
>  - Fix a documentation bug stating that LAPIC was in userspace (Paolo
>    Bonzini)
>  - Update Xen HVM code after X86MachineState introduction (Philippe
>    Mathieu-Daudé)
>  - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
>    (Philippe Mathieu-Daudé)
> 
> v5:
>  - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
>    functions" (Philippe Mathieu-Daudé)
>  - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
>    functions" (Stefano Garzarella)
>  - Split X86MachineState introduction into smaller patches (Philippe
>    Mathieu-Daudé)
>  - Change option-roms to x-option-roms and kernel-cmdline to
>    auto-kernel-cmdline (Paolo Bonzini)
>  - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
>  - Some fixes to the documentation (Paolo Bonzini)
>  - Switch documentation format from txt to rst (Peter Maydell)
>  - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
>    Bonzini)
> 
> v4:
>  - This is a complete rewrite of the whole patchset, with a focus on
>    reusing as much existing code as possible to ease the maintenance burden
>    and making the machine type as compatible as possible by default. As
>    a result, the number of lines dedicated specifically to microvm is
>    383 (code lines measured by "cloc") and, with the default
>    configuration, it's now able to boot both PVH ELF images and
>    bzImages with either SeaBIOS or qboot.
> 
> v3:
>   - Add initrd support (thanks Stefano).
> 
> v2:
>   - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
>   - Simplify machine definition (thanks Eduardo).
>   - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
>   - Add a patch to factorize PVH-related functions.
>   - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).
> 
> ---
> Sergio Lopez (10):
>   hw/virtio: Factorize virtio-mmio headers
>   hw/i386/pc: rename functions shared with non-PC machines
>   hw/i386/pc: move shared x86 functions to x86.c and export them
>   hw/i386: split PCMachineState deriving X86MachineState from it
>   hw/i386: make x86.c independent from PCMachineState
>   fw_cfg: add "modify" functions for all types
>   hw/intc/apic: reject pic ints if isa_pic == NULL
>   roms: add microvm-bios (qboot) as binary and git submodule
>   docs/microvm.rst: document the new microvm machine type
>   hw/i386: Introduce the microvm machine type
> 
>  docs/microvm.rst                 |  98 ++++
>  default-configs/i386-softmmu.mak |   1 +
>  include/hw/i386/microvm.h        |  83 ++++
>  include/hw/i386/pc.h             |  28 +-
>  include/hw/i386/x86.h            |  94 ++++
>  include/hw/nvram/fw_cfg.h        |  42 ++
>  include/hw/virtio/virtio-mmio.h  |  73 +++
>  hw/acpi/cpu_hotplug.c            |  10 +-
>  hw/i386/acpi-build.c             |  29 +-
>  hw/i386/amd_iommu.c              |   3 +-
>  hw/i386/intel_iommu.c            |   3 +-
>  hw/i386/microvm.c                | 574 ++++++++++++++++++++++
>  hw/i386/pc.c                     | 780 +++---------------------------
>  hw/i386/pc_piix.c                |  46 +-
>  hw/i386/pc_q35.c                 |  38 +-
>  hw/i386/pc_sysfw.c               |  58 +--
>  hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
>  hw/i386/xen/xen-hvm.c            |  23 +-
>  hw/intc/apic.c                   |   2 +-
>  hw/intc/ioapic.c                 |   2 +-
>  hw/nvram/fw_cfg.c                |  29 ++
>  hw/virtio/virtio-mmio.c          |  48 +-
>  .gitmodules                      |   3 +
>  hw/i386/Kconfig                  |   4 +
>  hw/i386/Makefile.objs            |   2 +
>  pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
>  roms/Makefile                    |   6 +
>  roms/qboot                       |   1 +
>  28 files changed, 1963 insertions(+), 907 deletions(-)
>  create mode 100644 docs/microvm.rst
>  create mode 100644 include/hw/i386/microvm.h
>  create mode 100644 include/hw/i386/x86.h
>  create mode 100644 include/hw/virtio/virtio-mmio.h
>  create mode 100644 hw/i386/microvm.c
>  create mode 100644 hw/i386/x86.c
>  create mode 100755 pc-bios/bios-microvm.bin
>  create mode 160000 roms/qboot
> 
> -- 
> 2.21.0


* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-05 22:08 ` Michael S. Tsirkin
@ 2019-10-07 13:44   ` Sergio Lopez
  2019-10-07 13:59     ` Philippe Mathieu-Daudé
  2019-10-09 19:21     ` Michael S. Tsirkin
  0 siblings, 2 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-07 13:44 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: ehabkost, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

Michael S. Tsirkin <mst@redhat.com> writes:

> On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
>> Microvm is a machine type inspired by Firecracker and constructed
>> after its machine model.
>> 
>> It's a minimalist machine type without PCI or ACPI support, designed
>> for short-lived guests. Microvm also establishes a baseline for
>> benchmarking and optimizing both QEMU and guest operating systems,
>> since it is optimized for both boot time and footprint.
>
> Please take a look at the patchew warnings and errors.
> Both the coding style issues and the test failures need to be
> addressed somehow, I think.

I've fixed the issue with the test suite, but I'm not sure what to do
about the coding style errors. Every one of them (except perhaps one in
xen-hvm.c) comes from code I've moved from pc.c to x86.c. I'd say fixing
those is outside the scope of the corresponding patches, but please
correct me if I'm wrong.
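
For reference, the one over-80-column warning reported for patch 5/10
(hw/i386/x86.c:173) could be silenced by wrapping the assignment after
the '='. The following is only a minimal sketch of that wrap: the struct
definitions and the stub body of x86_cpu_apic_id_from_index() are
simplified stand-ins invented for illustration (not QEMU's real
definitions), and fill_arch_ids() is a hypothetical helper name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in types: hypothetical and far simpler than QEMU's real ones. */
typedef struct { uint64_t arch_id; } CPUArchId;
typedef struct { int len; CPUArchId cpus[4]; } CPUArchIdList;
typedef struct { CPUArchIdList *possible_cpus; } MachineState;
typedef struct { int unused; } X86MachineState;

/* Stub: the real function derives the APIC ID from the CPU topology. */
static uint32_t x86_cpu_apic_id_from_index(X86MachineState *x86ms,
                                           unsigned int cpu_index)
{
    (void)x86ms;
    return cpu_index;
}

/* Hypothetical helper that only exists to host the wrapped statement. */
static void fill_arch_ids(MachineState *ms, X86MachineState *x86ms)
{
    int i;

    assert(ms->possible_cpus != NULL);
    for (i = 0; i < ms->possible_cpus->len; i++) {
        /* Wrapped after '=' so the statement stays under 80 columns. */
        ms->possible_cpus->cpus[i].arch_id =
            x86_cpu_apic_id_from_index(x86ms, i);
    }
}
```

The wrap does not change behavior, so whether it belongs in the
code-movement patch or in a separate cleanup is still a judgment call.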

On the other hand, I haven't touched MAINTAINERS, because I'm not sure
about the actual policies that apply while doing so. Should I add the
new files to it?

Thanks,
Sergio.

>> ---
>> 
>> Changelog
>> v6:
>>  - Some style fixes (Philippe Mathieu-Daudé)
>>  - Fix a documentation bug stating that LAPIC was in userspace (Paolo
>>    Bonzini)
>>  - Update Xen HVM code after X86MachineState introduction (Philippe
>>    Mathieu-Daudé)
>>  - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
>>    (Philippe Mathieu-Daudé)
>> 
>> v5:
>>  - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
>>    functions" (Philippe Mathieu-Daudé)
>>  - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
>>    functions" (Stefano Garzarella)
>>  - Split X86MachineState introduction into smaller patches (Philippe
>>    Mathieu-Daudé)
>>  - Change option-roms to x-option-roms and kernel-cmdline to
>>    auto-kernel-cmdline (Paolo Bonzini)
>>  - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
>>  - Some fixes to the documentation (Paolo Bonzini)
>>  - Switch documentation format from txt to rst (Peter Maydell)
>>  - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
>>    Bonzini)
>> 
>> v4:
>>  - This is a complete rewrite of the whole patchset, with a focus on
>>    reusing as much existing code as possible to ease the maintenance burden
>>    and making the machine type as compatible as possible by default. As
>>    a result, the number of lines dedicated specifically to microvm is
>>    383 (code lines measured by "cloc") and, with the default
>>    configuration, it's now able to boot both PVH ELF images and
>>    bzImages with either SeaBIOS or qboot.
>> 
>> v3:
>>   - Add initrd support (thanks Stefano).
>> 
>> v2:
>>   - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
>>   - Simplify machine definition (thanks Eduardo).
>>   - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
>>   - Add a patch to factorize PVH-related functions.
>>   - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).
>> 
>> ---
>> Sergio Lopez (10):
>>   hw/virtio: Factorize virtio-mmio headers
>>   hw/i386/pc: rename functions shared with non-PC machines
>>   hw/i386/pc: move shared x86 functions to x86.c and export them
>>   hw/i386: split PCMachineState deriving X86MachineState from it
>>   hw/i386: make x86.c independent from PCMachineState
>>   fw_cfg: add "modify" functions for all types
>>   hw/intc/apic: reject pic ints if isa_pic == NULL
>>   roms: add microvm-bios (qboot) as binary and git submodule
>>   docs/microvm.rst: document the new microvm machine type
>>   hw/i386: Introduce the microvm machine type
>> 
>>  docs/microvm.rst                 |  98 ++++
>>  default-configs/i386-softmmu.mak |   1 +
>>  include/hw/i386/microvm.h        |  83 ++++
>>  include/hw/i386/pc.h             |  28 +-
>>  include/hw/i386/x86.h            |  94 ++++
>>  include/hw/nvram/fw_cfg.h        |  42 ++
>>  include/hw/virtio/virtio-mmio.h  |  73 +++
>>  hw/acpi/cpu_hotplug.c            |  10 +-
>>  hw/i386/acpi-build.c             |  29 +-
>>  hw/i386/amd_iommu.c              |   3 +-
>>  hw/i386/intel_iommu.c            |   3 +-
>>  hw/i386/microvm.c                | 574 ++++++++++++++++++++++
>>  hw/i386/pc.c                     | 780 +++---------------------------
>>  hw/i386/pc_piix.c                |  46 +-
>>  hw/i386/pc_q35.c                 |  38 +-
>>  hw/i386/pc_sysfw.c               |  58 +--
>>  hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
>>  hw/i386/xen/xen-hvm.c            |  23 +-
>>  hw/intc/apic.c                   |   2 +-
>>  hw/intc/ioapic.c                 |   2 +-
>>  hw/nvram/fw_cfg.c                |  29 ++
>>  hw/virtio/virtio-mmio.c          |  48 +-
>>  .gitmodules                      |   3 +
>>  hw/i386/Kconfig                  |   4 +
>>  hw/i386/Makefile.objs            |   2 +
>>  pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
>>  roms/Makefile                    |   6 +
>>  roms/qboot                       |   1 +
>>  28 files changed, 1963 insertions(+), 907 deletions(-)
>>  create mode 100644 docs/microvm.rst
>>  create mode 100644 include/hw/i386/microvm.h
>>  create mode 100644 include/hw/i386/x86.h
>>  create mode 100644 include/hw/virtio/virtio-mmio.h
>>  create mode 100644 hw/i386/microvm.c
>>  create mode 100644 hw/i386/x86.c
>>  create mode 100755 pc-bios/bios-microvm.bin
>>  create mode 160000 roms/qboot
>> 
>> -- 
>> 2.21.0


* Re: [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them
  2019-10-04 11:36   ` Stefano Garzarella
@ 2019-10-07 13:46     ` Sergio Lopez
  0 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-07 13:46 UTC (permalink / raw)
  To: Stefano Garzarella
  Cc: ehabkost, mst, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	philmd, rth

Stefano Garzarella <sgarzare@redhat.com> writes:

> On Fri, Oct 04, 2019 at 11:37:45AM +0200, Sergio Lopez wrote:
>> Move x86 functions that will be shared between PC and non-PC machine
>> types to x86.c, along with their helpers.
>> 
>> Signed-off-by: Sergio Lopez <slp@redhat.com>
>> ---
>>  include/hw/i386/pc.h  |   1 -
>>  include/hw/i386/x86.h |  35 +++
>>  hw/i386/pc.c          | 582 +----------------------------------
>>  hw/i386/pc_piix.c     |   1 +
>>  hw/i386/pc_q35.c      |   1 +
>>  hw/i386/pc_sysfw.c    |  54 +---
>>  hw/i386/x86.c         | 684 ++++++++++++++++++++++++++++++++++++++++++
>>  hw/i386/Makefile.objs |   1 +
>>  8 files changed, 724 insertions(+), 635 deletions(-)
>>  create mode 100644 include/hw/i386/x86.h
>>  create mode 100644 hw/i386/x86.c
>> 
>> diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
>> index d12f42e9e5..73e2847e87 100644
>> --- a/include/hw/i386/pc.h
>> +++ b/include/hw/i386/pc.h
>> @@ -195,7 +195,6 @@ bool pc_machine_is_smm_enabled(PCMachineState *pcms);
>>  void pc_register_ferr_irq(qemu_irq irq);
>>  void pc_acpi_smi_interrupt(void *opaque, int irq, int level);
>>  
>> -void x86_cpus_init(PCMachineState *pcms);
>>  void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp);
>>  void pc_smp_parse(MachineState *ms, QemuOpts *opts);
>>  
>> diff --git a/include/hw/i386/x86.h b/include/hw/i386/x86.h
>> new file mode 100644
>> index 0000000000..71e2b6985d
>> --- /dev/null
>> +++ b/include/hw/i386/x86.h
>> @@ -0,0 +1,35 @@
>> +/*
>> + * Copyright (c) 2019 Red Hat, Inc.
>> + *
>> + * This program is free software; you can redistribute it and/or modify it
>> + * under the terms and conditions of the GNU General Public License,
>> + * version 2 or later, as published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope it will be useful, but WITHOUT
>> + * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
>> + * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
>> + * more details.
>> + *
>> + * You should have received a copy of the GNU General Public License along with
>> + * this program.  If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#ifndef HW_I386_X86_H
>> +#define HW_I386_X86_H
>> +
>> +#include "hw/boards.h"
>> +
>> +uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>> +                                    unsigned int cpu_index);
>> +void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp);
>> +void x86_cpus_init(PCMachineState *pcms);
>> +CpuInstanceProperties x86_cpu_index_to_props(MachineState *ms,
>> +                                             unsigned cpu_index);
>> +int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx);
>> +const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms);
>> +
>> +void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw);
>> +
>> +void x86_load_linux(PCMachineState *x86ms, FWCfgState *fw_cfg);
>> +
>> +#endif
>> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
>> index fd08c6704b..094db79fb0 100644
>> --- a/hw/i386/pc.c
>> +++ b/hw/i386/pc.c
>> @@ -24,6 +24,7 @@
>>  
>>  #include "qemu/osdep.h"
>>  #include "qemu/units.h"
>> +#include "hw/i386/x86.h"
>>  #include "hw/i386/pc.h"
>>  #include "hw/char/serial.h"
>>  #include "hw/char/parallel.h"
>> @@ -102,9 +103,6 @@
>>  
>>  struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
>>  
>> -/* Physical Address of PVH entry point read from kernel ELF NOTE */
>> -static size_t pvh_start_addr;
>> -
>>  GlobalProperty pc_compat_4_1[] = {};
>>  const size_t pc_compat_4_1_len = G_N_ELEMENTS(pc_compat_4_1);
>>  
>> @@ -866,478 +864,6 @@ static void handle_a20_line_change(void *opaque, int irq, int level)
>>      x86_cpu_set_a20(cpu, level);
>>  }
>>  
>> -/* Calculates initial APIC ID for a specific CPU index
>> - *
>> - * Currently we need to be able to calculate the APIC ID from the CPU index
>> - * alone (without requiring a CPU object), as the QEMU<->Seabios interfaces have
>> - * no concept of "CPU index", and the NUMA tables on fw_cfg need the APIC ID of
>> - * all CPUs up to max_cpus.
>> - */
>> -static uint32_t x86_cpu_apic_id_from_index(PCMachineState *pcms,
>> -                                           unsigned int cpu_index)
>> -{
>> -    MachineState *ms = MACHINE(pcms);
>> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>> -    uint32_t correct_id;
>> -    static bool warned;
>> -
>> -    correct_id = x86_apicid_from_cpu_idx(pcms->smp_dies, ms->smp.cores,
>> -                                         ms->smp.threads, cpu_index);
>> -    if (pcmc->compat_apic_id_mode) {
>> -        if (cpu_index != correct_id && !warned && !qtest_enabled()) {
>> -            error_report("APIC IDs set in compatibility mode, "
>> -                         "CPU topology won't match the configuration");
>> -            warned = true;
>> -        }
>> -        return cpu_index;
>> -    } else {
>> -        return correct_id;
>> -    }
>> -}
>> -
>> -static long get_file_size(FILE *f)
>> -{
>> -    long where, size;
>> -
>> -    /* XXX: on Unix systems, using fstat() probably makes more sense */
>> -
>> -    where = ftell(f);
>> -    fseek(f, 0, SEEK_END);
>> -    size = ftell(f);
>> -    fseek(f, where, SEEK_SET);
>> -
>> -    return size;
>> -}
>> -
>> -struct setup_data {
>> -    uint64_t next;
>> -    uint32_t type;
>> -    uint32_t len;
>> -    uint8_t data[0];
>> -} __attribute__((packed));
>> -
>> -
>> -/*
>> - * The entry point into the kernel for PVH boot is different from
>> - * the native entry point.  The PVH entry is defined by the x86/HVM
>> - * direct boot ABI and is available in an ELFNOTE in the kernel binary.
>> - *
>> - * This function is passed to load_elf() when it is called from
>> - * load_elfboot() which then additionally checks for an ELF Note of
>> - * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
>> - * parse the PVH entry address from the ELF Note.
>> - *
>> - * Due to trickery in elf_opts.h, load_elf() is actually available as
>> - * load_elf32() or load_elf64() and this routine needs to be able
>> - * to deal with being called as 32 or 64 bit.
>> - *
>> - * The address of the PVH entry point is saved to the 'pvh_start_addr'
>> - * global variable.  (although the entry point is 32-bit, the kernel
>> - * binary can be either 32-bit or 64-bit).
>> - */
>> -static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
>> -{
>> -    size_t *elf_note_data_addr;
>> -
>> -    /* Check if ELF Note header passed in is valid */
>> -    if (arg1 == NULL) {
>> -        return 0;
>> -    }
>> -
>> -    if (is64) {
>> -        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
>> -        uint64_t nhdr_size64 = sizeof(struct elf64_note);
>> -        uint64_t phdr_align = *(uint64_t *)arg2;
>> -        uint64_t nhdr_namesz = nhdr64->n_namesz;
>> -
>> -        elf_note_data_addr =
>> -            ((void *)nhdr64) + nhdr_size64 +
>> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
>> -    } else {
>> -        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
>> -        uint32_t nhdr_size32 = sizeof(struct elf32_note);
>> -        uint32_t phdr_align = *(uint32_t *)arg2;
>> -        uint32_t nhdr_namesz = nhdr32->n_namesz;
>> -
>> -        elf_note_data_addr =
>> -            ((void *)nhdr32) + nhdr_size32 +
>> -            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
>> -    }
>> -
>> -    pvh_start_addr = *elf_note_data_addr;
>> -
>> -    return pvh_start_addr;
>> -}
>> -
>> -static bool load_elfboot(const char *kernel_filename,
>> -                   int kernel_file_size,
>> -                   uint8_t *header,
>> -                   size_t pvh_xen_start_addr,
>> -                   FWCfgState *fw_cfg)
>> -{
>> -    uint32_t flags = 0;
>> -    uint32_t mh_load_addr = 0;
>> -    uint32_t elf_kernel_size = 0;
>> -    uint64_t elf_entry;
>> -    uint64_t elf_low, elf_high;
>> -    int kernel_size;
>> -
>> -    if (ldl_p(header) != 0x464c457f) {
>> -        return false; /* no elfboot */
>> -    }
>> -
>> -    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
>> -    flags = elf_is64 ?
>> -        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
>> -
>> -    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
>> -        error_report("elfboot unsupported flags = %x", flags);
>> -        exit(1);
>> -    }
>> -
>> -    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
>> -    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
>> -                           NULL, &elf_note_type, &elf_entry,
>> -                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
>> -                           0, 0);
>> -
>> -    if (kernel_size < 0) {
>> -        error_report("Error while loading elf kernel");
>> -        exit(1);
>> -    }
>> -    mh_load_addr = elf_low;
>> -    elf_kernel_size = elf_high - elf_low;
>> -
>> -    if (pvh_start_addr == 0) {
>> -        error_report("Error loading uncompressed kernel without PVH ELF Note");
>> -        exit(1);
>> -    }
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
>> -
>> -    return true;
>> -}
>> -
>> -static void x86_load_linux(PCMachineState *pcms,
>> -                           FWCfgState *fw_cfg)
>> -{
>> -    uint16_t protocol;
>> -    int setup_size, kernel_size, cmdline_size;
>> -    int dtb_size, setup_data_offset;
>> -    uint32_t initrd_max;
>> -    uint8_t header[8192], *setup, *kernel;
>> -    hwaddr real_addr, prot_addr, cmdline_addr, initrd_addr = 0;
>> -    FILE *f;
>> -    char *vmode;
>> -    MachineState *machine = MACHINE(pcms);
>> -    PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
>> -    struct setup_data *setup_data;
>> -    const char *kernel_filename = machine->kernel_filename;
>> -    const char *initrd_filename = machine->initrd_filename;
>> -    const char *dtb_filename = machine->dtb;
>> -    const char *kernel_cmdline = machine->kernel_cmdline;
>> -
>> -    /* Align to 16 bytes as a paranoia measure */
>> -    cmdline_size = (strlen(kernel_cmdline)+16) & ~15;
>> -
>> -    /* load the kernel header */
>> -    f = fopen(kernel_filename, "rb");
>> -    if (!f || !(kernel_size = get_file_size(f)) ||
>> -        fread(header, 1, MIN(ARRAY_SIZE(header), kernel_size), f) !=
>> -        MIN(ARRAY_SIZE(header), kernel_size)) {
>> -        fprintf(stderr, "qemu: could not load kernel '%s': %s\n",
>> -                kernel_filename, strerror(errno));
>> -        exit(1);
>> -    }
>> -
>> -    /* kernel protocol version */
>> -#if 0
>> -    fprintf(stderr, "header magic: %#x\n", ldl_p(header+0x202));
>> -#endif
>> -    if (ldl_p(header+0x202) == 0x53726448) {
>> -        protocol = lduw_p(header+0x206);
>> -    } else {
>> -        /*
>> -         * This could be a multiboot kernel. If it is, let's stop treating it
>> -         * like a Linux kernel.
>> -         * Note: some multiboot images could be in the ELF format (the same of
>> -         * PVH), so we try multiboot first since we check the multiboot magic
>> -         * header before to load it.
>> -         */
>> -        if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
>> -                           kernel_cmdline, kernel_size, header)) {
>> -            return;
>> -        }
>> -        /*
>> -         * Check if the file is an uncompressed kernel file (ELF) and load it,
>> -         * saving the PVH entry point used by the x86/HVM direct boot ABI.
>> -         * If load_elfboot() is successful, populate the fw_cfg info.
>> -         */
>> -        if (pcmc->pvh_enabled &&
>> -            load_elfboot(kernel_filename, kernel_size,
>> -                         header, pvh_start_addr, fw_cfg)) {
>> -            fclose(f);
>> -
>> -            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
>> -                strlen(kernel_cmdline) + 1);
>> -            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
>> -
>> -            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
>> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
>> -                             header, sizeof(header));
>> -
>> -            /* load initrd */
>> -            if (initrd_filename) {
>> -                GMappedFile *mapped_file;
>> -                gsize initrd_size;
>> -                gchar *initrd_data;
>> -                GError *gerr = NULL;
>> -
>> -                mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
>> -                if (!mapped_file) {
>> -                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
>> -                            initrd_filename, gerr->message);
>> -                    exit(1);
>> -                }
>> -                pcms->initrd_mapped_file = mapped_file;
>> -
>> -                initrd_data = g_mapped_file_get_contents(mapped_file);
>> -                initrd_size = g_mapped_file_get_length(mapped_file);
>> -                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>> -                if (initrd_size >= initrd_max) {
>> -                    fprintf(stderr, "qemu: initrd is too large, cannot support."
>> -                            "(max: %"PRIu32", need %"PRId64")\n",
>> -                            initrd_max, (uint64_t)initrd_size);
>> -                    exit(1);
>> -                }
>> -
>> -                initrd_addr = (initrd_max - initrd_size) & ~4095;
>> -
>> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>> -                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>> -                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
>> -                                 initrd_size);
>> -            }
>> -
>> -            option_rom[nb_option_roms].bootindex = 0;
>> -            option_rom[nb_option_roms].name = "pvh.bin";
>> -            nb_option_roms++;
>> -
>> -            return;
>> -        }
>> -        protocol = 0;
>> -    }
>> -
>> -    if (protocol < 0x200 || !(header[0x211] & 0x01)) {
>> -        /* Low kernel */
>> -        real_addr    = 0x90000;
>> -        cmdline_addr = 0x9a000 - cmdline_size;
>> -        prot_addr    = 0x10000;
>> -    } else if (protocol < 0x202) {
>> -        /* High but ancient kernel */
>> -        real_addr    = 0x90000;
>> -        cmdline_addr = 0x9a000 - cmdline_size;
>> -        prot_addr    = 0x100000;
>> -    } else {
>> -        /* High and recent kernel */
>> -        real_addr    = 0x10000;
>> -        cmdline_addr = 0x20000;
>> -        prot_addr    = 0x100000;
>> -    }
>> -
>> -#if 0
>> -    fprintf(stderr,
>> -            "qemu: real_addr     = 0x" TARGET_FMT_plx "\n"
>> -            "qemu: cmdline_addr  = 0x" TARGET_FMT_plx "\n"
>> -            "qemu: prot_addr     = 0x" TARGET_FMT_plx "\n",
>> -            real_addr,
>> -            cmdline_addr,
>> -            prot_addr);
>> -#endif
>> -
>> -    /* highest address for loading the initrd */
>> -    if (protocol >= 0x20c &&
>> -        lduw_p(header+0x236) & XLF_CAN_BE_LOADED_ABOVE_4G) {
>> -        /*
>> -         * Linux has supported initrd up to 4 GB for a very long time (2007,
>> -         * long before XLF_CAN_BE_LOADED_ABOVE_4G which was added in 2013),
>> -         * though it only sets initrd_max to 2 GB to "work around bootloader
>> -         * bugs". Luckily, QEMU firmware(which does something like bootloader)
>> -         * has supported this.
>> -         *
>> -         * It's believed that if XLF_CAN_BE_LOADED_ABOVE_4G is set, initrd can
>> -         * be loaded into any address.
>> -         *
>> -         * In addition, initrd_max is uint32_t simply because QEMU doesn't
>> -         * support the 64-bit boot protocol (specifically the ext_ramdisk_image
>> -         * field).
>> -         *
>> -         * Therefore here just limit initrd_max to UINT32_MAX simply as well.
>> -         */
>> -        initrd_max = UINT32_MAX;
>> -    } else if (protocol >= 0x203) {
>> -        initrd_max = ldl_p(header+0x22c);
>> -    } else {
>> -        initrd_max = 0x37ffffff;
>> -    }
>> -
>> -    if (initrd_max >= pcms->below_4g_mem_size - pcmc->acpi_data_size) {
>> -        initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
>> -    }
>> -
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_ADDR, cmdline_addr);
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE, strlen(kernel_cmdline)+1);
>> -    fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
>> -
>> -    if (protocol >= 0x202) {
>> -        stl_p(header+0x228, cmdline_addr);
>> -    } else {
>> -        stw_p(header+0x20, 0xA33F);
>> -        stw_p(header+0x22, cmdline_addr-real_addr);
>> -    }
>> -
>> -    /* handle vga= parameter */
>> -    vmode = strstr(kernel_cmdline, "vga=");
>> -    if (vmode) {
>> -        unsigned int video_mode;
>> -        /* skip "vga=" */
>> -        vmode += 4;
>> -        if (!strncmp(vmode, "normal", 6)) {
>> -            video_mode = 0xffff;
>> -        } else if (!strncmp(vmode, "ext", 3)) {
>> -            video_mode = 0xfffe;
>> -        } else if (!strncmp(vmode, "ask", 3)) {
>> -            video_mode = 0xfffd;
>> -        } else {
>> -            video_mode = strtol(vmode, NULL, 0);
>> -        }
>> -        stw_p(header+0x1fa, video_mode);
>> -    }
>> -
>> -    /* loader type */
>> -    /* High nybble = B reserved for QEMU; low nybble is revision number.
>> -       If this code is substantially changed, you may want to consider
>> -       incrementing the revision. */
>> -    if (protocol >= 0x200) {
>> -        header[0x210] = 0xB0;
>> -    }
>> -    /* heap */
>> -    if (protocol >= 0x201) {
>> -        header[0x211] |= 0x80;	/* CAN_USE_HEAP */
>> -        stw_p(header+0x224, cmdline_addr-real_addr-0x200);
>> -    }
>> -
>> -    /* load initrd */
>> -    if (initrd_filename) {
>> -        GMappedFile *mapped_file;
>> -        gsize initrd_size;
>> -        gchar *initrd_data;
>> -        GError *gerr = NULL;
>> -
>> -        if (protocol < 0x200) {
>> -            fprintf(stderr, "qemu: linux kernel too old to load a ram disk\n");
>> -            exit(1);
>> -        }
>> -
>> -        mapped_file = g_mapped_file_new(initrd_filename, false, &gerr);
>> -        if (!mapped_file) {
>> -            fprintf(stderr, "qemu: error reading initrd %s: %s\n",
>> -                    initrd_filename, gerr->message);
>> -            exit(1);
>> -        }
>> -        pcms->initrd_mapped_file = mapped_file;
>> -
>> -        initrd_data = g_mapped_file_get_contents(mapped_file);
>> -        initrd_size = g_mapped_file_get_length(mapped_file);
>> -        if (initrd_size >= initrd_max) {
>> -            fprintf(stderr, "qemu: initrd is too large, cannot support."
>> -                    "(max: %"PRIu32", need %"PRId64")\n",
>> -                    initrd_max, (uint64_t)initrd_size);
>> -            exit(1);
>> -        }
>> -
>> -        initrd_addr = (initrd_max-initrd_size) & ~4095;
>> -
>> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>> -        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>> -        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
>> -
>> -        stl_p(header+0x218, initrd_addr);
>> -        stl_p(header+0x21c, initrd_size);
>> -    }
>> -
>> -    /* load kernel and setup */
>> -    setup_size = header[0x1f1];
>> -    if (setup_size == 0) {
>> -        setup_size = 4;
>> -    }
>> -    setup_size = (setup_size+1)*512;
>> -    if (setup_size > kernel_size) {
>> -        fprintf(stderr, "qemu: invalid kernel header\n");
>> -        exit(1);
>> -    }
>> -    kernel_size -= setup_size;
>> -
>> -    setup  = g_malloc(setup_size);
>> -    kernel = g_malloc(kernel_size);
>> -    fseek(f, 0, SEEK_SET);
>> -    if (fread(setup, 1, setup_size, f) != setup_size) {
>> -        fprintf(stderr, "fread() failed\n");
>> -        exit(1);
>> -    }
>> -    if (fread(kernel, 1, kernel_size, f) != kernel_size) {
>> -        fprintf(stderr, "fread() failed\n");
>> -        exit(1);
>> -    }
>> -    fclose(f);
>> -
>> -    /* append dtb to kernel */
>> -    if (dtb_filename) {
>> -        if (protocol < 0x209) {
>> -            fprintf(stderr, "qemu: Linux kernel too old to load a dtb\n");
>> -            exit(1);
>> -        }
>> -
>> -        dtb_size = get_image_size(dtb_filename);
>> -        if (dtb_size <= 0) {
>> -            fprintf(stderr, "qemu: error reading dtb %s: %s\n",
>> -                    dtb_filename, strerror(errno));
>> -            exit(1);
>> -        }
>> -
>> -        setup_data_offset = QEMU_ALIGN_UP(kernel_size, 16);
>> -        kernel_size = setup_data_offset + sizeof(struct setup_data) + dtb_size;
>> -        kernel = g_realloc(kernel, kernel_size);
>> -
>> -        stq_p(header+0x250, prot_addr + setup_data_offset);
>> -
>> -        setup_data = (struct setup_data *)(kernel + setup_data_offset);
>> -        setup_data->next = 0;
>> -        setup_data->type = cpu_to_le32(SETUP_DTB);
>> -        setup_data->len = cpu_to_le32(dtb_size);
>> -
>> -        load_image_size(dtb_filename, setup_data->data, dtb_size);
>> -    }
>> -
>> -    memcpy(setup, header, MIN(sizeof(header), setup_size));
>> -
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, prot_addr);
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, kernel_size);
>> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, kernel, kernel_size);
>> -
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_ADDR, real_addr);
>> -    fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, setup_size);
>> -    fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA, setup, setup_size);
>> -
>> -    option_rom[nb_option_roms].bootindex = 0;
>> -    option_rom[nb_option_roms].name = "linuxboot.bin";
>> -    if (pcmc->linuxboot_dma_enabled && fw_cfg_dma_enabled(fw_cfg)) {
>> -        option_rom[nb_option_roms].name = "linuxboot_dma.bin";
>> -    }
>> -    nb_option_roms++;
>> -}
>> -
>>  #define NE2000_NB_MAX 6
>>  
>>  static const int ne2000_io[NE2000_NB_MAX] = { 0x300, 0x320, 0x340, 0x360,
>> @@ -1374,24 +900,6 @@ void pc_acpi_smi_interrupt(void *opaque, int irq, int level)
>>      }
>>  }
>>  
>> -static void x86_cpu_new(PCMachineState *pcms, int64_t apic_id, Error **errp)
>> -{
>> -    Object *cpu = NULL;
>> -    Error *local_err = NULL;
>> -    CPUX86State *env = NULL;
>> -
>> -    cpu = object_new(MACHINE(pcms)->cpu_type);
>> -
>> -    env = &X86_CPU(cpu)->env;
>> -    env->nr_dies = pcms->smp_dies;
>> -
>> -    object_property_set_uint(cpu, apic_id, "apic-id", &local_err);
>> -    object_property_set_bool(cpu, true, "realized", &local_err);
>> -
>> -    object_unref(cpu);
>> -    error_propagate(errp, local_err);
>> -}
>> -
>>  /*
>>   * This function is very similar to smp_parse()
>>   * in hw/core/machine.c but includes CPU die support.
>> @@ -1497,31 +1005,6 @@ void pc_hot_add_cpu(MachineState *ms, const int64_t id, Error **errp)
>>      }
>>  }
>>  
>> -void x86_cpus_init(PCMachineState *pcms)
>> -{
>> -    int i;
>> -    const CPUArchIdList *possible_cpus;
>> -    MachineState *ms = MACHINE(pcms);
>> -    MachineClass *mc = MACHINE_GET_CLASS(pcms);
>> -    PCMachineClass *pcmc = PC_MACHINE_CLASS(mc);
>> -
>> -    x86_cpu_set_default_version(pcmc->default_cpu_version);
>> -
>> -    /* Calculates the limit to CPU APIC ID values
>> -     *
>> -     * Limit for the APIC ID value, so that all
>> -     * CPU APIC IDs are < pcms->apic_id_limit.
>> -     *
>> -     * This is used for FW_CFG_MAX_CPUS. See comments on fw_cfg_arch_create().
>> -     */
>> -    pcms->apic_id_limit = x86_cpu_apic_id_from_index(pcms,
>> -                                                     ms->smp.max_cpus - 1) + 1;
>> -    possible_cpus = mc->possible_cpu_arch_ids(ms);
>> -    for (i = 0; i < ms->smp.cpus; i++) {
>> -        x86_cpu_new(pcms, possible_cpus->cpus[i].arch_id, &error_fatal);
>> -    }
>> -}
>> -
>>  static void rtc_set_cpus_count(ISADevice *rtc, uint16_t cpus_count)
>>  {
>>      if (cpus_count > 0xff) {
>> @@ -2677,69 +2160,6 @@ static void pc_machine_wakeup(MachineState *machine)
>>      cpu_synchronize_all_post_reset();
>>  }
>>  
>> -static CpuInstanceProperties
>> -x86_cpu_index_to_props(MachineState *ms, unsigned cpu_index)
>> -{
>> -    MachineClass *mc = MACHINE_GET_CLASS(ms);
>> -    const CPUArchIdList *possible_cpus = mc->possible_cpu_arch_ids(ms);
>> -
>> -    assert(cpu_index < possible_cpus->len);
>> -    return possible_cpus->cpus[cpu_index].props;
>> -}
>> -
>> -static int64_t x86_get_default_cpu_node_id(const MachineState *ms, int idx)
>> -{
>> -   X86CPUTopoInfo topo;
>> -   PCMachineState *pcms = PC_MACHINE(ms);
>> -
>> -   assert(idx < ms->possible_cpus->len);
>> -   x86_topo_ids_from_apicid(ms->possible_cpus->cpus[idx].arch_id,
>> -                            pcms->smp_dies, ms->smp.cores,
>> -                            ms->smp.threads, &topo);
>> -   return topo.pkg_id % ms->numa_state->num_nodes;
>> -}
>> -
>> -static const CPUArchIdList *x86_possible_cpu_arch_ids(MachineState *ms)
>> -{
>> -    PCMachineState *pcms = PC_MACHINE(ms);
>> -    int i;
>> -    unsigned int max_cpus = ms->smp.max_cpus;
>> -
>> -    if (ms->possible_cpus) {
>> -        /*
>> -         * make sure that max_cpus hasn't changed since the first use, i.e.
>> -         * -smp hasn't been parsed after it
>> -        */
>> -        assert(ms->possible_cpus->len == max_cpus);
>> -        return ms->possible_cpus;
>> -    }
>> -
>> -    ms->possible_cpus = g_malloc0(sizeof(CPUArchIdList) +
>> -                                  sizeof(CPUArchId) * max_cpus);
>> -    ms->possible_cpus->len = max_cpus;
>> -    for (i = 0; i < ms->possible_cpus->len; i++) {
>> -        X86CPUTopoInfo topo;
>> -
>> -        ms->possible_cpus->cpus[i].type = ms->cpu_type;
>> -        ms->possible_cpus->cpus[i].vcpus_count = 1;
>> -        ms->possible_cpus->cpus[i].arch_id = x86_cpu_apic_id_from_index(pcms, i);
>> -        x86_topo_ids_from_apicid(ms->possible_cpus->cpus[i].arch_id,
>> -                                 pcms->smp_dies, ms->smp.cores,
>> -                                 ms->smp.threads, &topo);
>> -        ms->possible_cpus->cpus[i].props.has_socket_id = true;
>> -        ms->possible_cpus->cpus[i].props.socket_id = topo.pkg_id;
>> -        if (pcms->smp_dies > 1) {
>> -            ms->possible_cpus->cpus[i].props.has_die_id = true;
>> -            ms->possible_cpus->cpus[i].props.die_id = topo.die_id;
>> -        }
>> -        ms->possible_cpus->cpus[i].props.has_core_id = true;
>> -        ms->possible_cpus->cpus[i].props.core_id = topo.core_id;
>> -        ms->possible_cpus->cpus[i].props.has_thread_id = true;
>> -        ms->possible_cpus->cpus[i].props.thread_id = topo.smt_id;
>> -    }
>> -    return ms->possible_cpus;
>> -}
>> -
>>  static void x86_nmi(NMIState *n, int cpu_index, Error **errp)
>>  {
>>      /* cpu index isn't used */
>> diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c
>> index de09e076cd..1396451abf 100644
>> --- a/hw/i386/pc_piix.c
>> +++ b/hw/i386/pc_piix.c
>> @@ -27,6 +27,7 @@
>>  
>>  #include "qemu/units.h"
>>  #include "hw/loader.h"
>> +#include "hw/i386/x86.h"
>>  #include "hw/i386/pc.h"
>>  #include "hw/i386/apic.h"
>>  #include "hw/display/ramfb.h"
>> diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
>> index 894989b64e..8920bd8978 100644
>> --- a/hw/i386/pc_q35.c
>> +++ b/hw/i386/pc_q35.c
>> @@ -41,6 +41,7 @@
>>  #include "hw/pci-host/q35.h"
>>  #include "hw/qdev-properties.h"
>>  #include "exec/address-spaces.h"
>> +#include "hw/i386/x86.h"
>>  #include "hw/i386/pc.h"
>>  #include "hw/i386/ich9.h"
>>  #include "hw/i386/amd_iommu.h"
>> diff --git a/hw/i386/pc_sysfw.c b/hw/i386/pc_sysfw.c
>> index 28cb1f63c9..69b79851be 100644
>> --- a/hw/i386/pc_sysfw.c
>> +++ b/hw/i386/pc_sysfw.c
>> @@ -31,6 +31,7 @@
>>  #include "qemu/option.h"
>>  #include "qemu/units.h"
>>  #include "hw/sysbus.h"
>> +#include "hw/i386/x86.h"
>>  #include "hw/i386/pc.h"
>>  #include "hw/loader.h"
>>  #include "hw/qdev-properties.h"
>> @@ -211,59 +212,6 @@ static void pc_system_flash_map(PCMachineState *pcms,
>>      }
>>  }
>>  
>> -static void x86_bios_rom_init(MemoryRegion *rom_memory, bool isapc_ram_fw)
>> -{
>> -    char *filename;
>> -    MemoryRegion *bios, *isa_bios;
>> -    int bios_size, isa_bios_size;
>> -    int ret;
>> -
>> -    /* BIOS load */
>> -    if (bios_name == NULL) {
>> -        bios_name = BIOS_FILENAME;
>> -    }
>> -    filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
>> -    if (filename) {
>> -        bios_size = get_image_size(filename);
>> -    } else {
>> -        bios_size = -1;
>> -    }
>> -    if (bios_size <= 0 ||
>> -        (bios_size % 65536) != 0) {
>> -        goto bios_error;
>> -    }
>> -    bios = g_malloc(sizeof(*bios));
>> -    memory_region_init_ram(bios, NULL, "pc.bios", bios_size, &error_fatal);
>> -    if (!isapc_ram_fw) {
>> -        memory_region_set_readonly(bios, true);
>> -    }
>> -    ret = rom_add_file_fixed(bios_name, (uint32_t)(-bios_size), -1);
>> -    if (ret != 0) {
>> -    bios_error:
>> -        fprintf(stderr, "qemu: could not load PC BIOS '%s'\n", bios_name);
>> -        exit(1);
>> -    }
>> -    g_free(filename);
>> -
>> -    /* map the last 128KB of the BIOS in ISA space */
>> -    isa_bios_size = MIN(bios_size, 128 * KiB);
>> -    isa_bios = g_malloc(sizeof(*isa_bios));
>> -    memory_region_init_alias(isa_bios, NULL, "isa-bios", bios,
>> -                             bios_size - isa_bios_size, isa_bios_size);
>> -    memory_region_add_subregion_overlap(rom_memory,
>> -                                        0x100000 - isa_bios_size,
>> -                                        isa_bios,
>> -                                        1);
>> -    if (!isapc_ram_fw) {
>> -        memory_region_set_readonly(isa_bios, true);
>> -    }
>> -
>> -    /* map all the bios at the top of memory */
>> -    memory_region_add_subregion(rom_memory,
>> -                                (uint32_t)(-bios_size),
>> -                                bios);
>> -}
>> -
>>  void pc_system_firmware_init(PCMachineState *pcms,
>>                               MemoryRegion *rom_memory)
>>  {
>> diff --git a/hw/i386/x86.c b/hw/i386/x86.c
>> new file mode 100644
>> index 0000000000..6807bb8a22
>> --- /dev/null
>> +++ b/hw/i386/x86.c
>> @@ -0,0 +1,684 @@
>> +/*
>> + * Copyright (c) 2003-2004 Fabrice Bellard
>> + * Copyright (c) 2019 Red Hat, Inc.
>> + *
>> + * Permission is hereby granted, free of charge, to any person obtaining a copy
>> + * of this software and associated documentation files (the "Software"), to deal
>> + * in the Software without restriction, including without limitation the rights
>> + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
>> + * copies of the Software, and to permit persons to whom the Software is
>> + * furnished to do so, subject to the following conditions:
>> + *
>> + * The above copyright notice and this permission notice shall be included in
>> + * all copies or substantial portions of the Software.
>> + *
>> + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
>> + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
>> + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
>> + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
>> + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
>> + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
>> + * THE SOFTWARE.
>> + */
>> +#include "qemu/osdep.h"
>> +#include "qemu/error-report.h"
>> +#include "qemu/option.h"
>> +#include "qemu/cutils.h"
>> +#include "qemu/units.h"
>> +#include "qemu-common.h"
>> +#include "qapi/error.h"
>> +#include "qapi/qmp/qerror.h"
>> +#include "qapi/qapi-visit-common.h"
>> +#include "qapi/visitor.h"
>> +#include "sysemu/qtest.h"
>> +#include "sysemu/numa.h"
>> +#include "sysemu/replay.h"
>> +#include "sysemu/sysemu.h"
>> +
>> +#include "hw/i386/x86.h"
>> +#include "hw/i386/pc.h"
>
> Just a note, could we remove the pc.h inclusion here?

You're right, I'll remove it in v7.

Thanks,
Sergio.
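
As a side note on the load_linux() and x86_bios_rom_init() code quoted
above, the address arithmetic in those hunks can be sketched in
isolation. This is only an illustrative sketch, not code from the patch:
the helper names (setup_bytes, initrd_load_addr, bios_base,
isa_bios_base) are made up for this example, and only the formulas are
taken from the quoted lines.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Size in bytes of the real-mode setup code, derived from the byte at
 * header offset 0x1f1. A value of 0 is treated as 4 sectors, and one
 * extra 512-byte sector is added for the boot sector.
 * (Hypothetical helper name; the formula follows the quoted code.)
 */
static uint32_t setup_bytes(uint8_t setup_sects)
{
    uint32_t sects = setup_sects ? setup_sects : 4;
    return (sects + 1) * 512;
}

/*
 * Highest 4 KiB-aligned address at which an initrd of initrd_size
 * bytes fits below initrd_max. The quoted code rejects
 * initrd_size >= initrd_max before doing this computation.
 */
static uint32_t initrd_load_addr(uint32_t initrd_max, uint32_t initrd_size)
{
    assert(initrd_size < initrd_max);
    return (initrd_max - initrd_size) & ~(uint32_t)4095;
}

/*
 * Address at which a BIOS image is mapped so that it ends exactly at
 * 4 GiB; equivalent to (uint32_t)(-bios_size) in the quoted code.
 */
static uint32_t bios_base(uint32_t bios_size)
{
    return 0u - bios_size;
}

/*
 * Address of the ISA-space alias: the last MIN(bios_size, 128 KiB)
 * bytes of the image are mapped so that they end at 1 MiB.
 */
static uint32_t isa_bios_base(uint32_t bios_size)
{
    uint32_t isa_bios_size = bios_size < 128 * 1024 ? bios_size : 128 * 1024;
    return 0x100000 - isa_bios_size;
}
```

For the 65536-byte bios-microvm.bin listed in the diffstat, bios_base()
yields 0xffff0000 and isa_bios_base() yields 0xf0000.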



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-07 13:44   ` Sergio Lopez
@ 2019-10-07 13:59     ` Philippe Mathieu-Daudé
  2019-10-09 19:21     ` Michael S. Tsirkin
  1 sibling, 0 replies; 31+ messages in thread
From: Philippe Mathieu-Daudé @ 2019-10-07 13:59 UTC (permalink / raw)
  To: Sergio Lopez, Michael S. Tsirkin
  Cc: ehabkost, qemu-devel, kraxel, pbonzini, imammedo, sgarzare, lersek, rth

On 10/7/19 3:44 PM, Sergio Lopez wrote:
> Michael S. Tsirkin <mst@redhat.com> writes:
> 
>> On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
>>> Microvm is a machine type inspired by Firecracker and constructed
>>> after its machine model.
>>>
>>> It's a minimalist machine type without PCI nor ACPI support, designed
>>> for short-lived guests. Microvm also establishes a baseline for
>>> benchmarking and optimizing both QEMU and guest operating systems,
>>> since it is optimized for both boot time and footprint.
>>
>> Pls take a look at patchew warnings and errors.
>> Both coding style issues and test failures need to be
>> addressed somehow I think.
> 
> I've fixed the issue with the test suite, but I'm not sure what to do
> about the coding style errors. Every one of them (except perhaps one at
> xen-hvm.c) comes from code I've moved from pc.c to x86.c. I'd say fixing
> those is outside the scope of the corresponding patches, but please
> correct me if I'm wrong.

What makes reviews easier for me is splitting this into 2 patches: one
fixing the lines that are going to be moved, then one that simply moves them.

You can add checkpatch as a pre-commit hook:
http://blog.vmsplice.net/2011/03/how-to-automatically-run-checkpatchpl.html

In a temporary branch, git mv/commit will call checkpatch and display 
warnings; discard your commit and fix the warnings in the original code.

> On the other hand, I haven't touched MAINTAINERS, because I'm not sure
> about the actual policies that apply while doing so. Should I add the
> new files to it?

The new "hw/virtio/virtio-mmio.h" is covered by the "virtio" section:

F: hw/*/virtio*
F: include/hw/virtio/

Now I think the MicroVM files deserve an entry, probably with Paolo/you 
listed:

- docs/microvm.rst
- hw/i386/microvm.c
- include/hw/i386/microvm.h
- pc-bios/bios-microvm.bin

>>> ---
>>>
>>> Changelog
>>> v6:
>>>   - Some style fixes (Philippe Mathieu-Daudé)
>>>   - Fix a documentation bug stating that LAPIC was in userspace (Paolo
>>>     Bonzini)
>>>   - Update Xen HVM code after X86MachineState introduction (Philippe
>>>     Mathieu-Daudé)
>>>   - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
>>>     (Philippe Mathieu-Daudé)
>>>
>>> v5:
>>>   - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
>>>     functions" (Philippe Mathieu-Daudé)
>>>   - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
>>>     functions" (Stefano Garzarella)
>>>   - Split X86MachineState introduction into smaller patches (Philippe
>>>     Mathieu-Daudé)
>>>   - Change option-roms to x-option-roms and kernel-cmdline to
>>>     auto-kernel-cmdline (Paolo Bonzini)
>>>   - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
>>>   - Some fixes to the documentation (Paolo Bonzini)
>>>   - Switch documentation format from txt to rst (Peter Maydell)
>>>   - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
>>>     Bonzini)
>>>
>>> v4:
>>>   - This is a complete rewrite of the whole patchset, with a focus on
>>>     reusing as much existing code as possible to ease the maintenance burden
>>>     and making the machine type as compatible as possible by default. As
>>>     a result, the number of lines dedicated specifically to microvm is
>>>     383 (code lines measured by "cloc") and, with the default
>>>     configuration, it's now able to boot both PVH ELF images and
>>>     bzImages with either SeaBIOS or qboot.
>>>
>>> v3:
>>>    - Add initrd support (thanks Stefano).
>>>
>>> v2:
>>>    - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
>>>    - Simplify machine definition (thanks Eduardo).
>>>    - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
>>>    - Add a patch to factorize PVH-related functions.
>>>    - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).
>>>
>>> ---
>>> Sergio Lopez (10):
>>>    hw/virtio: Factorize virtio-mmio headers
>>>    hw/i386/pc: rename functions shared with non-PC machines
>>>    hw/i386/pc: move shared x86 functions to x86.c and export them
>>>    hw/i386: split PCMachineState deriving X86MachineState from it
>>>    hw/i386: make x86.c independent from PCMachineState
>>>    fw_cfg: add "modify" functions for all types
>>>    hw/intc/apic: reject pic ints if isa_pic == NULL
>>>    roms: add microvm-bios (qboot) as binary and git submodule
>>>    docs/microvm.rst: document the new microvm machine type
>>>    hw/i386: Introduce the microvm machine type
>>>
>>>   docs/microvm.rst                 |  98 ++++
>>>   default-configs/i386-softmmu.mak |   1 +
>>>   include/hw/i386/microvm.h        |  83 ++++
>>>   include/hw/i386/pc.h             |  28 +-
>>>   include/hw/i386/x86.h            |  94 ++++
>>>   include/hw/nvram/fw_cfg.h        |  42 ++
>>>   include/hw/virtio/virtio-mmio.h  |  73 +++
>>>   hw/acpi/cpu_hotplug.c            |  10 +-
>>>   hw/i386/acpi-build.c             |  29 +-
>>>   hw/i386/amd_iommu.c              |   3 +-
>>>   hw/i386/intel_iommu.c            |   3 +-
>>>   hw/i386/microvm.c                | 574 ++++++++++++++++++++++
>>>   hw/i386/pc.c                     | 780 +++---------------------------
>>>   hw/i386/pc_piix.c                |  46 +-
>>>   hw/i386/pc_q35.c                 |  38 +-
>>>   hw/i386/pc_sysfw.c               |  58 +--
>>>   hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
>>>   hw/i386/xen/xen-hvm.c            |  23 +-
>>>   hw/intc/apic.c                   |   2 +-
>>>   hw/intc/ioapic.c                 |   2 +-
>>>   hw/nvram/fw_cfg.c                |  29 ++
>>>   hw/virtio/virtio-mmio.c          |  48 +-
>>>   .gitmodules                      |   3 +
>>>   hw/i386/Kconfig                  |   4 +
>>>   hw/i386/Makefile.objs            |   2 +
>>>   pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
>>>   roms/Makefile                    |   6 +
>>>   roms/qboot                       |   1 +
>>>   28 files changed, 1963 insertions(+), 907 deletions(-)
>>>   create mode 100644 docs/microvm.rst
>>>   create mode 100644 include/hw/i386/microvm.h
>>>   create mode 100644 include/hw/i386/x86.h
>>>   create mode 100644 include/hw/virtio/virtio-mmio.h
>>>   create mode 100644 hw/i386/microvm.c
>>>   create mode 100644 hw/i386/x86.c
>>>   create mode 100755 pc-bios/bios-microvm.bin
>>>   create mode 160000 roms/qboot
>>>
>>> -- 
>>> 2.21.0
> 



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-04 17:21 ` no-reply
@ 2019-10-08 12:37   ` Paolo Bonzini
  2019-10-08 13:14     ` Sergio Lopez
  0 siblings, 1 reply; 31+ messages in thread
From: Paolo Bonzini @ 2019-10-08 12:37 UTC (permalink / raw)
  To: qemu-devel, slp
  Cc: ehabkost, mst, philmd, kraxel, imammedo, rth, lersek, sgarzare

On 04/10/19 19:21, no-reply@patchew.org wrote:
> qemu-system-x86_64: missing kernel image file name, required by microvm
> Broken pipe
> /tmp/qemu-test/src/tests/libqtest.c:140: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
> ERROR - too few tests run (expected 7, got 4)
> make: *** [check-qtest-x86_64] Error 1
> make: *** Waiting for unfinished jobs....
>   TEST    iotest-qcow2: 159
>   TEST    iotest-qcow2: 161
> ---
>     raise CalledProcessError(retcode, cmd)

I think this needs some kind of blacklisting?

Paolo



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-08 12:37   ` Paolo Bonzini
@ 2019-10-08 13:14     ` Sergio Lopez
  0 siblings, 0 replies; 31+ messages in thread
From: Sergio Lopez @ 2019-10-08 13:14 UTC (permalink / raw)
  To: Paolo Bonzini
  Cc: ehabkost, mst, philmd, qemu-devel, kraxel, imammedo, rth, lersek,
	sgarzare



Paolo Bonzini <pbonzini@redhat.com> writes:

> On 04/10/19 19:21, no-reply@patchew.org wrote:
>> qemu-system-x86_64: missing kernel image file name, required by microvm
>> Broken pipe
>> /tmp/qemu-test/src/tests/libqtest.c:140: kill_qemu() tried to terminate QEMU process but encountered exit status 1 (expected 0)
>> ERROR - too few tests run (expected 7, got 4)
>> make: *** [check-qtest-x86_64] Error 1
>> make: *** Waiting for unfinished jobs....
>>   TEST    iotest-qcow2: 159
>>   TEST    iotest-qcow2: 161
>> ---
>>     raise CalledProcessError(retcode, cmd)
>
> I think this needs some kind of blacklisting?

I solved this issue by allowing a microvm machine to be started without
a kernel filename (now that we always rely on firmware, we can do that ;-).

Cheers,
Sergio.



* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-07 13:44   ` Sergio Lopez
  2019-10-07 13:59     ` Philippe Mathieu-Daudé
@ 2019-10-09 19:21     ` Michael S. Tsirkin
  2019-10-09 20:52       ` Eduardo Habkost
  1 sibling, 1 reply; 31+ messages in thread
From: Michael S. Tsirkin @ 2019-10-09 19:21 UTC (permalink / raw)
  To: Sergio Lopez
  Cc: ehabkost, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

On Mon, Oct 07, 2019 at 03:44:40PM +0200, Sergio Lopez wrote:
> 
> Michael S. Tsirkin <mst@redhat.com> writes:
> 
> > On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
> >> Microvm is a machine type inspired by Firecracker and constructed
> >> after its machine model.
> >> 
> >> It's a minimalist machine type without PCI nor ACPI support, designed
> >> for short-lived guests. Microvm also establishes a baseline for
> >> benchmarking and optimizing both QEMU and guest operating systems,
> >> since it is optimized for both boot time and footprint.
> >
> > Pls take a look at patchew warnings and errors.
> > Both coding style issues and test failures need to be
> > addressed somehow I think.
> 
> I've fixed the issue with the test suite, but I'm not sure what to do
> about the coding style errors. Every one of them (except perhaps one at
> xen-hvm.c) comes from code I've moved from pc.c to x86.c. I'd say fixing
> those is outside the scope of the corresponding patches, but please
> correct me if I'm wrong.

Yeah, if you refactor code you have to kick it into shape
at the same time. It can be a separate patch to ease review.

> On the other hand, I haven't touched MAINTAINERS, because I'm not sure
> about the actual policies that apply while doing so. Should I add the
> new files to it?
> 
> Thanks,
> Sergio.
> 
> >> ---
> >> 
> >> Changelog
> >> v6:
> >>  - Some style fixes (Philippe Mathieu-Daudé)
> >>  - Fix a documentation bug stating that LAPIC was in userspace (Paolo
> >>    Bonzini)
> >>  - Update Xen HVM code after X86MachineState introduction (Philippe
> >>    Mathieu-Daudé)
> >>  - Rename header guard from QEMU_VIRTIO_MMIO_H to HW_VIRTIO_MMIO_H
> >>    (Philippe Mathieu-Daudé)
> >> 
> >> v5:
> >>  - Drop unneeded "[PATCH v4 2/8] hw/i386: Factorize e820 related
> >>    functions" (Philippe Mathieu-Daudé)
> >>  - Drop unneeded "[PATCH v4 1/8] hw/i386: Factorize PVH related
> >>    functions" (Stefano Garzarella)
> >>  - Split X86MachineState introduction into smaller patches (Philippe
> >>    Mathieu-Daudé)
> >>  - Change option-roms to x-option-roms and kernel-cmdline to
> >>    auto-kernel-cmdline (Paolo Bonzini)
> >>  - Make i8259 PIC and i8254 PIT optional (Paolo Bonzini)
> >>  - Some fixes to the documentation (Paolo Bonzini)
> >>  - Switch documentation format from txt to rst (Peter Maydell)
> >>  - Move NMI interface to X86_MACHINE (Philippe Mathieu-Daudé, Paolo
> >>    Bonzini)
> >> 
> >> v4:
> >>  - This is a complete rewrite of the whole patchset, with a focus on
> >>    reusing as much existing code as possible to ease the maintenance burden
> >>    and making the machine type as compatible as possible by default. As
> >>    a result, the number of lines dedicated specifically to microvm is
> >>    383 (code lines measured by "cloc") and, with the default
> >>    configuration, it's now able to boot both PVH ELF images and
> >>    bzImages with either SeaBIOS or qboot.
> >> 
> >> v3:
> >>   - Add initrd support (thanks Stefano).
> >> 
> >> v2:
> >>   - Drop "[PATCH 1/4] hw/i386: Factorize CPU routine".
> >>   - Simplify machine definition (thanks Eduardo).
> >>   - Remove use of unneeded NUMA-related callbacks (thanks Eduardo).
> >>   - Add a patch to factorize PVH-related functions.
> >>   - Replace use of Linux's Zero Page with PVH (thanks Maran and Paolo).
> >> 
> >> ---
> >> Sergio Lopez (10):
> >>   hw/virtio: Factorize virtio-mmio headers
> >>   hw/i386/pc: rename functions shared with non-PC machines
> >>   hw/i386/pc: move shared x86 functions to x86.c and export them
> >>   hw/i386: split PCMachineState deriving X86MachineState from it
> >>   hw/i386: make x86.c independent from PCMachineState
> >>   fw_cfg: add "modify" functions for all types
> >>   hw/intc/apic: reject pic ints if isa_pic == NULL
> >>   roms: add microvm-bios (qboot) as binary and git submodule
> >>   docs/microvm.rst: document the new microvm machine type
> >>   hw/i386: Introduce the microvm machine type
> >> 
> >>  docs/microvm.rst                 |  98 ++++
> >>  default-configs/i386-softmmu.mak |   1 +
> >>  include/hw/i386/microvm.h        |  83 ++++
> >>  include/hw/i386/pc.h             |  28 +-
> >>  include/hw/i386/x86.h            |  94 ++++
> >>  include/hw/nvram/fw_cfg.h        |  42 ++
> >>  include/hw/virtio/virtio-mmio.h  |  73 +++
> >>  hw/acpi/cpu_hotplug.c            |  10 +-
> >>  hw/i386/acpi-build.c             |  29 +-
> >>  hw/i386/amd_iommu.c              |   3 +-
> >>  hw/i386/intel_iommu.c            |   3 +-
> >>  hw/i386/microvm.c                | 574 ++++++++++++++++++++++
> >>  hw/i386/pc.c                     | 780 +++---------------------------
> >>  hw/i386/pc_piix.c                |  46 +-
> >>  hw/i386/pc_q35.c                 |  38 +-
> >>  hw/i386/pc_sysfw.c               |  58 +--
> >>  hw/i386/x86.c                    | 790 +++++++++++++++++++++++++++++++
> >>  hw/i386/xen/xen-hvm.c            |  23 +-
> >>  hw/intc/apic.c                   |   2 +-
> >>  hw/intc/ioapic.c                 |   2 +-
> >>  hw/nvram/fw_cfg.c                |  29 ++
> >>  hw/virtio/virtio-mmio.c          |  48 +-
> >>  .gitmodules                      |   3 +
> >>  hw/i386/Kconfig                  |   4 +
> >>  hw/i386/Makefile.objs            |   2 +
> >>  pc-bios/bios-microvm.bin         | Bin 0 -> 65536 bytes
> >>  roms/Makefile                    |   6 +
> >>  roms/qboot                       |   1 +
> >>  28 files changed, 1963 insertions(+), 907 deletions(-)
> >>  create mode 100644 docs/microvm.rst
> >>  create mode 100644 include/hw/i386/microvm.h
> >>  create mode 100644 include/hw/i386/x86.h
> >>  create mode 100644 include/hw/virtio/virtio-mmio.h
> >>  create mode 100644 hw/i386/microvm.c
> >>  create mode 100644 hw/i386/x86.c
> >>  create mode 100755 pc-bios/bios-microvm.bin
> >>  create mode 160000 roms/qboot
> >> 
> >> -- 
> >> 2.21.0
> 





* Re: [PATCH v6 00/10] Introduce the microvm machine type
  2019-10-09 19:21     ` Michael S. Tsirkin
@ 2019-10-09 20:52       ` Eduardo Habkost
  0 siblings, 0 replies; 31+ messages in thread
From: Eduardo Habkost @ 2019-10-09 20:52 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Sergio Lopez, lersek, qemu-devel, kraxel, pbonzini, imammedo,
	sgarzare, philmd, rth

On Wed, Oct 09, 2019 at 03:21:46PM -0400, Michael S. Tsirkin wrote:
> On Mon, Oct 07, 2019 at 03:44:40PM +0200, Sergio Lopez wrote:
> > 
> > Michael S. Tsirkin <mst@redhat.com> writes:
> > 
> > > On Fri, Oct 04, 2019 at 11:37:42AM +0200, Sergio Lopez wrote:
> > >> Microvm is a machine type inspired by Firecracker and constructed
> > >> after its machine model.
> > >> 
> > >> It's a minimalist machine type without PCI nor ACPI support, designed
> > >> for short-lived guests. Microvm also establishes a baseline for
> > >> benchmarking and optimizing both QEMU and guest operating systems,
> > >> since it is optimized for both boot time and footprint.
> > >
> > > Pls take a look at patchew warnings and errors.
> > > Both coding style issues and test failures need to be
> > > addressed somehow I think.
> > 
> > I've fixed the issue with the test suite, but I'm not sure what to do
> > about the coding style errors. Every one of them (except perhaps one at
> > xen-hvm.c) comes from code I've moved from pc.c to x86.c. I'd say fixing
> > those is outside the scope of the corresponding patches, but please
> > correct me if I'm wrong.
> 
> Yeah, if you refactor code you have to kick it into shape
> at the same time. It can be a separate patch to ease review.

I don't think it is reasonable to require code to be 100%
CODING_STYLE-compliant before being moved to a new file.  We can
still encourage cleaning it up, of course, but I don't see the
benefit of making it a requirement.

-- 
Eduardo



end of thread, other threads:[~2019-10-09 21:21 UTC | newest]

Thread overview: 31+ messages
2019-10-04  9:37 [PATCH v6 00/10] Introduce the microvm machine type Sergio Lopez
2019-10-04  9:37 ` [PATCH v6 01/10] hw/virtio: Factorize virtio-mmio headers Sergio Lopez
2019-10-04  9:43   ` Philippe Mathieu-Daudé
2019-10-04  9:37 ` [PATCH v6 02/10] hw/i386/pc: rename functions shared with non-PC machines Sergio Lopez
2019-10-04  9:46   ` Philippe Mathieu-Daudé
2019-10-04 11:24   ` Stefano Garzarella
2019-10-04  9:37 ` [PATCH v6 03/10] hw/i386/pc: move shared x86 functions to x86.c and export them Sergio Lopez
2019-10-04  9:46   ` Philippe Mathieu-Daudé
2019-10-04 11:23   ` Stefano Garzarella
2019-10-04 11:36   ` Stefano Garzarella
2019-10-07 13:46     ` Sergio Lopez
2019-10-04  9:37 ` [PATCH v6 04/10] hw/i386: split PCMachineState deriving X86MachineState from it Sergio Lopez
2019-10-04  9:49   ` Philippe Mathieu-Daudé
2019-10-04  9:37 ` [PATCH v6 05/10] hw/i386: make x86.c independent from PCMachineState Sergio Lopez
2019-10-04  9:51   ` Philippe Mathieu-Daudé
2019-10-04  9:37 ` [PATCH v6 06/10] fw_cfg: add "modify" functions for all types Sergio Lopez
2019-10-04  9:37 ` [PATCH v6 07/10] hw/intc/apic: reject pic ints if isa_pic == NULL Sergio Lopez
2019-10-04  9:37 ` [PATCH v6 08/10] roms: add microvm-bios (qboot) as binary and git submodule Sergio Lopez
2019-10-04 11:50   ` Stefano Garzarella
2019-10-04  9:37 ` [PATCH v6 09/10] docs/microvm.rst: document the new microvm machine type Sergio Lopez
2019-10-04  9:37 ` [PATCH v6 10/10] hw/i386: Introduce the " Sergio Lopez
2019-10-04 13:57 ` [PATCH v6 00/10] " Michael S. Tsirkin
2019-10-04 17:21 ` no-reply
2019-10-08 12:37   ` Paolo Bonzini
2019-10-08 13:14     ` Sergio Lopez
2019-10-04 17:22 ` no-reply
2019-10-05 22:08 ` Michael S. Tsirkin
2019-10-07 13:44   ` Sergio Lopez
2019-10-07 13:59     ` Philippe Mathieu-Daudé
2019-10-09 19:21     ` Michael S. Tsirkin
2019-10-09 20:52       ` Eduardo Habkost
