All of lore.kernel.org
 help / color / mirror / Atom feed
* [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
@ 2018-12-21 20:03 ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, ehabkost, rth, xen-devel, sgarzare, mst, stefanha,
	maran.wilson, george.kennedy, boris.ostrovsky, liam.merwick

For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, QEMU should be able to boot directly into the
uncompressed Linux kernel binary with minimal firmware involvement.

There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:

   https://xenbits.xen.org/docs/unstable/misc/pvh.html

Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
qboot pull request: https://github.com/bonzini/qboot/pull/17 

This patch series provides QEMU support to read the ELF header of an
uncompressed kernel binary and get the 32-bit PVH kernel entry point
from an ELF Note.  In load_linux() a call is made to load_elfboot()
so see if the header matches that of an uncompressed kernel binary (ELF)
and if so, loads the binary and determines the kernel entry address
from an ELF Note in the binary.  Then qboot does futher initialisation
of the guest (e820, etc.) and jumps to the kernel entry address and
boots the guest.

changes v1 -> v2
- Based on feedback from Stefan Hajnoczi
- The reading of the PVH entry point is now done in a single pass during
  elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
  and considerably reworked.
- Patch1 adds a new optional function pointer to parse the ELF note type
  (the type is passed in via the existing translate_opaque arg - the
  function already had 11 args so I didn't want to add more than one new arg).
- Patch2 adds a function to elf_ops.h to find an ELF note
  matching a specific type 
- Patch3 just has a line added to the commit message to state that the Xen
  repo is the canonical location
- Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
  just minor load_elfboot() changes and the addition of a
  read_pvh_start_addr() helper function for load_elf()


Usіng the method/scripts documented by the NEMU team at

   https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
   https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html

below are some timings measured (vmlinux and bzImage from the same build)
Time to get to kernel start is almost halved (95ṁs -> 48ms)

QEMU + qboot + vmlinux (PVH + 4.20-rc4)
 qemu_init_end: 41.550521
 fw_start: 41.667139 (+0.116618)
 fw_do_boot: 47.448495 (+5.781356)
 linux_startup_64: 47.720785 (+0.27229)
 linux_start_kernel: 48.399541 (+0.678756)
 linux_start_user: 296.952056 (+248.552515)

QEMU + qboot + bzImage:
 qemu_init_end: 29.209276
 fw_start: 29.317342 (+0.108066)
 linux_start_boot: 36.679362 (+7.36202)
 linux_startup_64: 94.531349 (+57.851987)
 linux_start_kernel: 94.900913 (+0.369564)
 linux_start_user: 401.060971 (+306.160058)

QEMU + bzImage:
 qemu_init_end: 30.424430
 linux_startup_64: 893.770334 (+863.345904)
 linux_start_kernel: 894.17049 (+0.400156)
 linux_start_user: 1208.679768 (+314.509278)


Liam Merwick (4):
  elf: Add optional function ptr to load_elf() to parse ELF notes
  elf-ops.h: Add get_elf_note_type()
  pvh: Add x86/HVM direct boot ABI header file
  pvh: Boot uncompressed kernel using direct boot ABI

 hw/alpha/dp264.c               |   4 +-
 hw/arm/armv7m.c                |   3 +-
 hw/arm/boot.c                  |   2 +-
 hw/core/generic-loader.c       |   2 +-
 hw/core/loader.c               |  24 ++++---
 hw/cris/boot.c                 |   3 +-
 hw/hppa/machine.c              |   6 +-
 hw/i386/multiboot.c            |   2 +-
 hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
 hw/lm32/lm32_boards.c          |   6 +-
 hw/lm32/milkymist.c            |   3 +-
 hw/m68k/an5206.c               |   2 +-
 hw/m68k/mcf5208.c              |   2 +-
 hw/microblaze/boot.c           |   7 +-
 hw/mips/mips_fulong2e.c        |   5 +-
 hw/mips/mips_malta.c           |   5 +-
 hw/mips/mips_mipssim.c         |   5 +-
 hw/mips/mips_r4k.c             |   5 +-
 hw/moxie/moxiesim.c            |   2 +-
 hw/nios2/boot.c                |   7 +-
 hw/openrisc/openrisc_sim.c     |   2 +-
 hw/pci-host/prep.c             |   2 +-
 hw/ppc/e500.c                  |   3 +-
 hw/ppc/mac_newworld.c          |   5 +-
 hw/ppc/mac_oldworld.c          |   5 +-
 hw/ppc/ppc440_bamboo.c         |   2 +-
 hw/ppc/sam460ex.c              |   3 +-
 hw/ppc/spapr.c                 |   7 +-
 hw/ppc/virtex_ml507.c          |   2 +-
 hw/riscv/sifive_e.c            |   2 +-
 hw/riscv/sifive_u.c            |   2 +-
 hw/riscv/spike.c               |   2 +-
 hw/riscv/virt.c                |   2 +-
 hw/s390x/ipl.c                 |   9 ++-
 hw/sparc/leon3.c               |   3 +-
 hw/sparc/sun4m.c               |   6 +-
 hw/sparc64/sun4u.c             |   4 +-
 hw/tricore/tricore_testboard.c |   2 +-
 hw/xtensa/sim.c                |  12 ++--
 hw/xtensa/xtfpga.c             |   2 +-
 include/elf.h                  |  10 +++
 include/hw/elf_ops.h           |  72 ++++++++++++++++++++
 include/hw/loader.h            |   9 ++-
 include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
 44 files changed, 469 insertions(+), 71 deletions(-)
 create mode 100644 include/hw/xen/start_info.h

-- 
1.8.3.1

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [RFC v2 0/4] QEMU changes to do PVH boot
@ 2018-12-21 20:03 ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: liam.merwick, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

For certain applications it is desirable to rapidly boot a KVM virtual
machine. In cases where legacy hardware and software support within the
guest is not needed, QEMU should be able to boot directly into the
uncompressed Linux kernel binary with minimal firmware involvement.

There already exists an ABI to allow this for Xen PVH guests and the ABI
is supported by Linux and FreeBSD:

   https://xenbits.xen.org/docs/unstable/misc/pvh.html

Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
qboot pull request: https://github.com/bonzini/qboot/pull/17 

This patch series provides QEMU support to read the ELF header of an
uncompressed kernel binary and get the 32-bit PVH kernel entry point
from an ELF Note.  In load_linux() a call is made to load_elfboot()
so see if the header matches that of an uncompressed kernel binary (ELF)
and if so, loads the binary and determines the kernel entry address
from an ELF Note in the binary.  Then qboot does futher initialisation
of the guest (e820, etc.) and jumps to the kernel entry address and
boots the guest.

changes v1 -> v2
- Based on feedback from Stefan Hajnoczi
- The reading of the PVH entry point is now done in a single pass during
  elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
  and considerably reworked.
- Patch1 adds a new optional function pointer to parse the ELF note type
  (the type is passed in via the existing translate_opaque arg - the
  function already had 11 args so I didn't want to add more than one new arg).
- Patch2 adds a function to elf_ops.h to find an ELF note
  matching a specific type 
- Patch3 just has a line added to the commit message to state that the Xen
  repo is the canonical location
- Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
  just minor load_elfboot() changes and the addition of a
  read_pvh_start_addr() helper function for load_elf()


Usіng the method/scripts documented by the NEMU team at

   https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
   https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html

below are some timings measured (vmlinux and bzImage from the same build)
Time to get to kernel start is almost halved (95ṁs -> 48ms)

QEMU + qboot + vmlinux (PVH + 4.20-rc4)
 qemu_init_end: 41.550521
 fw_start: 41.667139 (+0.116618)
 fw_do_boot: 47.448495 (+5.781356)
 linux_startup_64: 47.720785 (+0.27229)
 linux_start_kernel: 48.399541 (+0.678756)
 linux_start_user: 296.952056 (+248.552515)

QEMU + qboot + bzImage:
 qemu_init_end: 29.209276
 fw_start: 29.317342 (+0.108066)
 linux_start_boot: 36.679362 (+7.36202)
 linux_startup_64: 94.531349 (+57.851987)
 linux_start_kernel: 94.900913 (+0.369564)
 linux_start_user: 401.060971 (+306.160058)

QEMU + bzImage:
 qemu_init_end: 30.424430
 linux_startup_64: 893.770334 (+863.345904)
 linux_start_kernel: 894.17049 (+0.400156)
 linux_start_user: 1208.679768 (+314.509278)


Liam Merwick (4):
  elf: Add optional function ptr to load_elf() to parse ELF notes
  elf-ops.h: Add get_elf_note_type()
  pvh: Add x86/HVM direct boot ABI header file
  pvh: Boot uncompressed kernel using direct boot ABI

 hw/alpha/dp264.c               |   4 +-
 hw/arm/armv7m.c                |   3 +-
 hw/arm/boot.c                  |   2 +-
 hw/core/generic-loader.c       |   2 +-
 hw/core/loader.c               |  24 ++++---
 hw/cris/boot.c                 |   3 +-
 hw/hppa/machine.c              |   6 +-
 hw/i386/multiboot.c            |   2 +-
 hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
 hw/lm32/lm32_boards.c          |   6 +-
 hw/lm32/milkymist.c            |   3 +-
 hw/m68k/an5206.c               |   2 +-
 hw/m68k/mcf5208.c              |   2 +-
 hw/microblaze/boot.c           |   7 +-
 hw/mips/mips_fulong2e.c        |   5 +-
 hw/mips/mips_malta.c           |   5 +-
 hw/mips/mips_mipssim.c         |   5 +-
 hw/mips/mips_r4k.c             |   5 +-
 hw/moxie/moxiesim.c            |   2 +-
 hw/nios2/boot.c                |   7 +-
 hw/openrisc/openrisc_sim.c     |   2 +-
 hw/pci-host/prep.c             |   2 +-
 hw/ppc/e500.c                  |   3 +-
 hw/ppc/mac_newworld.c          |   5 +-
 hw/ppc/mac_oldworld.c          |   5 +-
 hw/ppc/ppc440_bamboo.c         |   2 +-
 hw/ppc/sam460ex.c              |   3 +-
 hw/ppc/spapr.c                 |   7 +-
 hw/ppc/virtex_ml507.c          |   2 +-
 hw/riscv/sifive_e.c            |   2 +-
 hw/riscv/sifive_u.c            |   2 +-
 hw/riscv/spike.c               |   2 +-
 hw/riscv/virt.c                |   2 +-
 hw/s390x/ipl.c                 |   9 ++-
 hw/sparc/leon3.c               |   3 +-
 hw/sparc/sun4m.c               |   6 +-
 hw/sparc64/sun4u.c             |   4 +-
 hw/tricore/tricore_testboard.c |   2 +-
 hw/xtensa/sim.c                |  12 ++--
 hw/xtensa/xtfpga.c             |   2 +-
 include/elf.h                  |  10 +++
 include/hw/elf_ops.h           |  72 ++++++++++++++++++++
 include/hw/loader.h            |   9 ++-
 include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
 44 files changed, 469 insertions(+), 71 deletions(-)
 create mode 100644 include/hw/xen/start_info.h

-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
  2018-12-21 20:03 ` Liam Merwick
@ 2018-12-21 20:03   ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, ehabkost, rth, xen-devel, sgarzare, mst, stefanha,
	maran.wilson, george.kennedy, boris.ostrovsky, liam.merwick

This patch adds an optional function pointer, 'elf_note_fn', to
load_elf() which causes load_elf() to additionally parse any
ELF program headers of type PT_NOTE and check to see if the ELF
Note is of the type specified by the 'translate_opaque' arg.
If a matching ELF Note is found then the specfied function pointer
is called to process the ELF note.

Passing a NULL function pointer results in ELF Notes being skipped.

The first consumer of this functionality is the PVHboot support
which needs to read the XEN_ELFNOTE_PHYS32_ENTRY ELF Note while
loading the uncompressed kernel binary in order to discover the
boot entry address for the x86/HVM direct boot ABI.

Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
 hw/alpha/dp264.c               |  4 ++--
 hw/arm/armv7m.c                |  3 ++-
 hw/arm/boot.c                  |  2 +-
 hw/core/generic-loader.c       |  2 +-
 hw/core/loader.c               | 24 ++++++++++++++++--------
 hw/cris/boot.c                 |  3 ++-
 hw/hppa/machine.c              |  6 +++---
 hw/i386/multiboot.c            |  2 +-
 hw/lm32/lm32_boards.c          |  6 ++++--
 hw/lm32/milkymist.c            |  3 ++-
 hw/m68k/an5206.c               |  2 +-
 hw/m68k/mcf5208.c              |  2 +-
 hw/microblaze/boot.c           |  7 ++++---
 hw/mips/mips_fulong2e.c        |  5 +++--
 hw/mips/mips_malta.c           |  5 +++--
 hw/mips/mips_mipssim.c         |  5 +++--
 hw/mips/mips_r4k.c             |  5 +++--
 hw/moxie/moxiesim.c            |  2 +-
 hw/nios2/boot.c                |  7 ++++---
 hw/openrisc/openrisc_sim.c     |  2 +-
 hw/pci-host/prep.c             |  2 +-
 hw/ppc/e500.c                  |  3 ++-
 hw/ppc/mac_newworld.c          |  5 +++--
 hw/ppc/mac_oldworld.c          |  5 +++--
 hw/ppc/ppc440_bamboo.c         |  2 +-
 hw/ppc/sam460ex.c              |  3 ++-
 hw/ppc/spapr.c                 |  7 ++++---
 hw/ppc/virtex_ml507.c          |  2 +-
 hw/riscv/sifive_e.c            |  2 +-
 hw/riscv/sifive_u.c            |  2 +-
 hw/riscv/spike.c               |  2 +-
 hw/riscv/virt.c                |  2 +-
 hw/s390x/ipl.c                 |  9 ++++++---
 hw/sparc/leon3.c               |  3 ++-
 hw/sparc/sun4m.c               |  6 ++++--
 hw/sparc64/sun4u.c             |  4 ++--
 hw/tricore/tricore_testboard.c |  2 +-
 hw/xtensa/sim.c                | 12 ++++++++----
 hw/xtensa/xtfpga.c             |  2 +-
 include/hw/elf_ops.h           | 23 +++++++++++++++++++++++
 include/hw/loader.h            |  9 ++++++++-
 41 files changed, 134 insertions(+), 70 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index dd62f2a4050c..0347eb897c8a 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -114,7 +114,7 @@ static void clipper_init(MachineState *machine)
         error_report("no palcode provided");
         exit(1);
     }
-    size = load_elf(palcode_filename, cpu_alpha_superpage_to_phys,
+    size = load_elf(palcode_filename, NULL, cpu_alpha_superpage_to_phys,
                     NULL, &palcode_entry, &palcode_low, &palcode_high,
                     0, EM_ALPHA, 0, 0);
     if (size < 0) {
@@ -133,7 +133,7 @@ static void clipper_init(MachineState *machine)
     if (kernel_filename) {
         uint64_t param_offset;
 
-        size = load_elf(kernel_filename, cpu_alpha_superpage_to_phys,
+        size = load_elf(kernel_filename, NULL, cpu_alpha_superpage_to_phys,
                         NULL, &kernel_entry, &kernel_low, &kernel_high,
                         0, EM_ALPHA, 0, 0);
         if (size < 0) {
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 4bf9131b81e4..a4d528537eb4 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -298,7 +298,8 @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
     as = cpu_get_address_space(cs, asidx);
 
     if (kernel_filename) {
-        image_size = load_elf_as(kernel_filename, NULL, NULL, &entry, &lowaddr,
+        image_size = load_elf_as(kernel_filename, NULL, NULL, NULL,
+                                 &entry, &lowaddr,
                                  NULL, big_endian, EM_ARM, 1, 0, as);
         if (image_size < 0) {
             image_size = load_image_targphys_as(kernel_filename, 0,
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 94fce128028c..2b59379be6af 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -884,7 +884,7 @@ static int64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
         }
     }
 
-    ret = load_elf_as(info->kernel_filename, NULL, NULL,
+    ret = load_elf_as(info->kernel_filename, NULL, NULL, NULL,
                       pentry, lowaddr, highaddr, big_endian, elf_machine,
                       1, data_swab, as);
     if (ret <= 0) {
diff --git a/hw/core/generic-loader.c b/hw/core/generic-loader.c
index fbae05fb3b64..3695dd439cd0 100644
--- a/hw/core/generic-loader.c
+++ b/hw/core/generic-loader.c
@@ -136,7 +136,7 @@ static void generic_loader_realize(DeviceState *dev, Error **errp)
         AddressSpace *as = s->cpu ? s->cpu->as :  NULL;
 
         if (!s->force_raw) {
-            size = load_elf_as(s->file, NULL, NULL, &entry, NULL, NULL,
+            size = load_elf_as(s->file, NULL, NULL, NULL, &entry, NULL, NULL,
                                big_endian, 0, 0, 0, as);
 
             if (size < 0) {
diff --git a/hw/core/loader.c b/hw/core/loader.c
index fa41842280a0..eefa74c218a8 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -396,37 +396,42 @@ fail:
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
+int load_elf(const char *filename,
+             uint64_t (*elf_note_fn)(void *, void *, bool),
+             uint64_t (*translate_fn)(void *, uint64_t),
              void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
              uint64_t *highaddr, int big_endian, int elf_machine,
              int clear_lsb, int data_swab)
 {
-    return load_elf_as(filename, translate_fn, translate_opaque, pentry,
-                       lowaddr, highaddr, big_endian, elf_machine, clear_lsb,
-                       data_swab, NULL);
+    return load_elf_as(filename, elf_note_fn, translate_fn, translate_opaque,
+                       pentry, lowaddr, highaddr, big_endian, elf_machine,
+                       clear_lsb, data_swab, NULL);
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_as(const char *filename,
+                uint64_t (*elf_note_fn)(void *, void *, bool),
                 uint64_t (*translate_fn)(void *, uint64_t),
                 void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                 uint64_t *highaddr, int big_endian, int elf_machine,
                 int clear_lsb, int data_swab, AddressSpace *as)
 {
-    return load_elf_ram(filename, translate_fn, translate_opaque,
+    return load_elf_ram(filename, elf_note_fn, translate_fn, translate_opaque,
                         pentry, lowaddr, highaddr, big_endian, elf_machine,
                         clear_lsb, data_swab, as, true);
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_ram(const char *filename,
+                 uint64_t (*elf_note_fn)(void *, void *, bool),
                  uint64_t (*translate_fn)(void *, uint64_t),
                  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                  uint64_t *highaddr, int big_endian, int elf_machine,
                  int clear_lsb, int data_swab, AddressSpace *as,
                  bool load_rom)
 {
-    return load_elf_ram_sym(filename, translate_fn, translate_opaque,
+    return load_elf_ram_sym(filename, elf_note_fn,
+                            translate_fn, translate_opaque,
                             pentry, lowaddr, highaddr, big_endian,
                             elf_machine, clear_lsb, data_swab, as,
                             load_rom, NULL);
@@ -434,6 +439,7 @@ int load_elf_ram(const char *filename,
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_ram_sym(const char *filename,
+                     uint64_t (*elf_note_fn)(void *, void *, bool),
                      uint64_t (*translate_fn)(void *, uint64_t),
                      void *translate_opaque, uint64_t *pentry,
                      uint64_t *lowaddr, uint64_t *highaddr, int big_endian,
@@ -476,11 +482,13 @@ int load_elf_ram_sym(const char *filename,
 
     lseek(fd, 0, SEEK_SET);
     if (e_ident[EI_CLASS] == ELFCLASS64) {
-        ret = load_elf64(filename, fd, translate_fn, translate_opaque, must_swab,
+        ret = load_elf64(filename, fd, elf_note_fn,
+                         translate_fn, translate_opaque, must_swab,
                          pentry, lowaddr, highaddr, elf_machine, clear_lsb,
                          data_swab, as, load_rom, sym_cb);
     } else {
-        ret = load_elf32(filename, fd, translate_fn, translate_opaque, must_swab,
+        ret = load_elf32(filename, fd, elf_note_fn,
+                         translate_fn, translate_opaque, must_swab,
                          pentry, lowaddr, highaddr, elf_machine, clear_lsb,
                          data_swab, as, load_rom, sym_cb);
     }
diff --git a/hw/cris/boot.c b/hw/cris/boot.c
index f896ed7f8635..95cba2151b79 100644
--- a/hw/cris/boot.c
+++ b/hw/cris/boot.c
@@ -75,7 +75,8 @@ void cris_load_image(CRISCPU *cpu, struct cris_load_info *li)
     env->load_info = li;
     /* Boots a kernel elf binary, os/linux-2.6/vmlinux from the axis 
        devboard SDK.  */
-    image_size = load_elf(li->image_filename, translate_kernel_address, NULL,
+    image_size = load_elf(li->image_filename, NULL,
+                          translate_kernel_address, NULL,
                           &entry, NULL, &high, 0, EM_CRIS, 0, 0);
     li->entry = entry;
     if (image_size < 0) {
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index ac6dd7f6abdc..d1b1d3caa40d 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -135,8 +135,8 @@ static void machine_hppa_init(MachineState *machine)
         exit(1);
     }
 
-    size = load_elf(firmware_filename, NULL,
-                    NULL, &firmware_entry, &firmware_low, &firmware_high,
+    size = load_elf(firmware_filename, NULL, NULL, NULL,
+                    &firmware_entry, &firmware_low, &firmware_high,
                     true, EM_PARISC, 0, 0);
 
     /* Unfortunately, load_elf sign-extends reading elf32.  */
@@ -165,7 +165,7 @@ static void machine_hppa_init(MachineState *machine)
 
     /* Load kernel */
     if (kernel_filename) {
-        size = load_elf(kernel_filename, &cpu_hppa_to_phys,
+        size = load_elf(kernel_filename, NULL, &cpu_hppa_to_phys,
                         NULL, &kernel_entry, &kernel_low, &kernel_high,
                         true, EM_PARISC, 0, 0);
 
diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index 62340687e8ed..a3e33fbe5e18 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -199,7 +199,7 @@ int load_multiboot(FWCfgState *fw_cfg,
             exit(1);
         }
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                                &elf_low, &elf_high, 0, I386_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0) {
diff --git a/hw/lm32/lm32_boards.c b/hw/lm32/lm32_boards.c
index fd8eccca14d9..05157f8eab76 100644
--- a/hw/lm32/lm32_boards.c
+++ b/hw/lm32/lm32_boards.c
@@ -138,7 +138,8 @@ static void lm32_evr_init(MachineState *machine)
         uint64_t entry;
         int kernel_size;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
@@ -231,7 +232,8 @@ static void lm32_uclinux_init(MachineState *machine)
         uint64_t entry;
         int kernel_size;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
diff --git a/hw/lm32/milkymist.c b/hw/lm32/milkymist.c
index 63c6894c9559..7b0046b3e821 100644
--- a/hw/lm32/milkymist.c
+++ b/hw/lm32/milkymist.c
@@ -175,7 +175,8 @@ milkymist_init(MachineState *machine)
         uint64_t entry;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
diff --git a/hw/m68k/an5206.c b/hw/m68k/an5206.c
index 5e067ea1c356..06e380325885 100644
--- a/hw/m68k/an5206.c
+++ b/hw/m68k/an5206.c
@@ -66,7 +66,7 @@ static void an5206_init(MachineState *machine)
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                            NULL, NULL, 1, EM_68K, 0, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/m68k/mcf5208.c b/hw/m68k/mcf5208.c
index 0f2245dd8177..8531e07e5b57 100644
--- a/hw/m68k/mcf5208.c
+++ b/hw/m68k/mcf5208.c
@@ -294,7 +294,7 @@ static void mcf5208evb_init(MachineState *machine)
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                            NULL, NULL, 1, EM_68K, 0, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/microblaze/boot.c b/hw/microblaze/boot.c
index 35bfeda7aa71..54c646810aa5 100644
--- a/hw/microblaze/boot.c
+++ b/hw/microblaze/boot.c
@@ -142,13 +142,14 @@ void microblaze_load_kernel(MicroBlazeCPU *cpu, hwaddr ddr_base,
 #endif
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high,
                                big_endian, EM_MICROBLAZE, 0, 0);
         base32 = entry;
         if (base32 == 0xc0000000) {
-            kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                   NULL, &entry, NULL, NULL,
+            kernel_size = load_elf(kernel_filename, NULL,
+                                   translate_kernel_address, NULL,
+                                   &entry, NULL, NULL,
                                    big_endian, EM_MICROBLAZE, 0, 0);
         }
         /* Always boot into physical ram.  */
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index 2fbba32c4819..1f24a9fc2e13 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -111,8 +111,9 @@ static int64_t load_kernel (CPUMIPSState *env)
     uint32_t *prom_buf;
     long prom_size;
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&kernel_entry,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&kernel_entry,
                            (uint64_t *)&kernel_low, (uint64_t *)&kernel_high,
                            0, EM_MIPS, 1, 0);
     if (kernel_size < 0) {
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index c1cf0fe12e95..74667766c277 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1010,8 +1010,9 @@ static int64_t load_kernel (void)
     big_endian = 0;
 #endif
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&kernel_entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&kernel_entry, NULL,
                            (uint64_t *)&kernel_high, big_endian, EM_MIPS, 1, 0);
     if (kernel_size < 0) {
         error_report("could not load kernel '%s': %s",
diff --git a/hw/mips/mips_mipssim.c b/hw/mips/mips_mipssim.c
index f665752a2fc6..824abda65748 100644
--- a/hw/mips/mips_mipssim.c
+++ b/hw/mips/mips_mipssim.c
@@ -69,8 +69,9 @@ static int64_t load_kernel(void)
     big_endian = 0;
 #endif
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            EM_MIPS, 1, 0);
     if (kernel_size >= 0) {
diff --git a/hw/mips/mips_r4k.c b/hw/mips/mips_r4k.c
index 3e852e98cf9c..29eae06e9ad1 100644
--- a/hw/mips/mips_r4k.c
+++ b/hw/mips/mips_r4k.c
@@ -92,8 +92,9 @@ static int64_t load_kernel(void)
 #else
     big_endian = 0;
 #endif
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            EM_MIPS, 1, 0);
     if (kernel_size >= 0) {
diff --git a/hw/moxie/moxiesim.c b/hw/moxie/moxiesim.c
index 4b0ce09c5ee5..db11c00677de 100644
--- a/hw/moxie/moxiesim.c
+++ b/hw/moxie/moxiesim.c
@@ -58,7 +58,7 @@ static void load_kernel(MoxieCPU *cpu, LoaderParams *loader_params)
     long kernel_size;
     ram_addr_t initrd_offset;
 
-    kernel_size = load_elf(loader_params->kernel_filename,  NULL, NULL,
+    kernel_size = load_elf(loader_params->kernel_filename,  NULL, NULL, NULL,
                            &entry, &kernel_low, &kernel_high, 1, EM_MOXIE,
                            0, 0);
 
diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
index 4bb5b601d3af..bc6a68cfa60a 100644
--- a/hw/nios2/boot.c
+++ b/hw/nios2/boot.c
@@ -146,13 +146,14 @@ void nios2_load_kernel(Nios2CPU *cpu, hwaddr ddr_base,
 #endif
 
         /* Boots a kernel elf binary. */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high,
                                big_endian, EM_ALTERA_NIOS2, 0, 0);
         base32 = entry;
         if (base32 == 0xc0000000) {
-            kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                   NULL, &entry, NULL, NULL,
+            kernel_size = load_elf(kernel_filename, NULL,
+                                   translate_kernel_address, NULL,
+                                   &entry, NULL, NULL,
                                    big_endian, EM_ALTERA_NIOS2, 0, 0);
         }
 
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index a495a84a41d9..7d3b734d24fb 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -96,7 +96,7 @@ static void openrisc_load_kernel(ram_addr_t ram_size,
     hwaddr entry;
 
     if (kernel_filename && !qtest_enabled()) {
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &elf_entry, NULL, NULL, 1, EM_OPENRISC,
                                1, 0);
         entry = elf_entry;
diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index b1b6b16badb3..8b9e1fd0d343 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -331,7 +331,7 @@ static void raven_realize(PCIDevice *d, Error **errp)
         filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, s->bios_name);
         if (filename) {
             if (s->elf_machine != EM_NONE) {
-                bios_size = load_elf(filename, NULL, NULL, NULL,
+                bios_size = load_elf(filename, NULL, NULL, NULL, NULL,
                                      NULL, NULL, 1, s->elf_machine, 0, 0);
             }
             if (bios_size < 0) {
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index e6747fce282a..28c77b693f59 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -990,7 +990,8 @@ void ppce500_init(MachineState *machine)
 
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, payload_name);
 
-    payload_size = load_elf(filename, NULL, NULL, &bios_entry, &loadaddr, NULL,
+    payload_size = load_elf(filename, NULL, NULL, NULL,
+                            &bios_entry, &loadaddr, NULL,
                             1, PPC_ELF_MACHINE, 0, 0);
     if (payload_size < 0) {
         /*
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 7e45afae7c55..f5a68be6319f 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -165,7 +165,7 @@ static void ppc_core99_init(MachineState *machine)
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, NULL, NULL, NULL,
                              NULL, NULL, 1, PPC_ELF_MACHINE, 0, 0);
 
         g_free(filename);
@@ -188,7 +188,8 @@ static void ppc_core99_init(MachineState *machine)
 #endif
         kernel_base = KERNEL_LOAD_ADDR;
 
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0)
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 817f70e52cf3..c28dde1992f4 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -140,7 +140,7 @@ static void ppc_heathrow_init(MachineState *machine)
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, 0, NULL, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, 0, NULL, NULL, NULL, NULL,
                              1, PPC_ELF_MACHINE, 0, 0);
         g_free(filename);
     } else {
@@ -161,7 +161,8 @@ static void ppc_heathrow_init(MachineState *machine)
         bswap_needed = 0;
 #endif
         kernel_base = KERNEL_LOAD_ADDR;
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0)
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index f5720f979e42..9bb71fbdcd4f 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -257,7 +257,7 @@ static void bamboo_init(MachineState *machine)
         success = load_uimage(kernel_filename, &entry, &loadaddr, NULL,
                               NULL, NULL);
         if (success < 0) {
-            success = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+            success = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                                &elf_lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
             entry = elf_entry;
diff --git a/hw/ppc/sam460ex.c b/hw/ppc/sam460ex.c
index 5aac58f36ee1..4dfd47766aa6 100644
--- a/hw/ppc/sam460ex.c
+++ b/hw/ppc/sam460ex.c
@@ -557,7 +557,8 @@ static void sam460ex_init(MachineState *machine)
         if (success < 0) {
             uint64_t elf_entry, elf_lowaddr;
 
-            success = load_elf(machine->kernel_filename, NULL, NULL, &elf_entry,
+            success = load_elf(machine->kernel_filename, NULL,
+                               NULL, NULL, &elf_entry,
                                &elf_lowaddr, NULL, 1, PPC_ELF_MACHINE, 0, 0);
             entry = elf_entry;
             loadaddr = elf_lowaddr;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 55be0f56cbe2..6b4f1da197b1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2777,11 +2777,12 @@ static void spapr_machine_init(MachineState *machine)
     if (kernel_filename) {
         uint64_t lowaddr = 0;
 
-        spapr->kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                      NULL, NULL, &lowaddr, NULL, 1,
+        spapr->kernel_size = load_elf(kernel_filename, NULL,
+                                      translate_kernel_address, NULL,
+                                      NULL, &lowaddr, NULL, 1,
                                       PPC_ELF_MACHINE, 0, 0);
         if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
-            spapr->kernel_size = load_elf(kernel_filename,
+            spapr->kernel_size = load_elf(kernel_filename, NULL,
                                           translate_kernel_address, NULL, NULL,
                                           &lowaddr, NULL, 0, PPC_ELF_MACHINE,
                                           0, 0);
diff --git a/hw/ppc/virtex_ml507.c b/hw/ppc/virtex_ml507.c
index ee9b4b449086..9b383dc3d551 100644
--- a/hw/ppc/virtex_ml507.c
+++ b/hw/ppc/virtex_ml507.c
@@ -258,7 +258,7 @@ static void virtex_init(MachineState *machine)
         hwaddr boot_offset;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high, 1, PPC_ELF_MACHINE,
                                0, 0);
         boot_info.bootstrap_pc = entry & 0x00ffffff;
diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index cb513cc3bb50..242773232e22 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -78,7 +78,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index ef07df244241..21fbb732a74f 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -69,7 +69,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 268df04c3c7d..c66ffc50cc74 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -57,7 +57,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf_ram_sym(kernel_filename, NULL, NULL,
+    if (load_elf_ram_sym(kernel_filename, NULL, NULL, NULL,
             &kernel_entry, NULL, &kernel_high, 0, EM_RISCV, 1, 0,
             NULL, true, htif_symbol_callback) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 2b38f890702c..dcfbb99e4a16 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -61,7 +61,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 21f64ad26aae..896888bf8f00 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -131,7 +131,8 @@ static void s390_ipl_realize(DeviceState *dev, Error **errp)
             goto error;
         }
 
-        bios_size = load_elf(bios_filename, bios_translate_addr, &fwbase,
+        bios_size = load_elf(bios_filename, NULL,
+                             bios_translate_addr, &fwbase,
                              &ipl->bios_start_addr, NULL, NULL, 1,
                              EM_S390, 0, 0);
         if (bios_size > 0) {
@@ -155,7 +156,8 @@ static void s390_ipl_realize(DeviceState *dev, Error **errp)
     }
 
     if (ipl->kernel) {
-        kernel_size = load_elf(ipl->kernel, NULL, NULL, &pentry, NULL,
+        kernel_size = load_elf(ipl->kernel, NULL, NULL, NULL,
+                               &pentry, NULL,
                                NULL, 1, EM_S390, 0, 0);
         if (kernel_size < 0) {
             kernel_size = load_image_targphys(ipl->kernel, 0, ram_size);
@@ -436,7 +438,8 @@ static int load_netboot_image(Error **errp)
         goto unref_mr;
     }
 
-    img_size = load_elf_ram(netboot_filename, NULL, NULL, &ipl->start_addr,
+    img_size = load_elf_ram(netboot_filename, NULL, NULL, NULL,
+                            &ipl->start_addr,
                             NULL, NULL, 1, EM_S390, 0, 0, NULL, false);
 
     if (img_size < 0) {
diff --git a/hw/sparc/leon3.c b/hw/sparc/leon3.c
index fa98ab81776c..774639af3393 100644
--- a/hw/sparc/leon3.c
+++ b/hw/sparc/leon3.c
@@ -190,7 +190,8 @@ static void leon3_generic_hw_init(MachineState *machine)
         long     kernel_size;
         uint64_t entry;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1 /* big endian */, EM_SPARC, 0, 0);
         if (kernel_size < 0) {
             error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 639906cca30c..c6a10cc8e1a2 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -243,7 +243,8 @@ static unsigned long sun4m_load_kernel(const char *kernel_filename,
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, NULL, NULL, 1, EM_SPARC, 0, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, KERNEL_LOAD_ADDR,
@@ -693,7 +694,8 @@ static void prom_init(hwaddr addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr, NULL,
+        ret = load_elf(filename, NULL,
+                       translate_prom_address, &addr, NULL,
                        NULL, NULL, 1, EM_SPARC, 0, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index f76b19e4e93b..b9bd4be5d512 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -153,7 +153,7 @@ static uint64_t sun4u_load_kernel(const char *kernel_filename,
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, NULL, NULL, kernel_entry,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, kernel_entry,
                                kernel_addr, &kernel_top, 1, EM_SPARCV9, 0, 0);
         if (kernel_size < 0) {
             *kernel_addr = KERNEL_LOAD_ADDR;
@@ -411,7 +411,7 @@ static void prom_init(hwaddr addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr,
+        ret = load_elf(filename, NULL, translate_prom_address, &addr,
                        NULL, NULL, NULL, 1, EM_SPARCV9, 0, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/tricore/tricore_testboard.c b/hw/tricore/tricore_testboard.c
index a58096f05e72..003592af27a6 100644
--- a/hw/tricore/tricore_testboard.c
+++ b/hw/tricore/tricore_testboard.c
@@ -45,7 +45,7 @@ static void tricore_load_kernel(CPUTriCoreState *env)
     long kernel_size;
 
     kernel_size = load_elf(tricoretb_binfo.kernel_filename, NULL,
-                           NULL, &entry, NULL,
+                           NULL, NULL, &entry, NULL,
                            NULL, 0,
                            EM_TRICORE, 1, 0);
     if (kernel_size <= 0) {
diff --git a/hw/xtensa/sim.c b/hw/xtensa/sim.c
index b6ccb3cd4ae9..12c7437398e8 100644
--- a/hw/xtensa/sim.c
+++ b/hw/xtensa/sim.c
@@ -97,11 +97,15 @@ static void xtensa_sim_init(MachineState *machine)
         uint64_t elf_entry;
         uint64_t elf_lowaddr;
 #ifdef TARGET_WORDS_BIGENDIAN
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
-                &elf_entry, &elf_lowaddr, NULL, 1, EM_XTENSA, 0, 0);
+        int success = load_elf(kernel_filename, NULL,
+                               translate_phys_addr, cpu,
+                               &elf_entry, &elf_lowaddr,
+                               NULL, 1, EM_XTENSA, 0, 0);
 #else
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
-                &elf_entry, &elf_lowaddr, NULL, 0, EM_XTENSA, 0, 0);
+        int success = load_elf(kernel_filename, NULL,
+                               translate_phys_addr, cpu,
+                               &elf_entry, &elf_lowaddr,
+                               NULL, 0, EM_XTENSA, 0, 0);
 #endif
         if (success > 0) {
             env->pc = elf_entry;
diff --git a/hw/xtensa/xtfpga.c b/hw/xtensa/xtfpga.c
index 21094319a659..cec5bda3fc73 100644
--- a/hw/xtensa/xtfpga.c
+++ b/hw/xtensa/xtfpga.c
@@ -378,7 +378,7 @@ static void xtfpga_init(const XtfpgaBoardDesc *board, MachineState *machine)
 
         uint64_t elf_entry;
         uint64_t elf_lowaddr;
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
+        int success = load_elf(kernel_filename, NULL, translate_phys_addr, cpu,
                 &elf_entry, &elf_lowaddr, NULL, be, EM_XTENSA, 0, 0);
         if (success > 0) {
             entry_point = elf_entry;
diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 74679ff8da3a..37d20a3800c1 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -266,6 +266,7 @@ fail:
 }
 
 static int glue(load_elf, SZ)(const char *name, int fd,
+                              uint64_t (*elf_note_fn)(void *, void *, bool),
                               uint64_t (*translate_fn)(void *, uint64_t),
                               void *translate_opaque,
                               int must_swab, uint64_t *pentry,
@@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                 high = addr + mem_size;
 
             data = NULL;
+
+        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
+            struct elf_note *nhdr = NULL;
+
+            file_size = ph->p_filesz; /* Size of the range of ELF notes */
+            data = g_malloc0(file_size);
+            if (ph->p_filesz > 0) {
+                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
+                    goto fail;
+                }
+                if (read(fd, data, file_size) != file_size) {
+                    goto fail;
+                }
+            }
+
+            if (nhdr != NULL) {
+                bool is64 =
+                    sizeof(struct elf_note) == sizeof(struct elf64_note);
+                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);
+            }
+            g_free(data);
         }
     }
+
     g_free(phdr);
     if (lowaddr)
         *lowaddr = (uint64_t)(elf_sword)low;
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 0a0ad808ea39..130e73c32b21 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -93,6 +93,8 @@ const char *load_elf_strerror(int error);
 
 /** load_elf_ram_sym:
  * @filename: Path of ELF file
+ * @elf_note_fn: optional function to parse ELF Note type
+ *               passed via @translate_opaque
  * @translate_fn: optional function to translate load addresses
  * @translate_opaque: opaque data passed to @translate_fn
  * @pentry: Populated with program entry point. Ignored if NULL.
@@ -125,6 +127,7 @@ typedef void (*symbol_fn_t)(const char *st_name, int st_info,
                             uint64_t st_value, uint64_t st_size);
 
 int load_elf_ram_sym(const char *filename,
+                     uint64_t (*elf_note_fn)(void *, void *, bool),
                      uint64_t (*translate_fn)(void *, uint64_t),
                      void *translate_opaque, uint64_t *pentry,
                      uint64_t *lowaddr, uint64_t *highaddr, int big_endian,
@@ -136,6 +139,7 @@ int load_elf_ram_sym(const char *filename,
  * symbol callback function
  */
 int load_elf_ram(const char *filename,
+                 uint64_t (*elf_note_fn)(void *, void *, bool),
                  uint64_t (*translate_fn)(void *, uint64_t),
                  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                  uint64_t *highaddr, int big_endian, int elf_machine,
@@ -146,6 +150,7 @@ int load_elf_ram(const char *filename,
  * Same as load_elf_ram(), but always loads the elf as ROM
  */
 int load_elf_as(const char *filename,
+                uint64_t (*elf_note_fn)(void *, void *, bool),
                 uint64_t (*translate_fn)(void *, uint64_t),
                 void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                 uint64_t *highaddr, int big_endian, int elf_machine,
@@ -155,7 +160,9 @@ int load_elf_as(const char *filename,
  * Same as load_elf_as(), but doesn't allow the caller to specify an
  * AddressSpace.
  */
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
+int load_elf(const char *filename,
+             uint64_t (*elf_note_fn)(void *, void *, bool),
+             uint64_t (*translate_fn)(void *, uint64_t),
              void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
              uint64_t *highaddr, int big_endian, int elf_machine,
              int clear_lsb, int data_swab);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
@ 2018-12-21 20:03   ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: liam.merwick, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

This patch adds an optional function pointer, 'elf_note_fn', to
load_elf() which causes load_elf() to additionally parse any
ELF program headers of type PT_NOTE and check to see if the ELF
Note is of the type specified by the 'translate_opaque' arg.
If a matching ELF Note is found then the specfied function pointer
is called to process the ELF note.

Passing a NULL function pointer results in ELF Notes being skipped.

The first consumer of this functionality is the PVHboot support
which needs to read the XEN_ELFNOTE_PHYS32_ENTRY ELF Note while
loading the uncompressed kernel binary in order to discover the
boot entry address for the x86/HVM direct boot ABI.

Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
 hw/alpha/dp264.c               |  4 ++--
 hw/arm/armv7m.c                |  3 ++-
 hw/arm/boot.c                  |  2 +-
 hw/core/generic-loader.c       |  2 +-
 hw/core/loader.c               | 24 ++++++++++++++++--------
 hw/cris/boot.c                 |  3 ++-
 hw/hppa/machine.c              |  6 +++---
 hw/i386/multiboot.c            |  2 +-
 hw/lm32/lm32_boards.c          |  6 ++++--
 hw/lm32/milkymist.c            |  3 ++-
 hw/m68k/an5206.c               |  2 +-
 hw/m68k/mcf5208.c              |  2 +-
 hw/microblaze/boot.c           |  7 ++++---
 hw/mips/mips_fulong2e.c        |  5 +++--
 hw/mips/mips_malta.c           |  5 +++--
 hw/mips/mips_mipssim.c         |  5 +++--
 hw/mips/mips_r4k.c             |  5 +++--
 hw/moxie/moxiesim.c            |  2 +-
 hw/nios2/boot.c                |  7 ++++---
 hw/openrisc/openrisc_sim.c     |  2 +-
 hw/pci-host/prep.c             |  2 +-
 hw/ppc/e500.c                  |  3 ++-
 hw/ppc/mac_newworld.c          |  5 +++--
 hw/ppc/mac_oldworld.c          |  5 +++--
 hw/ppc/ppc440_bamboo.c         |  2 +-
 hw/ppc/sam460ex.c              |  3 ++-
 hw/ppc/spapr.c                 |  7 ++++---
 hw/ppc/virtex_ml507.c          |  2 +-
 hw/riscv/sifive_e.c            |  2 +-
 hw/riscv/sifive_u.c            |  2 +-
 hw/riscv/spike.c               |  2 +-
 hw/riscv/virt.c                |  2 +-
 hw/s390x/ipl.c                 |  9 ++++++---
 hw/sparc/leon3.c               |  3 ++-
 hw/sparc/sun4m.c               |  6 ++++--
 hw/sparc64/sun4u.c             |  4 ++--
 hw/tricore/tricore_testboard.c |  2 +-
 hw/xtensa/sim.c                | 12 ++++++++----
 hw/xtensa/xtfpga.c             |  2 +-
 include/hw/elf_ops.h           | 23 +++++++++++++++++++++++
 include/hw/loader.h            |  9 ++++++++-
 41 files changed, 134 insertions(+), 70 deletions(-)

diff --git a/hw/alpha/dp264.c b/hw/alpha/dp264.c
index dd62f2a4050c..0347eb897c8a 100644
--- a/hw/alpha/dp264.c
+++ b/hw/alpha/dp264.c
@@ -114,7 +114,7 @@ static void clipper_init(MachineState *machine)
         error_report("no palcode provided");
         exit(1);
     }
-    size = load_elf(palcode_filename, cpu_alpha_superpage_to_phys,
+    size = load_elf(palcode_filename, NULL, cpu_alpha_superpage_to_phys,
                     NULL, &palcode_entry, &palcode_low, &palcode_high,
                     0, EM_ALPHA, 0, 0);
     if (size < 0) {
@@ -133,7 +133,7 @@ static void clipper_init(MachineState *machine)
     if (kernel_filename) {
         uint64_t param_offset;
 
-        size = load_elf(kernel_filename, cpu_alpha_superpage_to_phys,
+        size = load_elf(kernel_filename, NULL, cpu_alpha_superpage_to_phys,
                         NULL, &kernel_entry, &kernel_low, &kernel_high,
                         0, EM_ALPHA, 0, 0);
         if (size < 0) {
diff --git a/hw/arm/armv7m.c b/hw/arm/armv7m.c
index 4bf9131b81e4..a4d528537eb4 100644
--- a/hw/arm/armv7m.c
+++ b/hw/arm/armv7m.c
@@ -298,7 +298,8 @@ void armv7m_load_kernel(ARMCPU *cpu, const char *kernel_filename, int mem_size)
     as = cpu_get_address_space(cs, asidx);
 
     if (kernel_filename) {
-        image_size = load_elf_as(kernel_filename, NULL, NULL, &entry, &lowaddr,
+        image_size = load_elf_as(kernel_filename, NULL, NULL, NULL,
+                                 &entry, &lowaddr,
                                  NULL, big_endian, EM_ARM, 1, 0, as);
         if (image_size < 0) {
             image_size = load_image_targphys_as(kernel_filename, 0,
diff --git a/hw/arm/boot.c b/hw/arm/boot.c
index 94fce128028c..2b59379be6af 100644
--- a/hw/arm/boot.c
+++ b/hw/arm/boot.c
@@ -884,7 +884,7 @@ static int64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
         }
     }
 
-    ret = load_elf_as(info->kernel_filename, NULL, NULL,
+    ret = load_elf_as(info->kernel_filename, NULL, NULL, NULL,
                       pentry, lowaddr, highaddr, big_endian, elf_machine,
                       1, data_swab, as);
     if (ret <= 0) {
diff --git a/hw/core/generic-loader.c b/hw/core/generic-loader.c
index fbae05fb3b64..3695dd439cd0 100644
--- a/hw/core/generic-loader.c
+++ b/hw/core/generic-loader.c
@@ -136,7 +136,7 @@ static void generic_loader_realize(DeviceState *dev, Error **errp)
         AddressSpace *as = s->cpu ? s->cpu->as :  NULL;
 
         if (!s->force_raw) {
-            size = load_elf_as(s->file, NULL, NULL, &entry, NULL, NULL,
+            size = load_elf_as(s->file, NULL, NULL, NULL, &entry, NULL, NULL,
                                big_endian, 0, 0, 0, as);
 
             if (size < 0) {
diff --git a/hw/core/loader.c b/hw/core/loader.c
index fa41842280a0..eefa74c218a8 100644
--- a/hw/core/loader.c
+++ b/hw/core/loader.c
@@ -396,37 +396,42 @@ fail:
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
+int load_elf(const char *filename,
+             uint64_t (*elf_note_fn)(void *, void *, bool),
+             uint64_t (*translate_fn)(void *, uint64_t),
              void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
              uint64_t *highaddr, int big_endian, int elf_machine,
              int clear_lsb, int data_swab)
 {
-    return load_elf_as(filename, translate_fn, translate_opaque, pentry,
-                       lowaddr, highaddr, big_endian, elf_machine, clear_lsb,
-                       data_swab, NULL);
+    return load_elf_as(filename, elf_note_fn, translate_fn, translate_opaque,
+                       pentry, lowaddr, highaddr, big_endian, elf_machine,
+                       clear_lsb, data_swab, NULL);
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_as(const char *filename,
+                uint64_t (*elf_note_fn)(void *, void *, bool),
                 uint64_t (*translate_fn)(void *, uint64_t),
                 void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                 uint64_t *highaddr, int big_endian, int elf_machine,
                 int clear_lsb, int data_swab, AddressSpace *as)
 {
-    return load_elf_ram(filename, translate_fn, translate_opaque,
+    return load_elf_ram(filename, elf_note_fn, translate_fn, translate_opaque,
                         pentry, lowaddr, highaddr, big_endian, elf_machine,
                         clear_lsb, data_swab, as, true);
 }
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_ram(const char *filename,
+                 uint64_t (*elf_note_fn)(void *, void *, bool),
                  uint64_t (*translate_fn)(void *, uint64_t),
                  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                  uint64_t *highaddr, int big_endian, int elf_machine,
                  int clear_lsb, int data_swab, AddressSpace *as,
                  bool load_rom)
 {
-    return load_elf_ram_sym(filename, translate_fn, translate_opaque,
+    return load_elf_ram_sym(filename, elf_note_fn,
+                            translate_fn, translate_opaque,
                             pentry, lowaddr, highaddr, big_endian,
                             elf_machine, clear_lsb, data_swab, as,
                             load_rom, NULL);
@@ -434,6 +439,7 @@ int load_elf_ram(const char *filename,
 
 /* return < 0 if error, otherwise the number of bytes loaded in memory */
 int load_elf_ram_sym(const char *filename,
+                     uint64_t (*elf_note_fn)(void *, void *, bool),
                      uint64_t (*translate_fn)(void *, uint64_t),
                      void *translate_opaque, uint64_t *pentry,
                      uint64_t *lowaddr, uint64_t *highaddr, int big_endian,
@@ -476,11 +482,13 @@ int load_elf_ram_sym(const char *filename,
 
     lseek(fd, 0, SEEK_SET);
     if (e_ident[EI_CLASS] == ELFCLASS64) {
-        ret = load_elf64(filename, fd, translate_fn, translate_opaque, must_swab,
+        ret = load_elf64(filename, fd, elf_note_fn,
+                         translate_fn, translate_opaque, must_swab,
                          pentry, lowaddr, highaddr, elf_machine, clear_lsb,
                          data_swab, as, load_rom, sym_cb);
     } else {
-        ret = load_elf32(filename, fd, translate_fn, translate_opaque, must_swab,
+        ret = load_elf32(filename, fd, elf_note_fn,
+                         translate_fn, translate_opaque, must_swab,
                          pentry, lowaddr, highaddr, elf_machine, clear_lsb,
                          data_swab, as, load_rom, sym_cb);
     }
diff --git a/hw/cris/boot.c b/hw/cris/boot.c
index f896ed7f8635..95cba2151b79 100644
--- a/hw/cris/boot.c
+++ b/hw/cris/boot.c
@@ -75,7 +75,8 @@ void cris_load_image(CRISCPU *cpu, struct cris_load_info *li)
     env->load_info = li;
     /* Boots a kernel elf binary, os/linux-2.6/vmlinux from the axis 
        devboard SDK.  */
-    image_size = load_elf(li->image_filename, translate_kernel_address, NULL,
+    image_size = load_elf(li->image_filename, NULL,
+                          translate_kernel_address, NULL,
                           &entry, NULL, &high, 0, EM_CRIS, 0, 0);
     li->entry = entry;
     if (image_size < 0) {
diff --git a/hw/hppa/machine.c b/hw/hppa/machine.c
index ac6dd7f6abdc..d1b1d3caa40d 100644
--- a/hw/hppa/machine.c
+++ b/hw/hppa/machine.c
@@ -135,8 +135,8 @@ static void machine_hppa_init(MachineState *machine)
         exit(1);
     }
 
-    size = load_elf(firmware_filename, NULL,
-                    NULL, &firmware_entry, &firmware_low, &firmware_high,
+    size = load_elf(firmware_filename, NULL, NULL, NULL,
+                    &firmware_entry, &firmware_low, &firmware_high,
                     true, EM_PARISC, 0, 0);
 
     /* Unfortunately, load_elf sign-extends reading elf32.  */
@@ -165,7 +165,7 @@ static void machine_hppa_init(MachineState *machine)
 
     /* Load kernel */
     if (kernel_filename) {
-        size = load_elf(kernel_filename, &cpu_hppa_to_phys,
+        size = load_elf(kernel_filename, NULL, &cpu_hppa_to_phys,
                         NULL, &kernel_entry, &kernel_low, &kernel_high,
                         true, EM_PARISC, 0, 0);
 
diff --git a/hw/i386/multiboot.c b/hw/i386/multiboot.c
index 62340687e8ed..a3e33fbe5e18 100644
--- a/hw/i386/multiboot.c
+++ b/hw/i386/multiboot.c
@@ -199,7 +199,7 @@ int load_multiboot(FWCfgState *fw_cfg,
             exit(1);
         }
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                                &elf_low, &elf_high, 0, I386_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0) {
diff --git a/hw/lm32/lm32_boards.c b/hw/lm32/lm32_boards.c
index fd8eccca14d9..05157f8eab76 100644
--- a/hw/lm32/lm32_boards.c
+++ b/hw/lm32/lm32_boards.c
@@ -138,7 +138,8 @@ static void lm32_evr_init(MachineState *machine)
         uint64_t entry;
         int kernel_size;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
@@ -231,7 +232,8 @@ static void lm32_uclinux_init(MachineState *machine)
         uint64_t entry;
         int kernel_size;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
diff --git a/hw/lm32/milkymist.c b/hw/lm32/milkymist.c
index 63c6894c9559..7b0046b3e821 100644
--- a/hw/lm32/milkymist.c
+++ b/hw/lm32/milkymist.c
@@ -175,7 +175,8 @@ milkymist_init(MachineState *machine)
         uint64_t entry;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1, EM_LATTICEMICO32, 0, 0);
         reset_info->bootstrap_pc = entry;
 
diff --git a/hw/m68k/an5206.c b/hw/m68k/an5206.c
index 5e067ea1c356..06e380325885 100644
--- a/hw/m68k/an5206.c
+++ b/hw/m68k/an5206.c
@@ -66,7 +66,7 @@ static void an5206_init(MachineState *machine)
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                            NULL, NULL, 1, EM_68K, 0, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/m68k/mcf5208.c b/hw/m68k/mcf5208.c
index 0f2245dd8177..8531e07e5b57 100644
--- a/hw/m68k/mcf5208.c
+++ b/hw/m68k/mcf5208.c
@@ -294,7 +294,7 @@ static void mcf5208evb_init(MachineState *machine)
         exit(1);
     }
 
-    kernel_size = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+    kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                            NULL, NULL, 1, EM_68K, 0, 0);
     entry = elf_entry;
     if (kernel_size < 0) {
diff --git a/hw/microblaze/boot.c b/hw/microblaze/boot.c
index 35bfeda7aa71..54c646810aa5 100644
--- a/hw/microblaze/boot.c
+++ b/hw/microblaze/boot.c
@@ -142,13 +142,14 @@ void microblaze_load_kernel(MicroBlazeCPU *cpu, hwaddr ddr_base,
 #endif
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high,
                                big_endian, EM_MICROBLAZE, 0, 0);
         base32 = entry;
         if (base32 == 0xc0000000) {
-            kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                   NULL, &entry, NULL, NULL,
+            kernel_size = load_elf(kernel_filename, NULL,
+                                   translate_kernel_address, NULL,
+                                   &entry, NULL, NULL,
                                    big_endian, EM_MICROBLAZE, 0, 0);
         }
         /* Always boot into physical ram.  */
diff --git a/hw/mips/mips_fulong2e.c b/hw/mips/mips_fulong2e.c
index 2fbba32c4819..1f24a9fc2e13 100644
--- a/hw/mips/mips_fulong2e.c
+++ b/hw/mips/mips_fulong2e.c
@@ -111,8 +111,9 @@ static int64_t load_kernel (CPUMIPSState *env)
     uint32_t *prom_buf;
     long prom_size;
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&kernel_entry,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&kernel_entry,
                            (uint64_t *)&kernel_low, (uint64_t *)&kernel_high,
                            0, EM_MIPS, 1, 0);
     if (kernel_size < 0) {
diff --git a/hw/mips/mips_malta.c b/hw/mips/mips_malta.c
index c1cf0fe12e95..74667766c277 100644
--- a/hw/mips/mips_malta.c
+++ b/hw/mips/mips_malta.c
@@ -1010,8 +1010,9 @@ static int64_t load_kernel (void)
     big_endian = 0;
 #endif
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&kernel_entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&kernel_entry, NULL,
                            (uint64_t *)&kernel_high, big_endian, EM_MIPS, 1, 0);
     if (kernel_size < 0) {
         error_report("could not load kernel '%s': %s",
diff --git a/hw/mips/mips_mipssim.c b/hw/mips/mips_mipssim.c
index f665752a2fc6..824abda65748 100644
--- a/hw/mips/mips_mipssim.c
+++ b/hw/mips/mips_mipssim.c
@@ -69,8 +69,9 @@ static int64_t load_kernel(void)
     big_endian = 0;
 #endif
 
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            EM_MIPS, 1, 0);
     if (kernel_size >= 0) {
diff --git a/hw/mips/mips_r4k.c b/hw/mips/mips_r4k.c
index 3e852e98cf9c..29eae06e9ad1 100644
--- a/hw/mips/mips_r4k.c
+++ b/hw/mips/mips_r4k.c
@@ -92,8 +92,9 @@ static int64_t load_kernel(void)
 #else
     big_endian = 0;
 #endif
-    kernel_size = load_elf(loaderparams.kernel_filename, cpu_mips_kseg0_to_phys,
-                           NULL, (uint64_t *)&entry, NULL,
+    kernel_size = load_elf(loaderparams.kernel_filename, NULL,
+                           cpu_mips_kseg0_to_phys, NULL,
+                           (uint64_t *)&entry, NULL,
                            (uint64_t *)&kernel_high, big_endian,
                            EM_MIPS, 1, 0);
     if (kernel_size >= 0) {
diff --git a/hw/moxie/moxiesim.c b/hw/moxie/moxiesim.c
index 4b0ce09c5ee5..db11c00677de 100644
--- a/hw/moxie/moxiesim.c
+++ b/hw/moxie/moxiesim.c
@@ -58,7 +58,7 @@ static void load_kernel(MoxieCPU *cpu, LoaderParams *loader_params)
     long kernel_size;
     ram_addr_t initrd_offset;
 
-    kernel_size = load_elf(loader_params->kernel_filename,  NULL, NULL,
+    kernel_size = load_elf(loader_params->kernel_filename,  NULL, NULL, NULL,
                            &entry, &kernel_low, &kernel_high, 1, EM_MOXIE,
                            0, 0);
 
diff --git a/hw/nios2/boot.c b/hw/nios2/boot.c
index 4bb5b601d3af..bc6a68cfa60a 100644
--- a/hw/nios2/boot.c
+++ b/hw/nios2/boot.c
@@ -146,13 +146,14 @@ void nios2_load_kernel(Nios2CPU *cpu, hwaddr ddr_base,
 #endif
 
         /* Boots a kernel elf binary. */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high,
                                big_endian, EM_ALTERA_NIOS2, 0, 0);
         base32 = entry;
         if (base32 == 0xc0000000) {
-            kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                   NULL, &entry, NULL, NULL,
+            kernel_size = load_elf(kernel_filename, NULL,
+                                   translate_kernel_address, NULL,
+                                   &entry, NULL, NULL,
                                    big_endian, EM_ALTERA_NIOS2, 0, 0);
         }
 
diff --git a/hw/openrisc/openrisc_sim.c b/hw/openrisc/openrisc_sim.c
index a495a84a41d9..7d3b734d24fb 100644
--- a/hw/openrisc/openrisc_sim.c
+++ b/hw/openrisc/openrisc_sim.c
@@ -96,7 +96,7 @@ static void openrisc_load_kernel(ram_addr_t ram_size,
     hwaddr entry;
 
     if (kernel_filename && !qtest_enabled()) {
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &elf_entry, NULL, NULL, 1, EM_OPENRISC,
                                1, 0);
         entry = elf_entry;
diff --git a/hw/pci-host/prep.c b/hw/pci-host/prep.c
index b1b6b16badb3..8b9e1fd0d343 100644
--- a/hw/pci-host/prep.c
+++ b/hw/pci-host/prep.c
@@ -331,7 +331,7 @@ static void raven_realize(PCIDevice *d, Error **errp)
         filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, s->bios_name);
         if (filename) {
             if (s->elf_machine != EM_NONE) {
-                bios_size = load_elf(filename, NULL, NULL, NULL,
+                bios_size = load_elf(filename, NULL, NULL, NULL, NULL,
                                      NULL, NULL, 1, s->elf_machine, 0, 0);
             }
             if (bios_size < 0) {
diff --git a/hw/ppc/e500.c b/hw/ppc/e500.c
index e6747fce282a..28c77b693f59 100644
--- a/hw/ppc/e500.c
+++ b/hw/ppc/e500.c
@@ -990,7 +990,8 @@ void ppce500_init(MachineState *machine)
 
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, payload_name);
 
-    payload_size = load_elf(filename, NULL, NULL, &bios_entry, &loadaddr, NULL,
+    payload_size = load_elf(filename, NULL, NULL, NULL,
+                            &bios_entry, &loadaddr, NULL,
                             1, PPC_ELF_MACHINE, 0, 0);
     if (payload_size < 0) {
         /*
diff --git a/hw/ppc/mac_newworld.c b/hw/ppc/mac_newworld.c
index 7e45afae7c55..f5a68be6319f 100644
--- a/hw/ppc/mac_newworld.c
+++ b/hw/ppc/mac_newworld.c
@@ -165,7 +165,7 @@ static void ppc_core99_init(MachineState *machine)
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, NULL, NULL, NULL,
                              NULL, NULL, 1, PPC_ELF_MACHINE, 0, 0);
 
         g_free(filename);
@@ -188,7 +188,8 @@ static void ppc_core99_init(MachineState *machine)
 #endif
         kernel_base = KERNEL_LOAD_ADDR;
 
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0)
diff --git a/hw/ppc/mac_oldworld.c b/hw/ppc/mac_oldworld.c
index 817f70e52cf3..c28dde1992f4 100644
--- a/hw/ppc/mac_oldworld.c
+++ b/hw/ppc/mac_oldworld.c
@@ -140,7 +140,7 @@ static void ppc_heathrow_init(MachineState *machine)
 
     /* Load OpenBIOS (ELF) */
     if (filename) {
-        bios_size = load_elf(filename, 0, NULL, NULL, NULL, NULL,
+        bios_size = load_elf(filename, NULL, 0, NULL, NULL, NULL, NULL,
                              1, PPC_ELF_MACHINE, 0, 0);
         g_free(filename);
     } else {
@@ -161,7 +161,8 @@ static void ppc_heathrow_init(MachineState *machine)
         bswap_needed = 0;
 #endif
         kernel_base = KERNEL_LOAD_ADDR;
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, &lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
         if (kernel_size < 0)
diff --git a/hw/ppc/ppc440_bamboo.c b/hw/ppc/ppc440_bamboo.c
index f5720f979e42..9bb71fbdcd4f 100644
--- a/hw/ppc/ppc440_bamboo.c
+++ b/hw/ppc/ppc440_bamboo.c
@@ -257,7 +257,7 @@ static void bamboo_init(MachineState *machine)
         success = load_uimage(kernel_filename, &entry, &loadaddr, NULL,
                               NULL, NULL);
         if (success < 0) {
-            success = load_elf(kernel_filename, NULL, NULL, &elf_entry,
+            success = load_elf(kernel_filename, NULL, NULL, NULL, &elf_entry,
                                &elf_lowaddr, NULL, 1, PPC_ELF_MACHINE,
                                0, 0);
             entry = elf_entry;
diff --git a/hw/ppc/sam460ex.c b/hw/ppc/sam460ex.c
index 5aac58f36ee1..4dfd47766aa6 100644
--- a/hw/ppc/sam460ex.c
+++ b/hw/ppc/sam460ex.c
@@ -557,7 +557,8 @@ static void sam460ex_init(MachineState *machine)
         if (success < 0) {
             uint64_t elf_entry, elf_lowaddr;
 
-            success = load_elf(machine->kernel_filename, NULL, NULL, &elf_entry,
+            success = load_elf(machine->kernel_filename, NULL,
+                               NULL, NULL, &elf_entry,
                                &elf_lowaddr, NULL, 1, PPC_ELF_MACHINE, 0, 0);
             entry = elf_entry;
             loadaddr = elf_lowaddr;
diff --git a/hw/ppc/spapr.c b/hw/ppc/spapr.c
index 55be0f56cbe2..6b4f1da197b1 100644
--- a/hw/ppc/spapr.c
+++ b/hw/ppc/spapr.c
@@ -2777,11 +2777,12 @@ static void spapr_machine_init(MachineState *machine)
     if (kernel_filename) {
         uint64_t lowaddr = 0;
 
-        spapr->kernel_size = load_elf(kernel_filename, translate_kernel_address,
-                                      NULL, NULL, &lowaddr, NULL, 1,
+        spapr->kernel_size = load_elf(kernel_filename, NULL,
+                                      translate_kernel_address, NULL,
+                                      NULL, &lowaddr, NULL, 1,
                                       PPC_ELF_MACHINE, 0, 0);
         if (spapr->kernel_size == ELF_LOAD_WRONG_ENDIAN) {
-            spapr->kernel_size = load_elf(kernel_filename,
+            spapr->kernel_size = load_elf(kernel_filename, NULL,
                                           translate_kernel_address, NULL, NULL,
                                           &lowaddr, NULL, 0, PPC_ELF_MACHINE,
                                           0, 0);
diff --git a/hw/ppc/virtex_ml507.c b/hw/ppc/virtex_ml507.c
index ee9b4b449086..9b383dc3d551 100644
--- a/hw/ppc/virtex_ml507.c
+++ b/hw/ppc/virtex_ml507.c
@@ -258,7 +258,7 @@ static void virtex_init(MachineState *machine)
         hwaddr boot_offset;
 
         /* Boots a kernel elf binary.  */
-        kernel_size = load_elf(kernel_filename, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
                                &entry, &low, &high, 1, PPC_ELF_MACHINE,
                                0, 0);
         boot_info.bootstrap_pc = entry & 0x00ffffff;
diff --git a/hw/riscv/sifive_e.c b/hw/riscv/sifive_e.c
index cb513cc3bb50..242773232e22 100644
--- a/hw/riscv/sifive_e.c
+++ b/hw/riscv/sifive_e.c
@@ -78,7 +78,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/sifive_u.c b/hw/riscv/sifive_u.c
index ef07df244241..21fbb732a74f 100644
--- a/hw/riscv/sifive_u.c
+++ b/hw/riscv/sifive_u.c
@@ -69,7 +69,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/spike.c b/hw/riscv/spike.c
index 268df04c3c7d..c66ffc50cc74 100644
--- a/hw/riscv/spike.c
+++ b/hw/riscv/spike.c
@@ -57,7 +57,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf_ram_sym(kernel_filename, NULL, NULL,
+    if (load_elf_ram_sym(kernel_filename, NULL, NULL, NULL,
             &kernel_entry, NULL, &kernel_high, 0, EM_RISCV, 1, 0,
             NULL, true, htif_symbol_callback) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/riscv/virt.c b/hw/riscv/virt.c
index 2b38f890702c..dcfbb99e4a16 100644
--- a/hw/riscv/virt.c
+++ b/hw/riscv/virt.c
@@ -61,7 +61,7 @@ static uint64_t load_kernel(const char *kernel_filename)
 {
     uint64_t kernel_entry, kernel_high;
 
-    if (load_elf(kernel_filename, NULL, NULL,
+    if (load_elf(kernel_filename, NULL, NULL, NULL,
                  &kernel_entry, NULL, &kernel_high,
                  0, EM_RISCV, 1, 0) < 0) {
         error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/s390x/ipl.c b/hw/s390x/ipl.c
index 21f64ad26aae..896888bf8f00 100644
--- a/hw/s390x/ipl.c
+++ b/hw/s390x/ipl.c
@@ -131,7 +131,8 @@ static void s390_ipl_realize(DeviceState *dev, Error **errp)
             goto error;
         }
 
-        bios_size = load_elf(bios_filename, bios_translate_addr, &fwbase,
+        bios_size = load_elf(bios_filename, NULL,
+                             bios_translate_addr, &fwbase,
                              &ipl->bios_start_addr, NULL, NULL, 1,
                              EM_S390, 0, 0);
         if (bios_size > 0) {
@@ -155,7 +156,8 @@ static void s390_ipl_realize(DeviceState *dev, Error **errp)
     }
 
     if (ipl->kernel) {
-        kernel_size = load_elf(ipl->kernel, NULL, NULL, &pentry, NULL,
+        kernel_size = load_elf(ipl->kernel, NULL, NULL, NULL,
+                               &pentry, NULL,
                                NULL, 1, EM_S390, 0, 0);
         if (kernel_size < 0) {
             kernel_size = load_image_targphys(ipl->kernel, 0, ram_size);
@@ -436,7 +438,8 @@ static int load_netboot_image(Error **errp)
         goto unref_mr;
     }
 
-    img_size = load_elf_ram(netboot_filename, NULL, NULL, &ipl->start_addr,
+    img_size = load_elf_ram(netboot_filename, NULL, NULL, NULL,
+                            &ipl->start_addr,
                             NULL, NULL, 1, EM_S390, 0, 0, NULL, false);
 
     if (img_size < 0) {
diff --git a/hw/sparc/leon3.c b/hw/sparc/leon3.c
index fa98ab81776c..774639af3393 100644
--- a/hw/sparc/leon3.c
+++ b/hw/sparc/leon3.c
@@ -190,7 +190,8 @@ static void leon3_generic_hw_init(MachineState *machine)
         long     kernel_size;
         uint64_t entry;
 
-        kernel_size = load_elf(kernel_filename, NULL, NULL, &entry, NULL, NULL,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL,
+                               &entry, NULL, NULL,
                                1 /* big endian */, EM_SPARC, 0, 0);
         if (kernel_size < 0) {
             error_report("could not load kernel '%s'", kernel_filename);
diff --git a/hw/sparc/sun4m.c b/hw/sparc/sun4m.c
index 639906cca30c..c6a10cc8e1a2 100644
--- a/hw/sparc/sun4m.c
+++ b/hw/sparc/sun4m.c
@@ -243,7 +243,8 @@ static unsigned long sun4m_load_kernel(const char *kernel_filename,
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, translate_kernel_address, NULL,
+        kernel_size = load_elf(kernel_filename, NULL,
+                               translate_kernel_address, NULL,
                                NULL, NULL, NULL, 1, EM_SPARC, 0, 0);
         if (kernel_size < 0)
             kernel_size = load_aout(kernel_filename, KERNEL_LOAD_ADDR,
@@ -693,7 +694,8 @@ static void prom_init(hwaddr addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr, NULL,
+        ret = load_elf(filename, NULL,
+                       translate_prom_address, &addr, NULL,
                        NULL, NULL, 1, EM_SPARC, 0, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/sparc64/sun4u.c b/hw/sparc64/sun4u.c
index f76b19e4e93b..b9bd4be5d512 100644
--- a/hw/sparc64/sun4u.c
+++ b/hw/sparc64/sun4u.c
@@ -153,7 +153,7 @@ static uint64_t sun4u_load_kernel(const char *kernel_filename,
 #else
         bswap_needed = 0;
 #endif
-        kernel_size = load_elf(kernel_filename, NULL, NULL, kernel_entry,
+        kernel_size = load_elf(kernel_filename, NULL, NULL, NULL, kernel_entry,
                                kernel_addr, &kernel_top, 1, EM_SPARCV9, 0, 0);
         if (kernel_size < 0) {
             *kernel_addr = KERNEL_LOAD_ADDR;
@@ -411,7 +411,7 @@ static void prom_init(hwaddr addr, const char *bios_name)
     }
     filename = qemu_find_file(QEMU_FILE_TYPE_BIOS, bios_name);
     if (filename) {
-        ret = load_elf(filename, translate_prom_address, &addr,
+        ret = load_elf(filename, NULL, translate_prom_address, &addr,
                        NULL, NULL, NULL, 1, EM_SPARCV9, 0, 0);
         if (ret < 0 || ret > PROM_SIZE_MAX) {
             ret = load_image_targphys(filename, addr, PROM_SIZE_MAX);
diff --git a/hw/tricore/tricore_testboard.c b/hw/tricore/tricore_testboard.c
index a58096f05e72..003592af27a6 100644
--- a/hw/tricore/tricore_testboard.c
+++ b/hw/tricore/tricore_testboard.c
@@ -45,7 +45,7 @@ static void tricore_load_kernel(CPUTriCoreState *env)
     long kernel_size;
 
     kernel_size = load_elf(tricoretb_binfo.kernel_filename, NULL,
-                           NULL, &entry, NULL,
+                           NULL, NULL, &entry, NULL,
                            NULL, 0,
                            EM_TRICORE, 1, 0);
     if (kernel_size <= 0) {
diff --git a/hw/xtensa/sim.c b/hw/xtensa/sim.c
index b6ccb3cd4ae9..12c7437398e8 100644
--- a/hw/xtensa/sim.c
+++ b/hw/xtensa/sim.c
@@ -97,11 +97,15 @@ static void xtensa_sim_init(MachineState *machine)
         uint64_t elf_entry;
         uint64_t elf_lowaddr;
 #ifdef TARGET_WORDS_BIGENDIAN
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
-                &elf_entry, &elf_lowaddr, NULL, 1, EM_XTENSA, 0, 0);
+        int success = load_elf(kernel_filename, NULL,
+                               translate_phys_addr, cpu,
+                               &elf_entry, &elf_lowaddr,
+                               NULL, 1, EM_XTENSA, 0, 0);
 #else
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
-                &elf_entry, &elf_lowaddr, NULL, 0, EM_XTENSA, 0, 0);
+        int success = load_elf(kernel_filename, NULL,
+                               translate_phys_addr, cpu,
+                               &elf_entry, &elf_lowaddr,
+                               NULL, 0, EM_XTENSA, 0, 0);
 #endif
         if (success > 0) {
             env->pc = elf_entry;
diff --git a/hw/xtensa/xtfpga.c b/hw/xtensa/xtfpga.c
index 21094319a659..cec5bda3fc73 100644
--- a/hw/xtensa/xtfpga.c
+++ b/hw/xtensa/xtfpga.c
@@ -378,7 +378,7 @@ static void xtfpga_init(const XtfpgaBoardDesc *board, MachineState *machine)
 
         uint64_t elf_entry;
         uint64_t elf_lowaddr;
-        int success = load_elf(kernel_filename, translate_phys_addr, cpu,
+        int success = load_elf(kernel_filename, NULL, translate_phys_addr, cpu,
                 &elf_entry, &elf_lowaddr, NULL, be, EM_XTENSA, 0, 0);
         if (success > 0) {
             entry_point = elf_entry;
diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 74679ff8da3a..37d20a3800c1 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -266,6 +266,7 @@ fail:
 }
 
 static int glue(load_elf, SZ)(const char *name, int fd,
+                              uint64_t (*elf_note_fn)(void *, void *, bool),
                               uint64_t (*translate_fn)(void *, uint64_t),
                               void *translate_opaque,
                               int must_swab, uint64_t *pentry,
@@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                 high = addr + mem_size;
 
             data = NULL;
+
+        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
+            struct elf_note *nhdr = NULL;
+
+            file_size = ph->p_filesz; /* Size of the range of ELF notes */
+            data = g_malloc0(file_size);
+            if (ph->p_filesz > 0) {
+                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
+                    goto fail;
+                }
+                if (read(fd, data, file_size) != file_size) {
+                    goto fail;
+                }
+            }
+
+            if (nhdr != NULL) {
+                bool is64 =
+                    sizeof(struct elf_note) == sizeof(struct elf64_note);
+                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);
+            }
+            g_free(data);
         }
     }
+
     g_free(phdr);
     if (lowaddr)
         *lowaddr = (uint64_t)(elf_sword)low;
diff --git a/include/hw/loader.h b/include/hw/loader.h
index 0a0ad808ea39..130e73c32b21 100644
--- a/include/hw/loader.h
+++ b/include/hw/loader.h
@@ -93,6 +93,8 @@ const char *load_elf_strerror(int error);
 
 /** load_elf_ram_sym:
  * @filename: Path of ELF file
+ * @elf_note_fn: optional function to parse ELF Note type
+ *               passed via @translate_opaque
  * @translate_fn: optional function to translate load addresses
  * @translate_opaque: opaque data passed to @translate_fn
  * @pentry: Populated with program entry point. Ignored if NULL.
@@ -125,6 +127,7 @@ typedef void (*symbol_fn_t)(const char *st_name, int st_info,
                             uint64_t st_value, uint64_t st_size);
 
 int load_elf_ram_sym(const char *filename,
+                     uint64_t (*elf_note_fn)(void *, void *, bool),
                      uint64_t (*translate_fn)(void *, uint64_t),
                      void *translate_opaque, uint64_t *pentry,
                      uint64_t *lowaddr, uint64_t *highaddr, int big_endian,
@@ -136,6 +139,7 @@ int load_elf_ram_sym(const char *filename,
  * symbol callback function
  */
 int load_elf_ram(const char *filename,
+                 uint64_t (*elf_note_fn)(void *, void *, bool),
                  uint64_t (*translate_fn)(void *, uint64_t),
                  void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                  uint64_t *highaddr, int big_endian, int elf_machine,
@@ -146,6 +150,7 @@ int load_elf_ram(const char *filename,
  * Same as load_elf_ram(), but always loads the elf as ROM
  */
 int load_elf_as(const char *filename,
+                uint64_t (*elf_note_fn)(void *, void *, bool),
                 uint64_t (*translate_fn)(void *, uint64_t),
                 void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
                 uint64_t *highaddr, int big_endian, int elf_machine,
@@ -155,7 +160,9 @@ int load_elf_as(const char *filename,
  * Same as load_elf_as(), but doesn't allow the caller to specify an
  * AddressSpace.
  */
-int load_elf(const char *filename, uint64_t (*translate_fn)(void *, uint64_t),
+int load_elf(const char *filename,
+             uint64_t (*elf_note_fn)(void *, void *, bool),
+             uint64_t (*translate_fn)(void *, uint64_t),
              void *translate_opaque, uint64_t *pentry, uint64_t *lowaddr,
              uint64_t *highaddr, int big_endian, int elf_machine,
              int clear_lsb, int data_swab);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
  2018-12-21 20:03 ` Liam Merwick
@ 2018-12-21 20:03   ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, ehabkost, rth, xen-devel, sgarzare, mst, stefanha,
	maran.wilson, george.kennedy, boris.ostrovsky, liam.merwick

Introduce a routine which, given a pointer to a range of ELF Notes,
searches through them looking for a note matching the type specified
and returns a pointer to the matching ELF note.

Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
---
 include/hw/elf_ops.h | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 37d20a3800c1..ffbdfbe9c2d8 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -265,6 +265,49 @@ fail:
     return ret;
 }
 
+/* Given 'nhdr', a pointer to a range of ELF Notes, search through them
+ * for a note matching type 'elf_note_type' and return a pointer to
+ * the matching ELF note.
+ */
+static struct elf_note *glue(get_elf_note_type, SZ)(struct elf_note *nhdr,
+                                                    elf_word note_size,
+                                                    elf_word phdr_align,
+                                                    elf_word elf_note_type)
+{
+    elf_word nhdr_size = sizeof(struct elf_note);
+    elf_word elf_note_entry_offset = 0;
+    elf_word note_type;
+    elf_word nhdr_namesz;
+    elf_word nhdr_descsz;
+
+    if (nhdr == NULL) {
+        return NULL;
+    }
+
+    note_type = nhdr->n_type;
+    while (note_type != elf_note_type) {
+        nhdr_namesz = nhdr->n_namesz;
+        nhdr_descsz = nhdr->n_descsz;
+
+        elf_note_entry_offset = nhdr_size +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
+            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
+
+        /* If the offset calculated in this iteration exceeds the
+	 * supplied size, we are done and no matching note was found.
+	 */
+        if (elf_note_entry_offset > note_size) {
+            return NULL;
+        }
+
+        /* skip to the next ELF Note entry */
+        nhdr = (void *)nhdr + elf_note_entry_offset;
+        note_type = nhdr->n_type;
+    }
+
+    return nhdr;
+}
+
 static int glue(load_elf, SZ)(const char *name, int fd,
                               uint64_t (*elf_note_fn)(void *, void *, bool),
                               uint64_t (*translate_fn)(void *, uint64_t),
@@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                 }
             }
 
+	    /* Search the ELF notes to find one with a type matching the
+	     * value passed in via 'translate_opaque'
+	     */
+            nhdr = (struct elf_note *)data;
+	    assert(translate_opaque != NULL);
+            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
+                                               *(uint64_t *)translate_opaque);
             if (nhdr != NULL) {
                 bool is64 =
                     sizeof(struct elf_note) == sizeof(struct elf64_note);
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
@ 2018-12-21 20:03   ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: liam.merwick, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

Introduce a routine which, given a pointer to a range of ELF Notes,
searches through them looking for a note matching the type specified
and returns a pointer to the matching ELF note.

Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
---
 include/hw/elf_ops.h | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
index 37d20a3800c1..ffbdfbe9c2d8 100644
--- a/include/hw/elf_ops.h
+++ b/include/hw/elf_ops.h
@@ -265,6 +265,49 @@ fail:
     return ret;
 }
 
+/* Given 'nhdr', a pointer to a range of ELF Notes, search through them
+ * for a note matching type 'elf_note_type' and return a pointer to
+ * the matching ELF note.
+ */
+static struct elf_note *glue(get_elf_note_type, SZ)(struct elf_note *nhdr,
+                                                    elf_word note_size,
+                                                    elf_word phdr_align,
+                                                    elf_word elf_note_type)
+{
+    elf_word nhdr_size = sizeof(struct elf_note);
+    elf_word elf_note_entry_offset = 0;
+    elf_word note_type;
+    elf_word nhdr_namesz;
+    elf_word nhdr_descsz;
+
+    if (nhdr == NULL) {
+        return NULL;
+    }
+
+    note_type = nhdr->n_type;
+    while (note_type != elf_note_type) {
+        nhdr_namesz = nhdr->n_namesz;
+        nhdr_descsz = nhdr->n_descsz;
+
+        elf_note_entry_offset = nhdr_size +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
+            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
+
+        /* If the offset calculated in this iteration exceeds the
+	 * supplied size, we are done and no matching note was found.
+	 */
+        if (elf_note_entry_offset > note_size) {
+            return NULL;
+        }
+
+        /* skip to the next ELF Note entry */
+        nhdr = (void *)nhdr + elf_note_entry_offset;
+        note_type = nhdr->n_type;
+    }
+
+    return nhdr;
+}
+
 static int glue(load_elf, SZ)(const char *name, int fd,
                               uint64_t (*elf_note_fn)(void *, void *, bool),
                               uint64_t (*translate_fn)(void *, uint64_t),
@@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
                 }
             }
 
+	    /* Search the ELF notes to find one with a type matching the
+	     * value passed in via 'translate_opaque'
+	     */
+            nhdr = (struct elf_note *)data;
+	    assert(translate_opaque != NULL);
+            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
+                                               *(uint64_t *)translate_opaque);
             if (nhdr != NULL) {
                 bool is64 =
                     sizeof(struct elf_note) == sizeof(struct elf64_note);
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 3/4] pvh: Add x86/HVM direct boot ABI header file
  2018-12-21 20:03 ` Liam Merwick
@ 2018-12-21 20:03   ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, ehabkost, rth, xen-devel, sgarzare, mst, stefanha,
	maran.wilson, george.kennedy, boris.ostrovsky, liam.merwick

From: Liam Merwick <Liam.Merwick@oracle.com>

The x86/HVM direct boot ABI permits Qemu to be able to boot directly
into the uncompressed Linux kernel binary with minimal firmware involvement.

	https://xenbits.xen.org/docs/unstable/misc/pvh.html

This commit adds the header file that defines the start_info struct
that needs to be populated in order to use this ABI.

The canonical version of start_info.h is in the Xen codebase.
(like QEMU, the Linux kernel uses a copy as well).

Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <Konrad.Wilk@oracle.com>
---
 include/hw/xen/start_info.h | 146 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)
 create mode 100644 include/hw/xen/start_info.h

diff --git a/include/hw/xen/start_info.h b/include/hw/xen/start_info.h
new file mode 100644
index 000000000000..348779eb10cd
--- /dev/null
+++ b/include/hw/xen/start_info.h
@@ -0,0 +1,146 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2016, Citrix Systems, Inc.
+ */
+
+#ifndef __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+#define __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+
+/*
+ * Start of day structure passed to PVH guests and to HVM guests in %ebx.
+ *
+ * NOTE: nothing will be loaded at physical address 0, so a 0 value in any
+ * of the address fields should be treated as not present.
+ *
+ *  0 +----------------+
+ *    | magic          | Contains the magic value XEN_HVM_START_MAGIC_VALUE
+ *    |                | ("xEn3" with the 0x80 bit of the "E" set).
+ *  4 +----------------+
+ *    | version        | Version of this structure. Current version is 1. New
+ *    |                | versions are guaranteed to be backwards-compatible.
+ *  8 +----------------+
+ *    | flags          | SIF_xxx flags.
+ * 12 +----------------+
+ *    | nr_modules     | Number of modules passed to the kernel.
+ * 16 +----------------+
+ *    | modlist_paddr  | Physical address of an array of modules
+ *    |                | (layout of the structure below).
+ * 24 +----------------+
+ *    | cmdline_paddr  | Physical address of the command line,
+ *    |                | a zero-terminated ASCII string.
+ * 32 +----------------+
+ *    | rsdp_paddr     | Physical address of the RSDP ACPI data structure.
+ * 40 +----------------+
+ *    | memmap_paddr   | Physical address of the (optional) memory map. Only
+ *    |                | present in version 1 and newer of the structure.
+ * 48 +----------------+
+ *    | memmap_entries | Number of entries in the memory map table. Only
+ *    |                | present in version 1 and newer of the structure.
+ *    |                | Zero if there is no memory map being provided.
+ * 52 +----------------+
+ *    | reserved       | Version 1 and newer only.
+ * 56 +----------------+
+ *
+ * The layout of each entry in the module structure is the following:
+ *
+ *  0 +----------------+
+ *    | paddr          | Physical address of the module.
+ *  8 +----------------+
+ *    | size           | Size of the module in bytes.
+ * 16 +----------------+
+ *    | cmdline_paddr  | Physical address of the command line,
+ *    |                | a zero-terminated ASCII string.
+ * 24 +----------------+
+ *    | reserved       |
+ * 32 +----------------+
+ *
+ * The layout of each entry in the memory map table is as follows:
+ *
+ *  0 +----------------+
+ *    | addr           | Base address
+ *  8 +----------------+
+ *    | size           | Size of mapping in bytes
+ * 16 +----------------+
+ *    | type           | Type of mapping as defined between the hypervisor
+ *    |                | and guest it's starting. E820_TYPE_xxx, for example.
+ * 20 +----------------|
+ *    | reserved       |
+ * 24 +----------------+
+ *
+ * The address and sizes are always a 64bit little endian unsigned integer.
+ *
+ * NB: Xen on x86 will always try to place all the data below the 4GiB
+ * boundary.
+ *
+ * Version numbers of the hvm_start_info structure have evolved like this:
+ *
+ * Version 0:
+ *
+ * Version 1:   Added the memmap_paddr/memmap_entries fields (plus 4 bytes of
+ *              padding) to the end of the hvm_start_info struct. These new
+ *              fields can be used to pass a memory map to the guest. The
+ *              memory map is optional and so guests that understand version 1
+ *              of the structure must check that memmap_entries is non-zero
+ *              before trying to read the memory map.
+ */
+#define XEN_HVM_START_MAGIC_VALUE 0x336ec578
+
+/*
+ * C representation of the x86/HVM start info layout.
+ *
+ * The canonical definition of this layout is above, this is just a way to
+ * represent the layout described there using C types.
+ */
+struct hvm_start_info {
+    uint32_t magic;             /* Contains the magic value 0x336ec578       */
+                                /* ("xEn3" with the 0x80 bit of the "E" set).*/
+    uint32_t version;           /* Version of this structure.                */
+    uint32_t flags;             /* SIF_xxx flags.                            */
+    uint32_t nr_modules;        /* Number of modules passed to the kernel.   */
+    uint64_t modlist_paddr;     /* Physical address of an array of           */
+                                /* hvm_modlist_entry.                        */
+    uint64_t cmdline_paddr;     /* Physical address of the command line.     */
+    uint64_t rsdp_paddr;        /* Physical address of the RSDP ACPI data    */
+                                /* structure.                                */
+    uint64_t memmap_paddr;      /* Physical address of an array of           */
+                                /* hvm_memmap_table_entry. Only present in   */
+                                /* version 1 and newer of the structure      */
+    uint32_t memmap_entries;    /* Number of entries in the memmap table.    */
+                                /* Only present in version 1 and newer of    */
+                                /* the structure. Value will be zero if      */
+                                /* there is no memory map being provided.    */
+    uint32_t reserved;
+};
+
+struct hvm_modlist_entry {
+    uint64_t paddr;             /* Physical address of the module.           */
+    uint64_t size;              /* Size of the module in bytes.              */
+    uint64_t cmdline_paddr;     /* Physical address of the command line.     */
+    uint64_t reserved;
+};
+
+struct hvm_memmap_table_entry {
+    uint64_t addr;              /* Base address of the memory region         */
+    uint64_t size;              /* Size of the memory region in bytes        */
+    uint32_t type;              /* Mapping type                              */
+    uint32_t reserved;
+};
+
+#endif /* __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__ */
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 3/4] pvh: Add x86/HVM direct boot ABI header file
@ 2018-12-21 20:03   ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: liam.merwick, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

From: Liam Merwick <Liam.Merwick@oracle.com>

The x86/HVM direct boot ABI permits Qemu to be able to boot directly
into the uncompressed Linux kernel binary with minimal firmware involvement.

	https://xenbits.xen.org/docs/unstable/misc/pvh.html

This commit adds the header file that defines the start_info struct
that needs to be populated in order to use this ABI.

The canonical version of start_info.h is in the Xen codebase.
(like QEMU, the Linux kernel uses a copy as well).

Signed-off-by: Liam Merwick <Liam.Merwick@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <Konrad.Wilk@oracle.com>
---
 include/hw/xen/start_info.h | 146 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 146 insertions(+)
 create mode 100644 include/hw/xen/start_info.h

diff --git a/include/hw/xen/start_info.h b/include/hw/xen/start_info.h
new file mode 100644
index 000000000000..348779eb10cd
--- /dev/null
+++ b/include/hw/xen/start_info.h
@@ -0,0 +1,146 @@
+/*
+ * Permission is hereby granted, free of charge, to any person obtaining a copy
+ * of this software and associated documentation files (the "Software"), to
+ * deal in the Software without restriction, including without limitation the
+ * rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
+ * sell copies of the Software, and to permit persons to whom the Software is
+ * furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice shall be included in
+ * all copies or substantial portions of the Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+ * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER
+ * DEALINGS IN THE SOFTWARE.
+ *
+ * Copyright (c) 2016, Citrix Systems, Inc.
+ */
+
+#ifndef __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+#define __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__
+
+/*
+ * Start of day structure passed to PVH guests and to HVM guests in %ebx.
+ *
+ * NOTE: nothing will be loaded at physical address 0, so a 0 value in any
+ * of the address fields should be treated as not present.
+ *
+ *  0 +----------------+
+ *    | magic          | Contains the magic value XEN_HVM_START_MAGIC_VALUE
+ *    |                | ("xEn3" with the 0x80 bit of the "E" set).
+ *  4 +----------------+
+ *    | version        | Version of this structure. Current version is 1. New
+ *    |                | versions are guaranteed to be backwards-compatible.
+ *  8 +----------------+
+ *    | flags          | SIF_xxx flags.
+ * 12 +----------------+
+ *    | nr_modules     | Number of modules passed to the kernel.
+ * 16 +----------------+
+ *    | modlist_paddr  | Physical address of an array of modules
+ *    |                | (layout of the structure below).
+ * 24 +----------------+
+ *    | cmdline_paddr  | Physical address of the command line,
+ *    |                | a zero-terminated ASCII string.
+ * 32 +----------------+
+ *    | rsdp_paddr     | Physical address of the RSDP ACPI data structure.
+ * 40 +----------------+
+ *    | memmap_paddr   | Physical address of the (optional) memory map. Only
+ *    |                | present in version 1 and newer of the structure.
+ * 48 +----------------+
+ *    | memmap_entries | Number of entries in the memory map table. Only
+ *    |                | present in version 1 and newer of the structure.
+ *    |                | Zero if there is no memory map being provided.
+ * 52 +----------------+
+ *    | reserved       | Version 1 and newer only.
+ * 56 +----------------+
+ *
+ * The layout of each entry in the module structure is the following:
+ *
+ *  0 +----------------+
+ *    | paddr          | Physical address of the module.
+ *  8 +----------------+
+ *    | size           | Size of the module in bytes.
+ * 16 +----------------+
+ *    | cmdline_paddr  | Physical address of the command line,
+ *    |                | a zero-terminated ASCII string.
+ * 24 +----------------+
+ *    | reserved       |
+ * 32 +----------------+
+ *
+ * The layout of each entry in the memory map table is as follows:
+ *
+ *  0 +----------------+
+ *    | addr           | Base address
+ *  8 +----------------+
+ *    | size           | Size of mapping in bytes
+ * 16 +----------------+
+ *    | type           | Type of mapping as defined between the hypervisor
+ *    |                | and guest it's starting. E820_TYPE_xxx, for example.
+ * 20 +----------------|
+ *    | reserved       |
+ * 24 +----------------+
+ *
+ * The address and sizes are always a 64bit little endian unsigned integer.
+ *
+ * NB: Xen on x86 will always try to place all the data below the 4GiB
+ * boundary.
+ *
+ * Version numbers of the hvm_start_info structure have evolved like this:
+ *
+ * Version 0:
+ *
+ * Version 1:   Added the memmap_paddr/memmap_entries fields (plus 4 bytes of
+ *              padding) to the end of the hvm_start_info struct. These new
+ *              fields can be used to pass a memory map to the guest. The
+ *              memory map is optional and so guests that understand version 1
+ *              of the structure must check that memmap_entries is non-zero
+ *              before trying to read the memory map.
+ */
+#define XEN_HVM_START_MAGIC_VALUE 0x336ec578
+
+/*
+ * C representation of the x86/HVM start info layout.
+ *
+ * The canonical definition of this layout is above, this is just a way to
+ * represent the layout described there using C types.
+ */
+struct hvm_start_info {
+    uint32_t magic;             /* Contains the magic value 0x336ec578       */
+                                /* ("xEn3" with the 0x80 bit of the "E" set).*/
+    uint32_t version;           /* Version of this structure.                */
+    uint32_t flags;             /* SIF_xxx flags.                            */
+    uint32_t nr_modules;        /* Number of modules passed to the kernel.   */
+    uint64_t modlist_paddr;     /* Physical address of an array of           */
+                                /* hvm_modlist_entry.                        */
+    uint64_t cmdline_paddr;     /* Physical address of the command line.     */
+    uint64_t rsdp_paddr;        /* Physical address of the RSDP ACPI data    */
+                                /* structure.                                */
+    uint64_t memmap_paddr;      /* Physical address of an array of           */
+                                /* hvm_memmap_table_entry. Only present in   */
+                                /* version 1 and newer of the structure      */
+    uint32_t memmap_entries;    /* Number of entries in the memmap table.    */
+                                /* Only present in version 1 and newer of    */
+                                /* the structure. Value will be zero if      */
+                                /* there is no memory map being provided.    */
+    uint32_t reserved;
+};
+
+struct hvm_modlist_entry {
+    uint64_t paddr;             /* Physical address of the module.           */
+    uint64_t size;              /* Size of the module in bytes.              */
+    uint64_t cmdline_paddr;     /* Physical address of the command line.     */
+    uint64_t reserved;
+};
+
+struct hvm_memmap_table_entry {
+    uint64_t addr;              /* Base address of the memory region         */
+    uint64_t size;              /* Size of the memory region in bytes        */
+    uint32_t type;              /* Mapping type                              */
+    uint32_t reserved;
+};
+
+#endif /* __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__ */
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [Qemu-devel] [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
  2018-12-21 20:03 ` Liam Merwick
@ 2018-12-21 20:03   ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: pbonzini, ehabkost, rth, xen-devel, sgarzare, mst, stefanha,
	maran.wilson, george.kennedy, boris.ostrovsky, liam.merwick

These changes (along with corresponding Linux kernel and qboot changes)
enable a guest to be booted using the x86/HVM direct boot ABI.

This commit adds a load_elfboot() routine to pass the size and
location of the kernel entry point to qboot (which will fill in
the start_info struct information needed to to boot the guest).
Having loaded the ELF binary, load_linux() will run qboot
which continues the boot.

The address for the kernel entry point is read from an ELF Note
in the uncompressed kernel binary by a helper routine passed
to load_elf().

Co-developed-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
 hw/i386/pc.c  | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/elf.h |  10 +++++
 2 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 115bc2825ce4..6d44a14da44d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -54,6 +54,7 @@
 #include "sysemu/qtest.h"
 #include "kvm_i386.h"
 #include "hw/xen/xen.h"
+#include "hw/xen/start_info.h"
 #include "ui/qemu-spice.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
@@ -109,6 +110,9 @@ static struct e820_entry *e820_table;
 static unsigned e820_entries;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
+/* Physical Address of PVH entry point read from kernel ELF NOTE */
+static size_t pvh_start_addr;
+
 void gsi_handler(void *opaque, int n, int level)
 {
     GSIState *s = opaque;
@@ -834,6 +838,109 @@ struct setup_data {
     uint8_t data[0];
 } __attribute__((packed));
 
+
+/*
+ * The entry point into the kernel for PVH boot is different from
+ * the native entry point.  The PVH entry is defined by the x86/HVM
+ * direct boot ABI and is available in an ELFNOTE in the kernel binary.
+ *
+ * This function is passed to load_elf() when it is called from
+ * load_elfboot() which then additionally checks for an ELF Note of
+ * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
+ * parse the PVH entry address from the ELF Note.
+ *
+ * Due to trickery in elf_opts.h, load_elf() is actually available as
+ * load_elf32() or load_elf64() and this routine needs to be able
+ * to deal with being called as 32 or 64 bit.
+ *
+ * The address of the PVH entry point is saved to the 'pvh_start_addr'
+ * global variable.  (although the entry point is 32-bit, the kernel
+ * binary can be either 32-bit or 64-bit).
+ */
+static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
+{
+    size_t *elf_note_data_addr;
+
+    /* Check if ELF Note header passed in is valid */
+    if (arg1 == NULL) {
+        return 0;
+    }
+
+    if (is64) {
+        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
+        uint64_t nhdr_size64 = sizeof(struct elf64_note);
+        uint64_t phdr_align = *(uint64_t *)arg2;
+        uint64_t nhdr_namesz = nhdr64->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr64) + nhdr_size64 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    } else {
+        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
+        uint32_t nhdr_size32 = sizeof(struct elf32_note);
+        uint32_t phdr_align = *(uint32_t *)arg2;
+        uint32_t nhdr_namesz = nhdr32->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr32) + nhdr_size32 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    }
+
+    pvh_start_addr = *elf_note_data_addr;
+
+    return pvh_start_addr;
+}
+
+static bool load_elfboot(const char *kernel_filename,
+                   int kernel_file_size,
+                   uint8_t *header,
+                   size_t pvh_xen_start_addr,
+                   FWCfgState *fw_cfg)
+{
+    uint32_t flags = 0;
+    uint32_t mh_load_addr = 0;
+    uint32_t elf_kernel_size = 0;
+    uint64_t elf_entry;
+    uint64_t elf_low, elf_high;
+    int kernel_size;
+
+    if (ldl_p(header) != 0x464c457f) {
+        return false; /* no elfboot */
+    }
+
+    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
+    flags = elf_is64 ?
+        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
+
+    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
+        error_report("elfboot unsupported flags = %x", flags);
+        exit(1);
+    }
+
+    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
+    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
+                           NULL, &elf_note_type, &elf_entry,
+                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
+                           0, 0);
+
+    if (kernel_size < 0) {
+        error_report("Error while loading elf kernel");
+        exit(1);
+    }
+    mh_load_addr = elf_low;
+    elf_kernel_size = elf_high - elf_low;
+
+    if (pvh_start_addr == 0) {
+        error_report("Error loading uncompressed kernel without PVH ELF Note");
+        exit(1);
+    }
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
+
+    return true;
+}
+
 static void load_linux(PCMachineState *pcms,
                        FWCfgState *fw_cfg)
 {
@@ -873,6 +980,33 @@ static void load_linux(PCMachineState *pcms,
     if (ldl_p(header+0x202) == 0x53726448) {
         protocol = lduw_p(header+0x206);
     } else {
+        /*
+         * If the kernel address for using the x86/HVM direct boot ABI has
+         * been saved then proceed with booting the uncompressed kernel
+         */
+        if (load_elfboot(kernel_filename, kernel_size,
+                         header, pvh_start_addr, fw_cfg)) {
+            struct hvm_modlist_entry ramdisk_mod = { 0 };
+
+            fclose(f);
+
+            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+                strlen(kernel_cmdline) + 1);
+            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
+
+            assert(machine->device_memory != NULL);
+            ramdisk_mod.paddr = machine->device_memory->base;
+            ramdisk_mod.size =
+                memory_region_size(&machine->device_memory->mr);
+
+            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
+                             sizeof(ramdisk_mod));
+            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
+            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
+                             header, sizeof(header));
+
+            return;
+        }
         /* This looks like a multiboot kernel. If it is, let's stop
            treating it like a Linux kernel. */
         if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
@@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
     int linux_boot, i;
     MemoryRegion *ram, *option_rom_mr;
     MemoryRegion *ram_below_4g, *ram_above_4g;
-    FWCfgState *fw_cfg;
+    FWCfgState *fw_cfg = NULL;
     MachineState *machine = MACHINE(pcms);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
 
diff --git a/include/elf.h b/include/elf.h
index c151164b63da..1f82c7a7124b 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -1585,6 +1585,16 @@ typedef struct elf64_shdr {
 #define NT_ARM_HW_WATCH 0x403           /* ARM hardware watchpoint registers */
 #define NT_ARM_SYSTEM_CALL      0x404   /* ARM system call number */
 
+/*
+ * Physical entry point into the kernel.
+ *
+ * 32bit entry point into the kernel. When requested to launch the
+ * guest kernel, use this entry point to launch the guest in 32-bit
+ * protected mode with paging disabled.
+ *
+ * [ Corresponding definition in Linux kernel: include/xen/interface/elfnote.h ]
+ */
+#define XEN_ELFNOTE_PHYS32_ENTRY    18  /* 0x12 */
 
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
@ 2018-12-21 20:03   ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2018-12-21 20:03 UTC (permalink / raw)
  To: qemu-devel
  Cc: liam.merwick, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

These changes (along with corresponding Linux kernel and qboot changes)
enable a guest to be booted using the x86/HVM direct boot ABI.

This commit adds a load_elfboot() routine to pass the size and
location of the kernel entry point to qboot (which will fill in
the start_info struct information needed to to boot the guest).
Having loaded the ELF binary, load_linux() will run qboot
which continues the boot.

The address for the kernel entry point is read from an ELF Note
in the uncompressed kernel binary by a helper routine passed
to load_elf().

Co-developed-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: George Kennedy <George.Kennedy@oracle.com>
Signed-off-by: Liam Merwick <liam.merwick@oracle.com>
---
 hw/i386/pc.c  | 136 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 include/elf.h |  10 +++++
 2 files changed, 145 insertions(+), 1 deletion(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 115bc2825ce4..6d44a14da44d 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -54,6 +54,7 @@
 #include "sysemu/qtest.h"
 #include "kvm_i386.h"
 #include "hw/xen/xen.h"
+#include "hw/xen/start_info.h"
 #include "ui/qemu-spice.h"
 #include "exec/memory.h"
 #include "exec/address-spaces.h"
@@ -109,6 +110,9 @@ static struct e820_entry *e820_table;
 static unsigned e820_entries;
 struct hpet_fw_config hpet_cfg = {.count = UINT8_MAX};
 
+/* Physical Address of PVH entry point read from kernel ELF NOTE */
+static size_t pvh_start_addr;
+
 void gsi_handler(void *opaque, int n, int level)
 {
     GSIState *s = opaque;
@@ -834,6 +838,109 @@ struct setup_data {
     uint8_t data[0];
 } __attribute__((packed));
 
+
+/*
+ * The entry point into the kernel for PVH boot is different from
+ * the native entry point.  The PVH entry is defined by the x86/HVM
+ * direct boot ABI and is available in an ELFNOTE in the kernel binary.
+ *
+ * This function is passed to load_elf() when it is called from
+ * load_elfboot() which then additionally checks for an ELF Note of
+ * type XEN_ELFNOTE_PHYS32_ENTRY and passes it to this function to
+ * parse the PVH entry address from the ELF Note.
+ *
+ * Due to trickery in elf_opts.h, load_elf() is actually available as
+ * load_elf32() or load_elf64() and this routine needs to be able
+ * to deal with being called as 32 or 64 bit.
+ *
+ * The address of the PVH entry point is saved to the 'pvh_start_addr'
+ * global variable.  (although the entry point is 32-bit, the kernel
+ * binary can be either 32-bit or 64-bit).
+ */
+static uint64_t read_pvh_start_addr(void *arg1, void *arg2, bool is64)
+{
+    size_t *elf_note_data_addr;
+
+    /* Check if ELF Note header passed in is valid */
+    if (arg1 == NULL) {
+        return 0;
+    }
+
+    if (is64) {
+        struct elf64_note *nhdr64 = (struct elf64_note *)arg1;
+        uint64_t nhdr_size64 = sizeof(struct elf64_note);
+        uint64_t phdr_align = *(uint64_t *)arg2;
+        uint64_t nhdr_namesz = nhdr64->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr64) + nhdr_size64 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    } else {
+        struct elf32_note *nhdr32 = (struct elf32_note *)arg1;
+        uint32_t nhdr_size32 = sizeof(struct elf32_note);
+        uint32_t phdr_align = *(uint32_t *)arg2;
+        uint32_t nhdr_namesz = nhdr32->n_namesz;
+
+        elf_note_data_addr =
+            ((void *)nhdr32) + nhdr_size32 +
+            QEMU_ALIGN_UP(nhdr_namesz, phdr_align);
+    }
+
+    pvh_start_addr = *elf_note_data_addr;
+
+    return pvh_start_addr;
+}
+
+static bool load_elfboot(const char *kernel_filename,
+                   int kernel_file_size,
+                   uint8_t *header,
+                   size_t pvh_xen_start_addr,
+                   FWCfgState *fw_cfg)
+{
+    uint32_t flags = 0;
+    uint32_t mh_load_addr = 0;
+    uint32_t elf_kernel_size = 0;
+    uint64_t elf_entry;
+    uint64_t elf_low, elf_high;
+    int kernel_size;
+
+    if (ldl_p(header) != 0x464c457f) {
+        return false; /* no elfboot */
+    }
+
+    bool elf_is64 = header[EI_CLASS] == ELFCLASS64;
+    flags = elf_is64 ?
+        ((Elf64_Ehdr *)header)->e_flags : ((Elf32_Ehdr *)header)->e_flags;
+
+    if (flags & 0x00010004) { /* LOAD_ELF_HEADER_HAS_ADDR */
+        error_report("elfboot unsupported flags = %x", flags);
+        exit(1);
+    }
+
+    uint64_t elf_note_type = XEN_ELFNOTE_PHYS32_ENTRY;
+    kernel_size = load_elf(kernel_filename, read_pvh_start_addr,
+                           NULL, &elf_note_type, &elf_entry,
+                           &elf_low, &elf_high, 0, I386_ELF_MACHINE,
+                           0, 0);
+
+    if (kernel_size < 0) {
+        error_report("Error while loading elf kernel");
+        exit(1);
+    }
+    mh_load_addr = elf_low;
+    elf_kernel_size = elf_high - elf_low;
+
+    if (pvh_start_addr == 0) {
+        error_report("Error loading uncompressed kernel without PVH ELF Note");
+        exit(1);
+    }
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ENTRY, pvh_start_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_ADDR, mh_load_addr);
+    fw_cfg_add_i32(fw_cfg, FW_CFG_KERNEL_SIZE, elf_kernel_size);
+
+    return true;
+}
+
 static void load_linux(PCMachineState *pcms,
                        FWCfgState *fw_cfg)
 {
@@ -873,6 +980,33 @@ static void load_linux(PCMachineState *pcms,
     if (ldl_p(header+0x202) == 0x53726448) {
         protocol = lduw_p(header+0x206);
     } else {
+        /*
+         * If the kernel address for using the x86/HVM direct boot ABI has
+         * been saved then proceed with booting the uncompressed kernel
+         */
+        if (load_elfboot(kernel_filename, kernel_size,
+                         header, pvh_start_addr, fw_cfg)) {
+            struct hvm_modlist_entry ramdisk_mod = { 0 };
+
+            fclose(f);
+
+            fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
+                strlen(kernel_cmdline) + 1);
+            fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
+
+            assert(machine->device_memory != NULL);
+            ramdisk_mod.paddr = machine->device_memory->base;
+            ramdisk_mod.size =
+                memory_region_size(&machine->device_memory->mr);
+
+            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
+                             sizeof(ramdisk_mod));
+            fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
+            fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
+                             header, sizeof(header));
+
+            return;
+        }
         /* This looks like a multiboot kernel. If it is, let's stop
            treating it like a Linux kernel. */
         if (load_multiboot(fw_cfg, f, kernel_filename, initrd_filename,
@@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
     int linux_boot, i;
     MemoryRegion *ram, *option_rom_mr;
     MemoryRegion *ram_below_4g, *ram_above_4g;
-    FWCfgState *fw_cfg;
+    FWCfgState *fw_cfg = NULL;
     MachineState *machine = MACHINE(pcms);
     PCMachineClass *pcmc = PC_MACHINE_GET_CLASS(pcms);
 
diff --git a/include/elf.h b/include/elf.h
index c151164b63da..1f82c7a7124b 100644
--- a/include/elf.h
+++ b/include/elf.h
@@ -1585,6 +1585,16 @@ typedef struct elf64_shdr {
 #define NT_ARM_HW_WATCH 0x403           /* ARM hardware watchpoint registers */
 #define NT_ARM_SYSTEM_CALL      0x404   /* ARM system call number */
 
+/*
+ * Physical entry point into the kernel.
+ *
+ * 32bit entry point into the kernel. When requested to launch the
+ * guest kernel, use this entry point to launch the guest in 32-bit
+ * protected mode with paging disabled.
+ *
+ * [ Corresponding definition in Linux kernel: include/xen/interface/elfnote.h ]
+ */
+#define XEN_ELFNOTE_PHYS32_ENTRY    18  /* 0x12 */
 
 /* Note header in a PT_NOTE section */
 typedef struct elf32_note {
-- 
1.8.3.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2018-12-21 20:03 ` Liam Merwick
@ 2018-12-26 17:08   ` no-reply
  -1 siblings, 0 replies; 38+ messages in thread
From: no-reply @ 2018-12-26 17:08 UTC (permalink / raw)
  To: liam.merwick
  Cc: fam, qemu-devel, ehabkost, mst, maran.wilson, george.kennedy,
	stefanha, xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare

Patchew URL: https://patchew.org/QEMU/1545422632-24444-1-git-send-email-liam.merwick@oracle.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 1545422632-24444-1-git-send-email-liam.merwick@oracle.com
Type: series
Subject: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
ae98c15 pvh: Boot uncompressed kernel using direct boot ABI
b99fe71 pvh: Add x86/HVM direct boot ABI header file
721dd87 elf-ops.h: Add get_elf_note_type()
4b86551 elf: Add optional function ptr to load_elf() to parse ELF notes

=== OUTPUT BEGIN ===
Checking PATCH 1/4: elf: Add optional function ptr to load_elf() to parse ELF notes...
Checking PATCH 2/4: elf-ops.h: Add get_elf_note_type()...
WARNING: Block comments use a leading /* on a separate line
#21: FILE: include/hw/elf_ops.h:268:
+/* Given 'nhdr', a pointer to a range of ELF Notes, search through them

WARNING: Block comments use a leading /* on a separate line
#49: FILE: include/hw/elf_ops.h:296:
+        /* If the offset calculated in this iteration exceeds the

ERROR: code indent should never use tabs
#50: FILE: include/hw/elf_ops.h:297:
+^I * supplied size, we are done and no matching note was found.$

ERROR: code indent should never use tabs
#51: FILE: include/hw/elf_ops.h:298:
+^I */$

ERROR: code indent should never use tabs
#71: FILE: include/hw/elf_ops.h:558:
+^I    /* Search the ELF notes to find one with a type matching the$

WARNING: Block comments use a leading /* on a separate line
#71: FILE: include/hw/elf_ops.h:558:
+           /* Search the ELF notes to find one with a type matching the

ERROR: code indent should never use tabs
#72: FILE: include/hw/elf_ops.h:559:
+^I     * value passed in via 'translate_opaque'$

ERROR: code indent should never use tabs
#73: FILE: include/hw/elf_ops.h:560:
+^I     */$

ERROR: code indent should never use tabs
#75: FILE: include/hw/elf_ops.h:562:
+^I    assert(translate_opaque != NULL);$

total: 6 errors, 3 warnings, 62 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 3/4: pvh: Add x86/HVM direct boot ABI header file...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#22: 
new file mode 100644

WARNING: architecture specific defines should be avoided
#49: FILE: include/hw/xen/start_info.h:23:
+#ifndef __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__

total: 0 errors, 2 warnings, 146 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 4/4: pvh: Boot uncompressed kernel using direct boot ABI...
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1545422632-24444-1-git-send-email-liam.merwick@oracle.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
@ 2018-12-26 17:08   ` no-reply
  0 siblings, 0 replies; 38+ messages in thread
From: no-reply @ 2018-12-26 17:08 UTC (permalink / raw)
  To: liam.merwick
  Cc: fam, ehabkost, maran.wilson, mst, qemu-devel, george.kennedy,
	stefanha, pbonzini, xen-devel, boris.ostrovsky, sgarzare, rth

Patchew URL: https://patchew.org/QEMU/1545422632-24444-1-git-send-email-liam.merwick@oracle.com/



Hi,

This series seems to have some coding style problems. See output below for
more information:

Message-id: 1545422632-24444-1-git-send-email-liam.merwick@oracle.com
Type: series
Subject: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot

=== TEST SCRIPT BEGIN ===
#!/bin/bash

BASE=base
n=1
total=$(git log --oneline $BASE.. | wc -l)
failed=0

git config --local diff.renamelimit 0
git config --local diff.renames True
git config --local diff.algorithm histogram

commits="$(git log --format=%H --reverse $BASE..)"
for c in $commits; do
    echo "Checking PATCH $n/$total: $(git log -n 1 --format=%s $c)..."
    if ! git show $c --format=email | ./scripts/checkpatch.pl --mailback -; then
        failed=1
        echo
    fi
    n=$((n+1))
done

exit $failed
=== TEST SCRIPT END ===

Updating 3c8cf5a9c21ff8782164d1def7f44bd888713384
Switched to a new branch 'test'
ae98c15 pvh: Boot uncompressed kernel using direct boot ABI
b99fe71 pvh: Add x86/HVM direct boot ABI header file
721dd87 elf-ops.h: Add get_elf_note_type()
4b86551 elf: Add optional function ptr to load_elf() to parse ELF notes

=== OUTPUT BEGIN ===
Checking PATCH 1/4: elf: Add optional function ptr to load_elf() to parse ELF notes...
Checking PATCH 2/4: elf-ops.h: Add get_elf_note_type()...
WARNING: Block comments use a leading /* on a separate line
#21: FILE: include/hw/elf_ops.h:268:
+/* Given 'nhdr', a pointer to a range of ELF Notes, search through them

WARNING: Block comments use a leading /* on a separate line
#49: FILE: include/hw/elf_ops.h:296:
+        /* If the offset calculated in this iteration exceeds the

ERROR: code indent should never use tabs
#50: FILE: include/hw/elf_ops.h:297:
+^I * supplied size, we are done and no matching note was found.$

ERROR: code indent should never use tabs
#51: FILE: include/hw/elf_ops.h:298:
+^I */$

ERROR: code indent should never use tabs
#71: FILE: include/hw/elf_ops.h:558:
+^I    /* Search the ELF notes to find one with a type matching the$

WARNING: Block comments use a leading /* on a separate line
#71: FILE: include/hw/elf_ops.h:558:
+           /* Search the ELF notes to find one with a type matching the

ERROR: code indent should never use tabs
#72: FILE: include/hw/elf_ops.h:559:
+^I     * value passed in via 'translate_opaque'$

ERROR: code indent should never use tabs
#73: FILE: include/hw/elf_ops.h:560:
+^I     */$

ERROR: code indent should never use tabs
#75: FILE: include/hw/elf_ops.h:562:
+^I    assert(translate_opaque != NULL);$

total: 6 errors, 3 warnings, 62 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.

Checking PATCH 3/4: pvh: Add x86/HVM direct boot ABI header file...
WARNING: added, moved or deleted file(s), does MAINTAINERS need updating?
#22: 
new file mode 100644

WARNING: architecture specific defines should be avoided
#49: FILE: include/hw/xen/start_info.h:23:
+#ifndef __XEN_PUBLIC_ARCH_X86_HVM_START_INFO_H__

total: 0 errors, 2 warnings, 146 lines checked

Your patch has style problems, please review.  If any of these errors
are false positives report them to the maintainer, see
CHECKPATCH in MAINTAINERS.
Checking PATCH 4/4: pvh: Boot uncompressed kernel using direct boot ABI...
=== OUTPUT END ===

Test command exited with code: 1


The full log is available at
http://patchew.org/logs/1545422632-24444-1-git-send-email-liam.merwick@oracle.com/testing.checkpatch/?type=message.
---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
  2018-12-21 20:03   ` Liam Merwick
  (?)
  (?)
@ 2019-01-02 13:06   ` Stefan Hajnoczi
  2019-01-08 14:47       ` Liam Merwick
  -1 siblings, 1 reply; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:06 UTC (permalink / raw)
  To: Liam Merwick
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky

[-- Attachment #1: Type: text/plain, Size: 1670 bytes --]

On Fri, Dec 21, 2018 at 08:03:49PM +0000, Liam Merwick wrote:
> diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
> index 74679ff8da3a..37d20a3800c1 100644
> --- a/include/hw/elf_ops.h
> +++ b/include/hw/elf_ops.h
> @@ -266,6 +266,7 @@ fail:
>  }
>  
>  static int glue(load_elf, SZ)(const char *name, int fd,
> +                              uint64_t (*elf_note_fn)(void *, void *, bool),
>                                uint64_t (*translate_fn)(void *, uint64_t),
>                                void *translate_opaque,
>                                int must_swab, uint64_t *pentry,
> @@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>                  high = addr + mem_size;
>  
>              data = NULL;
> +
> +        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
> +            struct elf_note *nhdr = NULL;
> +
> +            file_size = ph->p_filesz; /* Size of the range of ELF notes */
> +            data = g_malloc0(file_size);
> +            if (ph->p_filesz > 0) {
> +                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
> +                    goto fail;
> +                }
> +                if (read(fd, data, file_size) != file_size) {
> +                    goto fail;
> +                }
> +            }
> +
> +            if (nhdr != NULL) {
> +                bool is64 =
> +                    sizeof(struct elf_note) == sizeof(struct elf64_note);
> +                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);

How does data get used?

> +            }
> +            g_free(data);

Missing data = NULL to prevent double free later?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
  2018-12-21 20:03   ` Liam Merwick
  (?)
@ 2019-01-02 13:06   ` Stefan Hajnoczi
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:06 UTC (permalink / raw)
  To: Liam Merwick
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare


[-- Attachment #1.1: Type: text/plain, Size: 1670 bytes --]

On Fri, Dec 21, 2018 at 08:03:49PM +0000, Liam Merwick wrote:
> diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
> index 74679ff8da3a..37d20a3800c1 100644
> --- a/include/hw/elf_ops.h
> +++ b/include/hw/elf_ops.h
> @@ -266,6 +266,7 @@ fail:
>  }
>  
>  static int glue(load_elf, SZ)(const char *name, int fd,
> +                              uint64_t (*elf_note_fn)(void *, void *, bool),
>                                uint64_t (*translate_fn)(void *, uint64_t),
>                                void *translate_opaque,
>                                int must_swab, uint64_t *pentry,
> @@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>                  high = addr + mem_size;
>  
>              data = NULL;
> +
> +        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
> +            struct elf_note *nhdr = NULL;
> +
> +            file_size = ph->p_filesz; /* Size of the range of ELF notes */
> +            data = g_malloc0(file_size);
> +            if (ph->p_filesz > 0) {
> +                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
> +                    goto fail;
> +                }
> +                if (read(fd, data, file_size) != file_size) {
> +                    goto fail;
> +                }
> +            }
> +
> +            if (nhdr != NULL) {
> +                bool is64 =
> +                    sizeof(struct elf_note) == sizeof(struct elf64_note);
> +                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);

How does data get used?

> +            }
> +            g_free(data);

Missing data = NULL to prevent double free later?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
  2018-12-21 20:03   ` Liam Merwick
@ 2019-01-02 13:12     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:12 UTC (permalink / raw)
  To: Liam Merwick
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky

[-- Attachment #1: Type: text/plain, Size: 1764 bytes --]

On Fri, Dec 21, 2018 at 08:03:50PM +0000, Liam Merwick wrote:
> +    while (note_type != elf_note_type) {
> +        nhdr_namesz = nhdr->n_namesz;
> +        nhdr_descsz = nhdr->n_descsz;
> +
> +        elf_note_entry_offset = nhdr_size +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
> +            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
> +
> +        /* If the offset calculated in this iteration exceeds the
> +	 * supplied size, we are done and no matching note was found.
> +	 */

Indentation is off here.  QEMU uses 4-space indentation.

> +        if (elf_note_entry_offset > note_size) {
> +            return NULL;
> +        }
> +
> +        /* skip to the next ELF Note entry */
> +        nhdr = (void *)nhdr + elf_note_entry_offset;
> +        note_type = nhdr->n_type;
> +    }
> +
> +    return nhdr;
> +}
> +
>  static int glue(load_elf, SZ)(const char *name, int fd,
>                                uint64_t (*elf_note_fn)(void *, void *, bool),
>                                uint64_t (*translate_fn)(void *, uint64_t),
> @@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>                  }
>              }
>  
> +	    /* Search the ELF notes to find one with a type matching the
> +	     * value passed in via 'translate_opaque'
> +	     */
> +            nhdr = (struct elf_note *)data;

Ah, I see data gets used here!  It would be clearer to move loading of
data into this patch.

> +	    assert(translate_opaque != NULL);
> +            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
> +                                               *(uint64_t *)translate_opaque);

Indentation is off in this hunk.  QEMU uses 4-space indentation.

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
@ 2019-01-02 13:12     ` Stefan Hajnoczi
  0 siblings, 0 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:12 UTC (permalink / raw)
  To: Liam Merwick
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare


[-- Attachment #1.1: Type: text/plain, Size: 1764 bytes --]

On Fri, Dec 21, 2018 at 08:03:50PM +0000, Liam Merwick wrote:
> +    while (note_type != elf_note_type) {
> +        nhdr_namesz = nhdr->n_namesz;
> +        nhdr_descsz = nhdr->n_descsz;
> +
> +        elf_note_entry_offset = nhdr_size +
> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
> +            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
> +
> +        /* If the offset calculated in this iteration exceeds the
> +	 * supplied size, we are done and no matching note was found.
> +	 */

Indentation is off here.  QEMU uses 4-space indentation.

> +        if (elf_note_entry_offset > note_size) {
> +            return NULL;
> +        }
> +
> +        /* skip to the next ELF Note entry */
> +        nhdr = (void *)nhdr + elf_note_entry_offset;
> +        note_type = nhdr->n_type;
> +    }
> +
> +    return nhdr;
> +}
> +
>  static int glue(load_elf, SZ)(const char *name, int fd,
>                                uint64_t (*elf_note_fn)(void *, void *, bool),
>                                uint64_t (*translate_fn)(void *, uint64_t),
> @@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>                  }
>              }
>  
> +	    /* Search the ELF notes to find one with a type matching the
> +	     * value passed in via 'translate_opaque'
> +	     */
> +            nhdr = (struct elf_note *)data;

Ah, I see data gets used here!  It would be clearer to move loading of
data into this patch.

> +	    assert(translate_opaque != NULL);
> +            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
> +                                               *(uint64_t *)translate_opaque);

Indentation is off in this hunk.  QEMU uses 4-space indentation.

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
  2018-12-21 20:03   ` Liam Merwick
@ 2019-01-02 13:18     ` Stefan Hajnoczi
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:18 UTC (permalink / raw)
  To: Liam Merwick
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky

[-- Attachment #1: Type: text/plain, Size: 344 bytes --]

On Fri, Dec 21, 2018 at 08:03:52PM +0000, Liam Merwick wrote:
> @@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
>      int linux_boot, i;
>      MemoryRegion *ram, *option_rom_mr;
>      MemoryRegion *ram_below_4g, *ram_above_4g;
> -    FWCfgState *fw_cfg;
> +    FWCfgState *fw_cfg = NULL;

What is the purpose of this change?

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
@ 2019-01-02 13:18     ` Stefan Hajnoczi
  0 siblings, 0 replies; 38+ messages in thread
From: Stefan Hajnoczi @ 2019-01-02 13:18 UTC (permalink / raw)
  To: Liam Merwick
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare


[-- Attachment #1.1: Type: text/plain, Size: 344 bytes --]

On Fri, Dec 21, 2018 at 08:03:52PM +0000, Liam Merwick wrote:
> @@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
>      int linux_boot, i;
>      MemoryRegion *ram, *option_rom_mr;
>      MemoryRegion *ram_below_4g, *ram_above_4g;
> -    FWCfgState *fw_cfg;
> +    FWCfgState *fw_cfg = NULL;

What is the purpose of this change?

[-- Attachment #1.2: signature.asc --]
[-- Type: application/pgp-signature, Size: 455 bytes --]

[-- Attachment #2: Type: text/plain, Size: 157 bytes --]

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2018-12-21 20:03 ` Liam Merwick
@ 2019-01-03 17:22   ` Stefano Garzarella
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-03 17:22 UTC (permalink / raw)
  To: Liam Merwick, maran.wilson; +Cc: qemu-devel, xen-devel, Stefan Hajnoczi

Hi Liam, Hi Maran,
I'm writing the optionrom to do PVH boot also with SeaBIOS.
It is almost complete and I'm testing it, but I have some issue with
QEMU -initrd parameter.
(It works correctly without -initrd and using a kernel with all needed
modules compiled statically)

Linux boots correctly, but it is not able to find the ramdisk. (I have
the same behavior with qboot)
Looking at Linux, QEMU, and qboot patches, I understood that the first
module pointed by 'modlist_paddr' in the 'hvm_start_info' should be
used to pass the ramdisk address and size to the kernel, but I didn't
understand who load it in RAM. (I guess QEMU directly or the firmware
by fw_cfg interface)

Can you give me some suggestions?

Thanks,
Stefano

On Fri, Dec 21, 2018 at 9:07 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>
> For certain applications it is desirable to rapidly boot a KVM virtual
> machine. In cases where legacy hardware and software support within the
> guest is not needed, QEMU should be able to boot directly into the
> uncompressed Linux kernel binary with minimal firmware involvement.
>
> There already exists an ABI to allow this for Xen PVH guests and the ABI
> is supported by Linux and FreeBSD:
>
>    https://xenbits.xen.org/docs/unstable/misc/pvh.html
>
> Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
> qboot pull request: https://github.com/bonzini/qboot/pull/17
>
> This patch series provides QEMU support to read the ELF header of an
> uncompressed kernel binary and get the 32-bit PVH kernel entry point
> from an ELF Note.  In load_linux() a call is made to load_elfboot()
> so see if the header matches that of an uncompressed kernel binary (ELF)
> and if so, loads the binary and determines the kernel entry address
> from an ELF Note in the binary.  Then qboot does futher initialisation
> of the guest (e820, etc.) and jumps to the kernel entry address and
> boots the guest.
>
> changes v1 -> v2
> - Based on feedback from Stefan Hajnoczi
> - The reading of the PVH entry point is now done in a single pass during
>   elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
>   and considerably reworked.
> - Patch1 adds a new optional function pointer to parse the ELF note type
>   (the type is passed in via the existing translate_opaque arg - the
>   function already had 11 args so I didn't want to add more than one new arg).
> - Patch2 adds a function to elf_ops.h to find an ELF note
>   matching a specific type
> - Patch3 just has a line added to the commit message to state that the Xen
>   repo is the canonical location
> - Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
>   just minor load_elfboot() changes and the addition of a
>   read_pvh_start_addr() helper function for load_elf()
>
>
> Usіng the method/scripts documented by the NEMU team at
>
>    https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
>    https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html
>
> below are some timings measured (vmlinux and bzImage from the same build)
> Time to get to kernel start is almost halved (95ṁs -> 48ms)
>
> QEMU + qboot + vmlinux (PVH + 4.20-rc4)
>  qemu_init_end: 41.550521
>  fw_start: 41.667139 (+0.116618)
>  fw_do_boot: 47.448495 (+5.781356)
>  linux_startup_64: 47.720785 (+0.27229)
>  linux_start_kernel: 48.399541 (+0.678756)
>  linux_start_user: 296.952056 (+248.552515)
>
> QEMU + qboot + bzImage:
>  qemu_init_end: 29.209276
>  fw_start: 29.317342 (+0.108066)
>  linux_start_boot: 36.679362 (+7.36202)
>  linux_startup_64: 94.531349 (+57.851987)
>  linux_start_kernel: 94.900913 (+0.369564)
>  linux_start_user: 401.060971 (+306.160058)
>
> QEMU + bzImage:
>  qemu_init_end: 30.424430
>  linux_startup_64: 893.770334 (+863.345904)
>  linux_start_kernel: 894.17049 (+0.400156)
>  linux_start_user: 1208.679768 (+314.509278)
>
>
> Liam Merwick (4):
>   elf: Add optional function ptr to load_elf() to parse ELF notes
>   elf-ops.h: Add get_elf_note_type()
>   pvh: Add x86/HVM direct boot ABI header file
>   pvh: Boot uncompressed kernel using direct boot ABI
>
>  hw/alpha/dp264.c               |   4 +-
>  hw/arm/armv7m.c                |   3 +-
>  hw/arm/boot.c                  |   2 +-
>  hw/core/generic-loader.c       |   2 +-
>  hw/core/loader.c               |  24 ++++---
>  hw/cris/boot.c                 |   3 +-
>  hw/hppa/machine.c              |   6 +-
>  hw/i386/multiboot.c            |   2 +-
>  hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
>  hw/lm32/lm32_boards.c          |   6 +-
>  hw/lm32/milkymist.c            |   3 +-
>  hw/m68k/an5206.c               |   2 +-
>  hw/m68k/mcf5208.c              |   2 +-
>  hw/microblaze/boot.c           |   7 +-
>  hw/mips/mips_fulong2e.c        |   5 +-
>  hw/mips/mips_malta.c           |   5 +-
>  hw/mips/mips_mipssim.c         |   5 +-
>  hw/mips/mips_r4k.c             |   5 +-
>  hw/moxie/moxiesim.c            |   2 +-
>  hw/nios2/boot.c                |   7 +-
>  hw/openrisc/openrisc_sim.c     |   2 +-
>  hw/pci-host/prep.c             |   2 +-
>  hw/ppc/e500.c                  |   3 +-
>  hw/ppc/mac_newworld.c          |   5 +-
>  hw/ppc/mac_oldworld.c          |   5 +-
>  hw/ppc/ppc440_bamboo.c         |   2 +-
>  hw/ppc/sam460ex.c              |   3 +-
>  hw/ppc/spapr.c                 |   7 +-
>  hw/ppc/virtex_ml507.c          |   2 +-
>  hw/riscv/sifive_e.c            |   2 +-
>  hw/riscv/sifive_u.c            |   2 +-
>  hw/riscv/spike.c               |   2 +-
>  hw/riscv/virt.c                |   2 +-
>  hw/s390x/ipl.c                 |   9 ++-
>  hw/sparc/leon3.c               |   3 +-
>  hw/sparc/sun4m.c               |   6 +-
>  hw/sparc64/sun4u.c             |   4 +-
>  hw/tricore/tricore_testboard.c |   2 +-
>  hw/xtensa/sim.c                |  12 ++--
>  hw/xtensa/xtfpga.c             |   2 +-
>  include/elf.h                  |  10 +++
>  include/hw/elf_ops.h           |  72 ++++++++++++++++++++
>  include/hw/loader.h            |   9 ++-
>  include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
>  44 files changed, 469 insertions(+), 71 deletions(-)
>  create mode 100644 include/hw/xen/start_info.h
>
> --
> 1.8.3.1
>

--
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-03 17:22   ` Stefano Garzarella
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-03 17:22 UTC (permalink / raw)
  To: Liam Merwick, maran.wilson; +Cc: xen-devel, qemu-devel, Stefan Hajnoczi

Hi Liam, Hi Maran,
I'm writing the optionrom to do PVH boot also with SeaBIOS.
It is almost complete and I'm testing it, but I have some issue with
QEMU -initrd parameter.
(It works correctly without -initrd and using a kernel with all needed
modules compiled statically)

Linux boots correctly, but it is not able to find the ramdisk. (I have
the same behavior with qboot)
Looking at Linux, QEMU, and qboot patches, I understood that the first
module pointed by 'modlist_paddr' in the 'hvm_start_info' should be
used to pass the ramdisk address and size to the kernel, but I didn't
understand who load it in RAM. (I guess QEMU directly or the firmware
by fw_cfg interface)

Can you give me some suggestions?

Thanks,
Stefano

On Fri, Dec 21, 2018 at 9:07 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>
> For certain applications it is desirable to rapidly boot a KVM virtual
> machine. In cases where legacy hardware and software support within the
> guest is not needed, QEMU should be able to boot directly into the
> uncompressed Linux kernel binary with minimal firmware involvement.
>
> There already exists an ABI to allow this for Xen PVH guests and the ABI
> is supported by Linux and FreeBSD:
>
>    https://xenbits.xen.org/docs/unstable/misc/pvh.html
>
> Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
> qboot pull request: https://github.com/bonzini/qboot/pull/17
>
> This patch series provides QEMU support to read the ELF header of an
> uncompressed kernel binary and get the 32-bit PVH kernel entry point
> from an ELF Note.  In load_linux() a call is made to load_elfboot()
> so see if the header matches that of an uncompressed kernel binary (ELF)
> and if so, loads the binary and determines the kernel entry address
> from an ELF Note in the binary.  Then qboot does futher initialisation
> of the guest (e820, etc.) and jumps to the kernel entry address and
> boots the guest.
>
> changes v1 -> v2
> - Based on feedback from Stefan Hajnoczi
> - The reading of the PVH entry point is now done in a single pass during
>   elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
>   and considerably reworked.
> - Patch1 adds a new optional function pointer to parse the ELF note type
>   (the type is passed in via the existing translate_opaque arg - the
>   function already had 11 args so I didn't want to add more than one new arg).
> - Patch2 adds a function to elf_ops.h to find an ELF note
>   matching a specific type
> - Patch3 just has a line added to the commit message to state that the Xen
>   repo is the canonical location
> - Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
>   just minor load_elfboot() changes and the addition of a
>   read_pvh_start_addr() helper function for load_elf()
>
>
> Usіng the method/scripts documented by the NEMU team at
>
>    https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
>    https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html
>
> below are some timings measured (vmlinux and bzImage from the same build)
> Time to get to kernel start is almost halved (95ṁs -> 48ms)
>
> QEMU + qboot + vmlinux (PVH + 4.20-rc4)
>  qemu_init_end: 41.550521
>  fw_start: 41.667139 (+0.116618)
>  fw_do_boot: 47.448495 (+5.781356)
>  linux_startup_64: 47.720785 (+0.27229)
>  linux_start_kernel: 48.399541 (+0.678756)
>  linux_start_user: 296.952056 (+248.552515)
>
> QEMU + qboot + bzImage:
>  qemu_init_end: 29.209276
>  fw_start: 29.317342 (+0.108066)
>  linux_start_boot: 36.679362 (+7.36202)
>  linux_startup_64: 94.531349 (+57.851987)
>  linux_start_kernel: 94.900913 (+0.369564)
>  linux_start_user: 401.060971 (+306.160058)
>
> QEMU + bzImage:
>  qemu_init_end: 30.424430
>  linux_startup_64: 893.770334 (+863.345904)
>  linux_start_kernel: 894.17049 (+0.400156)
>  linux_start_user: 1208.679768 (+314.509278)
>
>
> Liam Merwick (4):
>   elf: Add optional function ptr to load_elf() to parse ELF notes
>   elf-ops.h: Add get_elf_note_type()
>   pvh: Add x86/HVM direct boot ABI header file
>   pvh: Boot uncompressed kernel using direct boot ABI
>
>  hw/alpha/dp264.c               |   4 +-
>  hw/arm/armv7m.c                |   3 +-
>  hw/arm/boot.c                  |   2 +-
>  hw/core/generic-loader.c       |   2 +-
>  hw/core/loader.c               |  24 ++++---
>  hw/cris/boot.c                 |   3 +-
>  hw/hppa/machine.c              |   6 +-
>  hw/i386/multiboot.c            |   2 +-
>  hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
>  hw/lm32/lm32_boards.c          |   6 +-
>  hw/lm32/milkymist.c            |   3 +-
>  hw/m68k/an5206.c               |   2 +-
>  hw/m68k/mcf5208.c              |   2 +-
>  hw/microblaze/boot.c           |   7 +-
>  hw/mips/mips_fulong2e.c        |   5 +-
>  hw/mips/mips_malta.c           |   5 +-
>  hw/mips/mips_mipssim.c         |   5 +-
>  hw/mips/mips_r4k.c             |   5 +-
>  hw/moxie/moxiesim.c            |   2 +-
>  hw/nios2/boot.c                |   7 +-
>  hw/openrisc/openrisc_sim.c     |   2 +-
>  hw/pci-host/prep.c             |   2 +-
>  hw/ppc/e500.c                  |   3 +-
>  hw/ppc/mac_newworld.c          |   5 +-
>  hw/ppc/mac_oldworld.c          |   5 +-
>  hw/ppc/ppc440_bamboo.c         |   2 +-
>  hw/ppc/sam460ex.c              |   3 +-
>  hw/ppc/spapr.c                 |   7 +-
>  hw/ppc/virtex_ml507.c          |   2 +-
>  hw/riscv/sifive_e.c            |   2 +-
>  hw/riscv/sifive_u.c            |   2 +-
>  hw/riscv/spike.c               |   2 +-
>  hw/riscv/virt.c                |   2 +-
>  hw/s390x/ipl.c                 |   9 ++-
>  hw/sparc/leon3.c               |   3 +-
>  hw/sparc/sun4m.c               |   6 +-
>  hw/sparc64/sun4u.c             |   4 +-
>  hw/tricore/tricore_testboard.c |   2 +-
>  hw/xtensa/sim.c                |  12 ++--
>  hw/xtensa/xtfpga.c             |   2 +-
>  include/elf.h                  |  10 +++
>  include/hw/elf_ops.h           |  72 ++++++++++++++++++++
>  include/hw/loader.h            |   9 ++-
>  include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
>  44 files changed, 469 insertions(+), 71 deletions(-)
>  create mode 100644 include/hw/xen/start_info.h
>
> --
> 1.8.3.1
>

--
Stefano Garzarella
Red Hat

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-03 17:22   ` Stefano Garzarella
@ 2019-01-08 14:47     ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefano Garzarella, maran.wilson
  Cc: qemu-devel, xen-devel, Stefan Hajnoczi, George Kennedy, Boris Ostrovsky

Hi Stefano,

[ Catching up on mail after vacation ]

On 03/01/2019 17:22, Stefano Garzarella wrote:
> Hi Liam, Hi Maran,
> I'm writing the optionrom to do PVH boot also with SeaBIOS.
> It is almost complete and I'm testing it, but I have some issue with
> QEMU -initrd parameter.
> (It works correctly without -initrd and using a kernel with all needed
> modules compiled statically)
> 
> Linux boots correctly, but it is not able to find the ramdisk. (I have
> the same behavior with qboot)
> Looking at Linux, QEMU, and qboot patches, I understood that the first
> module pointed by 'modlist_paddr' in the 'hvm_start_info' should be
> used to pass the ramdisk address and size to the kernel, but I didn't
> understand who load it in RAM. (I guess QEMU directly or the firmware
> by fw_cfg interface)
> 
> Can you give me some suggestions?
> 


QEMU sets the hvm_modlist_entry in load_linux() after the call to 
load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()

But the current PVH patches don't handle initrd (they have 
start_info.nr_modules == 1).

During (or after) the call to load_elfboot() it looks like we'd need to 
do something like what load_multiboot() does below (along with the 
associated initialisation)

400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
403                      sizeof(bootinfo));

I'm checking to see if that has any implications for the kernel side.

Regards,
Liam

> 
> On Fri, Dec 21, 2018 at 9:07 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>
>> For certain applications it is desirable to rapidly boot a KVM virtual
>> machine. In cases where legacy hardware and software support within the
>> guest is not needed, QEMU should be able to boot directly into the
>> uncompressed Linux kernel binary with minimal firmware involvement.
>>
>> There already exists an ABI to allow this for Xen PVH guests and the ABI
>> is supported by Linux and FreeBSD:
>>
>>     https://xenbits.xen.org/docs/unstable/misc/pvh.html
>>
>> Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
>> qboot pull request: https://github.com/bonzini/qboot/pull/17
>>
>> This patch series provides QEMU support to read the ELF header of an
>> uncompressed kernel binary and get the 32-bit PVH kernel entry point
>> from an ELF Note.  In load_linux() a call is made to load_elfboot()
>> so see if the header matches that of an uncompressed kernel binary (ELF)
>> and if so, loads the binary and determines the kernel entry address
>> from an ELF Note in the binary.  Then qboot does futher initialisation
>> of the guest (e820, etc.) and jumps to the kernel entry address and
>> boots the guest.
>>
>> changes v1 -> v2
>> - Based on feedback from Stefan Hajnoczi
>> - The reading of the PVH entry point is now done in a single pass during
>>    elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
>>    and considerably reworked.
>> - Patch1 adds a new optional function pointer to parse the ELF note type
>>    (the type is passed in via the existing translate_opaque arg - the
>>    function already had 11 args so I didn't want to add more than one new arg).
>> - Patch2 adds a function to elf_ops.h to find an ELF note
>>    matching a specific type
>> - Patch3 just has a line added to the commit message to state that the Xen
>>    repo is the canonical location
>> - Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
>>    just minor load_elfboot() changes and the addition of a
>>    read_pvh_start_addr() helper function for load_elf()
>>
>>
>> Usіng the method/scripts documented by the NEMU team at
>>
>>     https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
>>     https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html
>>
>> below are some timings measured (vmlinux and bzImage from the same build)
>> Time to get to kernel start is almost halved (95ṁs -> 48ms)
>>
>> QEMU + qboot + vmlinux (PVH + 4.20-rc4)
>>   qemu_init_end: 41.550521
>>   fw_start: 41.667139 (+0.116618)
>>   fw_do_boot: 47.448495 (+5.781356)
>>   linux_startup_64: 47.720785 (+0.27229)
>>   linux_start_kernel: 48.399541 (+0.678756)
>>   linux_start_user: 296.952056 (+248.552515)
>>
>> QEMU + qboot + bzImage:
>>   qemu_init_end: 29.209276
>>   fw_start: 29.317342 (+0.108066)
>>   linux_start_boot: 36.679362 (+7.36202)
>>   linux_startup_64: 94.531349 (+57.851987)
>>   linux_start_kernel: 94.900913 (+0.369564)
>>   linux_start_user: 401.060971 (+306.160058)
>>
>> QEMU + bzImage:
>>   qemu_init_end: 30.424430
>>   linux_startup_64: 893.770334 (+863.345904)
>>   linux_start_kernel: 894.17049 (+0.400156)
>>   linux_start_user: 1208.679768 (+314.509278)
>>
>>
>> Liam Merwick (4):
>>    elf: Add optional function ptr to load_elf() to parse ELF notes
>>    elf-ops.h: Add get_elf_note_type()
>>    pvh: Add x86/HVM direct boot ABI header file
>>    pvh: Boot uncompressed kernel using direct boot ABI
>>
>>   hw/alpha/dp264.c               |   4 +-
>>   hw/arm/armv7m.c                |   3 +-
>>   hw/arm/boot.c                  |   2 +-
>>   hw/core/generic-loader.c       |   2 +-
>>   hw/core/loader.c               |  24 ++++---
>>   hw/cris/boot.c                 |   3 +-
>>   hw/hppa/machine.c              |   6 +-
>>   hw/i386/multiboot.c            |   2 +-
>>   hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
>>   hw/lm32/lm32_boards.c          |   6 +-
>>   hw/lm32/milkymist.c            |   3 +-
>>   hw/m68k/an5206.c               |   2 +-
>>   hw/m68k/mcf5208.c              |   2 +-
>>   hw/microblaze/boot.c           |   7 +-
>>   hw/mips/mips_fulong2e.c        |   5 +-
>>   hw/mips/mips_malta.c           |   5 +-
>>   hw/mips/mips_mipssim.c         |   5 +-
>>   hw/mips/mips_r4k.c             |   5 +-
>>   hw/moxie/moxiesim.c            |   2 +-
>>   hw/nios2/boot.c                |   7 +-
>>   hw/openrisc/openrisc_sim.c     |   2 +-
>>   hw/pci-host/prep.c             |   2 +-
>>   hw/ppc/e500.c                  |   3 +-
>>   hw/ppc/mac_newworld.c          |   5 +-
>>   hw/ppc/mac_oldworld.c          |   5 +-
>>   hw/ppc/ppc440_bamboo.c         |   2 +-
>>   hw/ppc/sam460ex.c              |   3 +-
>>   hw/ppc/spapr.c                 |   7 +-
>>   hw/ppc/virtex_ml507.c          |   2 +-
>>   hw/riscv/sifive_e.c            |   2 +-
>>   hw/riscv/sifive_u.c            |   2 +-
>>   hw/riscv/spike.c               |   2 +-
>>   hw/riscv/virt.c                |   2 +-
>>   hw/s390x/ipl.c                 |   9 ++-
>>   hw/sparc/leon3.c               |   3 +-
>>   hw/sparc/sun4m.c               |   6 +-
>>   hw/sparc64/sun4u.c             |   4 +-
>>   hw/tricore/tricore_testboard.c |   2 +-
>>   hw/xtensa/sim.c                |  12 ++--
>>   hw/xtensa/xtfpga.c             |   2 +-
>>   include/elf.h                  |  10 +++
>>   include/hw/elf_ops.h           |  72 ++++++++++++++++++++
>>   include/hw/loader.h            |   9 ++-
>>   include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
>>   44 files changed, 469 insertions(+), 71 deletions(-)
>>   create mode 100644 include/hw/xen/start_info.h
>>
>> --
>> 1.8.3.1
>>
> 
> --
> Stefano Garzarella
> Red Hat
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-08 14:47     ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefano Garzarella, maran.wilson
  Cc: xen-devel, Boris Ostrovsky, qemu-devel, Stefan Hajnoczi, George Kennedy

Hi Stefano,

[ Catching up on mail after vacation ]

On 03/01/2019 17:22, Stefano Garzarella wrote:
> Hi Liam, Hi Maran,
> I'm writing the optionrom to do PVH boot also with SeaBIOS.
> It is almost complete and I'm testing it, but I have some issue with
> QEMU -initrd parameter.
> (It works correctly without -initrd and using a kernel with all needed
> modules compiled statically)
> 
> Linux boots correctly, but it is not able to find the ramdisk. (I have
> the same behavior with qboot)
> Looking at Linux, QEMU, and qboot patches, I understood that the first
> module pointed by 'modlist_paddr' in the 'hvm_start_info' should be
> used to pass the ramdisk address and size to the kernel, but I didn't
> understand who load it in RAM. (I guess QEMU directly or the firmware
> by fw_cfg interface)
> 
> Can you give me some suggestions?
> 


QEMU sets the hvm_modlist_entry in load_linux() after the call to 
load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()

But the current PVH patches don't handle initrd (they have 
start_info.nr_modules == 1).

During (or after) the call to load_elfboot() it looks like we'd need to 
do something like what load_multiboot() does below (along with the 
associated initialisation)

400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
403                      sizeof(bootinfo));

I'm checking to see if that has any implications for the kernel side.

Regards,
Liam

> 
> On Fri, Dec 21, 2018 at 9:07 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>
>> For certain applications it is desirable to rapidly boot a KVM virtual
>> machine. In cases where legacy hardware and software support within the
>> guest is not needed, QEMU should be able to boot directly into the
>> uncompressed Linux kernel binary with minimal firmware involvement.
>>
>> There already exists an ABI to allow this for Xen PVH guests and the ABI
>> is supported by Linux and FreeBSD:
>>
>>     https://xenbits.xen.org/docs/unstable/misc/pvh.html
>>
>> Details on the Linux changes (v9 staged for 4.21): https://lkml.org/lkml/2018/12/14/1330
>> qboot pull request: https://github.com/bonzini/qboot/pull/17
>>
>> This patch series provides QEMU support to read the ELF header of an
>> uncompressed kernel binary and get the 32-bit PVH kernel entry point
>> from an ELF Note.  In load_linux() a call is made to load_elfboot()
>> so see if the header matches that of an uncompressed kernel binary (ELF)
>> and if so, loads the binary and determines the kernel entry address
>> from an ELF Note in the binary.  Then qboot does futher initialisation
>> of the guest (e820, etc.) and jumps to the kernel entry address and
>> boots the guest.
>>
>> changes v1 -> v2
>> - Based on feedback from Stefan Hajnoczi
>> - The reading of the PVH entry point is now done in a single pass during
>>    elf_load() which results in Patch2 in v1 being split into Patches 1&2 in v2
>>    and considerably reworked.
>> - Patch1 adds a new optional function pointer to parse the ELF note type
>>    (the type is passed in via the existing translate_opaque arg - the
>>    function already had 11 args so I didn't want to add more than one new arg).
>> - Patch2 adds a function to elf_ops.h to find an ELF note
>>    matching a specific type
>> - Patch3 just has a line added to the commit message to state that the Xen
>>    repo is the canonical location
>> - Patch4 (that does the PVH boot) is mainly equivalent to Patch3 in v1
>>    just minor load_elfboot() changes and the addition of a
>>    read_pvh_start_addr() helper function for load_elf()
>>
>>
>> Usіng the method/scripts documented by the NEMU team at
>>
>>     https://github.com/intel/nemu/wiki/Measuring-Boot-Latency
>>     https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg00200.html
>>
>> below are some timings measured (vmlinux and bzImage from the same build)
>> Time to get to kernel start is almost halved (95ṁs -> 48ms)
>>
>> QEMU + qboot + vmlinux (PVH + 4.20-rc4)
>>   qemu_init_end: 41.550521
>>   fw_start: 41.667139 (+0.116618)
>>   fw_do_boot: 47.448495 (+5.781356)
>>   linux_startup_64: 47.720785 (+0.27229)
>>   linux_start_kernel: 48.399541 (+0.678756)
>>   linux_start_user: 296.952056 (+248.552515)
>>
>> QEMU + qboot + bzImage:
>>   qemu_init_end: 29.209276
>>   fw_start: 29.317342 (+0.108066)
>>   linux_start_boot: 36.679362 (+7.36202)
>>   linux_startup_64: 94.531349 (+57.851987)
>>   linux_start_kernel: 94.900913 (+0.369564)
>>   linux_start_user: 401.060971 (+306.160058)
>>
>> QEMU + bzImage:
>>   qemu_init_end: 30.424430
>>   linux_startup_64: 893.770334 (+863.345904)
>>   linux_start_kernel: 894.17049 (+0.400156)
>>   linux_start_user: 1208.679768 (+314.509278)
>>
>>
>> Liam Merwick (4):
>>    elf: Add optional function ptr to load_elf() to parse ELF notes
>>    elf-ops.h: Add get_elf_note_type()
>>    pvh: Add x86/HVM direct boot ABI header file
>>    pvh: Boot uncompressed kernel using direct boot ABI
>>
>>   hw/alpha/dp264.c               |   4 +-
>>   hw/arm/armv7m.c                |   3 +-
>>   hw/arm/boot.c                  |   2 +-
>>   hw/core/generic-loader.c       |   2 +-
>>   hw/core/loader.c               |  24 ++++---
>>   hw/cris/boot.c                 |   3 +-
>>   hw/hppa/machine.c              |   6 +-
>>   hw/i386/multiboot.c            |   2 +-
>>   hw/i386/pc.c                   | 131 +++++++++++++++++++++++++++++++++++-
>>   hw/lm32/lm32_boards.c          |   6 +-
>>   hw/lm32/milkymist.c            |   3 +-
>>   hw/m68k/an5206.c               |   2 +-
>>   hw/m68k/mcf5208.c              |   2 +-
>>   hw/microblaze/boot.c           |   7 +-
>>   hw/mips/mips_fulong2e.c        |   5 +-
>>   hw/mips/mips_malta.c           |   5 +-
>>   hw/mips/mips_mipssim.c         |   5 +-
>>   hw/mips/mips_r4k.c             |   5 +-
>>   hw/moxie/moxiesim.c            |   2 +-
>>   hw/nios2/boot.c                |   7 +-
>>   hw/openrisc/openrisc_sim.c     |   2 +-
>>   hw/pci-host/prep.c             |   2 +-
>>   hw/ppc/e500.c                  |   3 +-
>>   hw/ppc/mac_newworld.c          |   5 +-
>>   hw/ppc/mac_oldworld.c          |   5 +-
>>   hw/ppc/ppc440_bamboo.c         |   2 +-
>>   hw/ppc/sam460ex.c              |   3 +-
>>   hw/ppc/spapr.c                 |   7 +-
>>   hw/ppc/virtex_ml507.c          |   2 +-
>>   hw/riscv/sifive_e.c            |   2 +-
>>   hw/riscv/sifive_u.c            |   2 +-
>>   hw/riscv/spike.c               |   2 +-
>>   hw/riscv/virt.c                |   2 +-
>>   hw/s390x/ipl.c                 |   9 ++-
>>   hw/sparc/leon3.c               |   3 +-
>>   hw/sparc/sun4m.c               |   6 +-
>>   hw/sparc64/sun4u.c             |   4 +-
>>   hw/tricore/tricore_testboard.c |   2 +-
>>   hw/xtensa/sim.c                |  12 ++--
>>   hw/xtensa/xtfpga.c             |   2 +-
>>   include/elf.h                  |  10 +++
>>   include/hw/elf_ops.h           |  72 ++++++++++++++++++++
>>   include/hw/loader.h            |   9 ++-
>>   include/hw/xen/start_info.h    | 146 +++++++++++++++++++++++++++++++++++++++++
>>   44 files changed, 469 insertions(+), 71 deletions(-)
>>   create mode 100644 include/hw/xen/start_info.h
>>
>> --
>> 1.8.3.1
>>
> 
> --
> Stefano Garzarella
> Red Hat
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
  2019-01-02 13:06   ` [Qemu-devel] " Stefan Hajnoczi
@ 2019-01-08 14:47       ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky



On 02/01/2019 13:06, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:49PM +0000, Liam Merwick wrote:
>> diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
>> index 74679ff8da3a..37d20a3800c1 100644
>> --- a/include/hw/elf_ops.h
>> +++ b/include/hw/elf_ops.h
>> @@ -266,6 +266,7 @@ fail:
>>   }
>>   
>>   static int glue(load_elf, SZ)(const char *name, int fd,
>> +                              uint64_t (*elf_note_fn)(void *, void *, bool),
>>                                 uint64_t (*translate_fn)(void *, uint64_t),
>>                                 void *translate_opaque,
>>                                 int must_swab, uint64_t *pentry,
>> @@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>>                   high = addr + mem_size;
>>   
>>               data = NULL;
>> +
>> +        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
>> +            struct elf_note *nhdr = NULL;
>> +
>> +            file_size = ph->p_filesz; /* Size of the range of ELF notes */
>> +            data = g_malloc0(file_size);
>> +            if (ph->p_filesz > 0) {
>> +                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
>> +                    goto fail;
>> +                }
>> +                if (read(fd, data, file_size) != file_size) {
>> +                    goto fail;
>> +                }
>> +            }
>> +
>> +            if (nhdr != NULL) {
>> +                bool is64 =
>> +                    sizeof(struct elf_note) == sizeof(struct elf64_note);
>> +                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);
> 
> How does data get used?

Moved (as suggested in comments for next patch)

> 
>> +            }
>> +            g_free(data);
> 
> Missing data = NULL to prevent double free later?
> 

Added explicit assignment.

Regards,
Liam

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes
@ 2019-01-08 14:47       ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare



On 02/01/2019 13:06, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:49PM +0000, Liam Merwick wrote:
>> diff --git a/include/hw/elf_ops.h b/include/hw/elf_ops.h
>> index 74679ff8da3a..37d20a3800c1 100644
>> --- a/include/hw/elf_ops.h
>> +++ b/include/hw/elf_ops.h
>> @@ -266,6 +266,7 @@ fail:
>>   }
>>   
>>   static int glue(load_elf, SZ)(const char *name, int fd,
>> +                              uint64_t (*elf_note_fn)(void *, void *, bool),
>>                                 uint64_t (*translate_fn)(void *, uint64_t),
>>                                 void *translate_opaque,
>>                                 int must_swab, uint64_t *pentry,
>> @@ -496,8 +497,30 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>>                   high = addr + mem_size;
>>   
>>               data = NULL;
>> +
>> +        } else if (ph->p_type == PT_NOTE && elf_note_fn) {
>> +            struct elf_note *nhdr = NULL;
>> +
>> +            file_size = ph->p_filesz; /* Size of the range of ELF notes */
>> +            data = g_malloc0(file_size);
>> +            if (ph->p_filesz > 0) {
>> +                if (lseek(fd, ph->p_offset, SEEK_SET) < 0) {
>> +                    goto fail;
>> +                }
>> +                if (read(fd, data, file_size) != file_size) {
>> +                    goto fail;
>> +                }
>> +            }
>> +
>> +            if (nhdr != NULL) {
>> +                bool is64 =
>> +                    sizeof(struct elf_note) == sizeof(struct elf64_note);
>> +                elf_note_fn((void *)nhdr, (void *)&ph->p_align, is64);
> 
> How does data get used?

Moved (as suggested in comments for next patch)

> 
>> +            }
>> +            g_free(data);
> 
> Missing data = NULL to prevent double free later?
> 

Added explicit assignment.

Regards,
Liam

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
  2019-01-02 13:12     ` Stefan Hajnoczi
@ 2019-01-08 14:47       ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky



On 02/01/2019 13:12, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:50PM +0000, Liam Merwick wrote:
>> +    while (note_type != elf_note_type) {
>> +        nhdr_namesz = nhdr->n_namesz;
>> +        nhdr_descsz = nhdr->n_descsz;
>> +
>> +        elf_note_entry_offset = nhdr_size +
>> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
>> +            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
>> +
>> +        /* If the offset calculated in this iteration exceeds the
>> +	 * supplied size, we are done and no matching note was found.
>> +	 */
> 
> Indentation is off here.  QEMU uses 4-space indentation.
> 
>> +        if (elf_note_entry_offset > note_size) {
>> +            return NULL;
>> +        }
>> +
>> +        /* skip to the next ELF Note entry */
>> +        nhdr = (void *)nhdr + elf_note_entry_offset;
>> +        note_type = nhdr->n_type;
>> +    }
>> +
>> +    return nhdr;
>> +}
>> +
>>   static int glue(load_elf, SZ)(const char *name, int fd,
>>                                 uint64_t (*elf_note_fn)(void *, void *, bool),
>>                                 uint64_t (*translate_fn)(void *, uint64_t),
>> @@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>>                   }
>>               }
>>   
>> +	    /* Search the ELF notes to find one with a type matching the
>> +	     * value passed in via 'translate_opaque'
>> +	     */
>> +            nhdr = (struct elf_note *)data;
> 
> Ah, I see data gets used here!  It would be clearer to move loading of
> data into this patch.


Moved.

> 
>> +	    assert(translate_opaque != NULL);
>> +            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
>> +                                               *(uint64_t *)translate_opaque);
> 
> Indentation is off in this hunk.  QEMU uses 4-space indentation.
> 

A few stray tabs had snuck in - I've fixed all those.

Regards,
Liam

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 2/4] elf-ops.h: Add get_elf_note_type()
@ 2019-01-08 14:47       ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:47 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare



On 02/01/2019 13:12, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:50PM +0000, Liam Merwick wrote:
>> +    while (note_type != elf_note_type) {
>> +        nhdr_namesz = nhdr->n_namesz;
>> +        nhdr_descsz = nhdr->n_descsz;
>> +
>> +        elf_note_entry_offset = nhdr_size +
>> +            QEMU_ALIGN_UP(nhdr_namesz, phdr_align) +
>> +            QEMU_ALIGN_UP(nhdr_descsz, phdr_align);
>> +
>> +        /* If the offset calculated in this iteration exceeds the
>> +	 * supplied size, we are done and no matching note was found.
>> +	 */
> 
> Indentation is off here.  QEMU uses 4-space indentation.
> 
>> +        if (elf_note_entry_offset > note_size) {
>> +            return NULL;
>> +        }
>> +
>> +        /* skip to the next ELF Note entry */
>> +        nhdr = (void *)nhdr + elf_note_entry_offset;
>> +        note_type = nhdr->n_type;
>> +    }
>> +
>> +    return nhdr;
>> +}
>> +
>>   static int glue(load_elf, SZ)(const char *name, int fd,
>>                                 uint64_t (*elf_note_fn)(void *, void *, bool),
>>                                 uint64_t (*translate_fn)(void *, uint64_t),
>> @@ -512,6 +555,13 @@ static int glue(load_elf, SZ)(const char *name, int fd,
>>                   }
>>               }
>>   
>> +	    /* Search the ELF notes to find one with a type matching the
>> +	     * value passed in via 'translate_opaque'
>> +	     */
>> +            nhdr = (struct elf_note *)data;
> 
> Ah, I see data gets used here!  It would be clearer to move loading of
> data into this patch.


Moved.

> 
>> +	    assert(translate_opaque != NULL);
>> +            nhdr = glue(get_elf_note_type, SZ)(nhdr, file_size, ph->p_align,
>> +                                               *(uint64_t *)translate_opaque);
> 
> Indentation is off in this hunk.  QEMU uses 4-space indentation.
> 

A few stray tabs had snuck in - I've fixed all those.

Regards,
Liam

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
  2019-01-02 13:18     ` Stefan Hajnoczi
@ 2019-01-08 14:48       ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: qemu-devel, pbonzini, ehabkost, rth, xen-devel, sgarzare, mst,
	maran.wilson, george.kennedy, boris.ostrovsky



On 02/01/2019 13:18, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:52PM +0000, Liam Merwick wrote:
>> @@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
>>       int linux_boot, i;
>>       MemoryRegion *ram, *option_rom_mr;
>>       MemoryRegion *ram_below_4g, *ram_above_4g;
>> -    FWCfgState *fw_cfg;
>> +    FWCfgState *fw_cfg = NULL;
> 
> What is the purpose of this change?
> 

I've removed this. There is no need for it - it dated from when these 
changes used the Clear Containers -nofw patches.

Regards,
Liam

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI
@ 2019-01-08 14:48       ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-08 14:48 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: ehabkost, mst, maran.wilson, qemu-devel, george.kennedy,
	xen-devel, pbonzini, boris.ostrovsky, rth, sgarzare



On 02/01/2019 13:18, Stefan Hajnoczi wrote:
> On Fri, Dec 21, 2018 at 08:03:52PM +0000, Liam Merwick wrote:
>> @@ -1336,7 +1470,7 @@ void pc_memory_init(PCMachineState *pcms,
>>       int linux_boot, i;
>>       MemoryRegion *ram, *option_rom_mr;
>>       MemoryRegion *ram_below_4g, *ram_above_4g;
>> -    FWCfgState *fw_cfg;
>> +    FWCfgState *fw_cfg = NULL;
> 
> What is the purpose of this change?
> 

I've removed this. There is no need for it - it dated from when these 
changes used the Clear Containers -nofw patches.

Regards,
Liam

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-08 14:47     ` Liam Merwick
@ 2019-01-09 11:53       ` Stefano Garzarella
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-09 11:53 UTC (permalink / raw)
  To: Liam Merwick
  Cc: maran.wilson, qemu-devel, xen-devel, Stefan Hajnoczi,
	George Kennedy, Boris Ostrovsky

Hi Liam,

On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
> QEMU sets the hvm_modlist_entry in load_linux() after the call to
> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>
> But the current PVH patches don't handle initrd (they have
> start_info.nr_modules == 1).

Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
    /* The first module is always ramdisk. */
    if (pvh_start_info.nr_modules) {
        struct hvm_modlist_entry *modaddr =
            __va(pvh_start_info.modlist_paddr);
        pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
        pvh_bootparams.hdr.ramdisk_size = modaddr->size;
    }

So, putting start_info.nr_modules = 1, means that the first
hvm_modlist_entry should have the ramdisk paddr and size. Is it
correct?


>
> During (or after) the call to load_elfboot() it looks like we'd need to
> do something like what load_multiboot() does below (along with the
> associated initialisation)
>
> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
> 403                      sizeof(bootinfo));
>

In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
varibales to the guest, maybe we could add something like what
linux_load() does:

    /* load initrd */
    if (initrd_filename) {
        ...
        initrd_addr = (initrd_max-initrd_size) & ~4095;

        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
        ...
    }

Then we can load the initrd in qboot or in the optionrom that I'm writing.

What do you think?

Thanks,
Stefano

> I'm checking to see if that has any implications for the kernel side.
>
> Regards,
> Liam
>


-- 
Stefano Garzarella
Red Hat

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-09 11:53       ` Stefano Garzarella
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-09 11:53 UTC (permalink / raw)
  To: Liam Merwick
  Cc: maran.wilson, qemu-devel, George Kennedy, Stefan Hajnoczi,
	xen-devel, Boris Ostrovsky

Hi Liam,

On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
> QEMU sets the hvm_modlist_entry in load_linux() after the call to
> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>
> But the current PVH patches don't handle initrd (they have
> start_info.nr_modules == 1).

Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
    /* The first module is always ramdisk. */
    if (pvh_start_info.nr_modules) {
        struct hvm_modlist_entry *modaddr =
            __va(pvh_start_info.modlist_paddr);
        pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
        pvh_bootparams.hdr.ramdisk_size = modaddr->size;
    }

So, putting start_info.nr_modules = 1, means that the first
hvm_modlist_entry should have the ramdisk paddr and size. Is it
correct?


>
> During (or after) the call to load_elfboot() it looks like we'd need to
> do something like what load_multiboot() does below (along with the
> associated initialisation)
>
> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
> 403                      sizeof(bootinfo));
>

In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
varibales to the guest, maybe we could add something like what
linux_load() does:

    /* load initrd */
    if (initrd_filename) {
        ...
        initrd_addr = (initrd_max-initrd_size) & ~4095;

        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
        fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
        fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
        ...
    }

Then we can load the initrd in qboot or in the optionrom that I'm writing.

What do you think?

Thanks,
Stefano

> I'm checking to see if that has any implications for the kernel side.
>
> Regards,
> Liam
>


-- 
Stefano Garzarella
Red Hat

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-09 11:53       ` Stefano Garzarella
@ 2019-01-09 19:53         ` Boris Ostrovsky
  -1 siblings, 0 replies; 38+ messages in thread
From: Boris Ostrovsky @ 2019-01-09 19:53 UTC (permalink / raw)
  To: Stefano Garzarella, Liam Merwick
  Cc: maran.wilson, qemu-devel, xen-devel, Stefan Hajnoczi, George Kennedy

On 1/9/19 6:53 AM, Stefano Garzarella wrote:
> Hi Liam,
>
> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>
>> But the current PVH patches don't handle initrd (they have
>> start_info.nr_modules == 1).
> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>     /* The first module is always ramdisk. */
>     if (pvh_start_info.nr_modules) {
>         struct hvm_modlist_entry *modaddr =
>             __va(pvh_start_info.modlist_paddr);
>         pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>         pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>     }
>
> So, putting start_info.nr_modules = 1, means that the first
> hvm_modlist_entry should have the ramdisk paddr and size. Is it
> correct?
>
>
>> During (or after) the call to load_elfboot() it looks like we'd need to
>> do something like what load_multiboot() does below (along with the
>> associated initialisation)
>>
>> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
>> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
>> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
>> 403                      sizeof(bootinfo));
>>
> In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
> varibales to the guest, maybe we could add something like what
> linux_load() does:
>
>     /* load initrd */
>     if (initrd_filename) {
>         ...
>         initrd_addr = (initrd_max-initrd_size) & ~4095;
>
>         fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>         fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>         fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
>         ...
>     }
>
> Then we can load the initrd in qboot or in the optionrom that I'm writing.
>
> What do you think?


Why not specify this in pvh_start_info? This will be much faster for
everyone, no need to go through fw_cfg.

-boris

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-09 19:53         ` Boris Ostrovsky
  0 siblings, 0 replies; 38+ messages in thread
From: Boris Ostrovsky @ 2019-01-09 19:53 UTC (permalink / raw)
  To: Stefano Garzarella, Liam Merwick
  Cc: xen-devel, George Kennedy, qemu-devel, Stefan Hajnoczi, maran.wilson

On 1/9/19 6:53 AM, Stefano Garzarella wrote:
> Hi Liam,
>
> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>
>> But the current PVH patches don't handle initrd (they have
>> start_info.nr_modules == 1).
> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>     /* The first module is always ramdisk. */
>     if (pvh_start_info.nr_modules) {
>         struct hvm_modlist_entry *modaddr =
>             __va(pvh_start_info.modlist_paddr);
>         pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>         pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>     }
>
> So, putting start_info.nr_modules = 1, means that the first
> hvm_modlist_entry should have the ramdisk paddr and size. Is it
> correct?
>
>
>> During (or after) the call to load_elfboot() it looks like we'd need to
>> do something like what load_multiboot() does below (along with the
>> associated initialisation)
>>
>> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
>> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
>> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
>> 403                      sizeof(bootinfo));
>>
> In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
> varibales to the guest, maybe we could add something like what
> linux_load() does:
>
>     /* load initrd */
>     if (initrd_filename) {
>         ...
>         initrd_addr = (initrd_max-initrd_size) & ~4095;
>
>         fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>         fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>         fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
>         ...
>     }
>
> Then we can load the initrd in qboot or in the optionrom that I'm writing.
>
> What do you think?


Why not specify this in pvh_start_info? This will be much faster for
everyone, no need to go through fw_cfg.

-boris


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-09 19:53         ` Boris Ostrovsky
@ 2019-01-09 21:18           ` Maran Wilson
  -1 siblings, 0 replies; 38+ messages in thread
From: Maran Wilson @ 2019-01-09 21:18 UTC (permalink / raw)
  To: Boris Ostrovsky, Stefano Garzarella, Liam Merwick
  Cc: qemu-devel, xen-devel, Stefan Hajnoczi, George Kennedy

On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
> On 1/9/19 6:53 AM, Stefano Garzarella wrote:
>> Hi Liam,
>>
>> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>>
>>> But the current PVH patches don't handle initrd (they have
>>> start_info.nr_modules == 1).
>> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>>      /* The first module is always ramdisk. */
>>      if (pvh_start_info.nr_modules) {
>>          struct hvm_modlist_entry *modaddr =
>>              __va(pvh_start_info.modlist_paddr);
>>          pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>>          pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>>      }
>>
>> So, putting start_info.nr_modules = 1, means that the first
>> hvm_modlist_entry should have the ramdisk paddr and size. Is it
>> correct?

That's my understanding.

I think what's missing, is that we just need Qemu or qboot/seabios to 
properly populate the pvh_start_info.modlist_paddr with the address (as 
usable by the guest) of the hvm_modlist_entry which correctly defines 
the details of the initrd that has already been loaded into memory that 
is accessible by the guest.

-Maran

>>> During (or after) the call to load_elfboot() it looks like we'd need to
>>> do something like what load_multiboot() does below (along with the
>>> associated initialisation)
>>>
>>> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
>>> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
>>> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
>>> 403                      sizeof(bootinfo));
>>>
>> In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
>> varibales to the guest, maybe we could add something like what
>> linux_load() does:
>>
>>      /* load initrd */
>>      if (initrd_filename) {
>>          ...
>>          initrd_addr = (initrd_max-initrd_size) & ~4095;
>>
>>          fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>>          fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>>          fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
>>          ...
>>      }
>>
>> Then we can load the initrd in qboot or in the optionrom that I'm writing.
>>
>> What do you think?
>
> Why not specify this in pvh_start_info? This will be much faster for
> everyone, no need to go through fw_cfg.
>
> -boris
>

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-09 21:18           ` Maran Wilson
  0 siblings, 0 replies; 38+ messages in thread
From: Maran Wilson @ 2019-01-09 21:18 UTC (permalink / raw)
  To: Boris Ostrovsky, Stefano Garzarella, Liam Merwick
  Cc: xen-devel, qemu-devel, Stefan Hajnoczi, George Kennedy

On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
> On 1/9/19 6:53 AM, Stefano Garzarella wrote:
>> Hi Liam,
>>
>> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>>
>>> But the current PVH patches don't handle initrd (they have
>>> start_info.nr_modules == 1).
>> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>>      /* The first module is always ramdisk. */
>>      if (pvh_start_info.nr_modules) {
>>          struct hvm_modlist_entry *modaddr =
>>              __va(pvh_start_info.modlist_paddr);
>>          pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>>          pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>>      }
>>
>> So, putting start_info.nr_modules = 1, means that the first
>> hvm_modlist_entry should have the ramdisk paddr and size. Is it
>> correct?

That's my understanding.

I think what's missing, is that we just need Qemu or qboot/seabios to 
properly populate the pvh_start_info.modlist_paddr with the address (as 
usable by the guest) of the hvm_modlist_entry which correctly defines 
the details of the initrd that has already been loaded into memory that 
is accessible by the guest.

-Maran

>>> During (or after) the call to load_elfboot() it looks like we'd need to
>>> do something like what load_multiboot() does below (along with the
>>> associated initialisation)
>>>
>>> 400     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, ADDR_MBI);
>>> 401     fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, sizeof(bootinfo));
>>> 402     fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, mb_bootinfo_data,
>>> 403                      sizeof(bootinfo));
>>>
>> In this case I think they used the FW_CFG_INITRD_* to pass bootinfo
>> varibales to the guest, maybe we could add something like what
>> linux_load() does:
>>
>>      /* load initrd */
>>      if (initrd_filename) {
>>          ...
>>          initrd_addr = (initrd_max-initrd_size) & ~4095;
>>
>>          fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
>>          fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
>>          fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data, initrd_size);
>>          ...
>>      }
>>
>> Then we can load the initrd in qboot or in the optionrom that I'm writing.
>>
>> What do you think?
>
> Why not specify this in pvh_start_info? This will be much faster for
> everyone, no need to go through fw_cfg.
>
> -boris
>


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-09 21:18           ` Maran Wilson
@ 2019-01-10 15:12             ` Stefano Garzarella
  -1 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-10 15:12 UTC (permalink / raw)
  To: Maran Wilson
  Cc: Boris Ostrovsky, Liam Merwick, qemu-devel, xen-devel,
	Stefan Hajnoczi, George Kennedy

On Wed, Jan 09, 2019 at 01:18:12PM -0800, Maran Wilson wrote:
> On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
> > On 1/9/19 6:53 AM, Stefano Garzarella wrote:
> > > Hi Liam,
> > > 
> > > On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
> > > > QEMU sets the hvm_modlist_entry in load_linux() after the call to
> > > > load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
> > > > 
> > > > But the current PVH patches don't handle initrd (they have
> > > > start_info.nr_modules == 1).
> > > Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
> > >      /* The first module is always ramdisk. */
> > >      if (pvh_start_info.nr_modules) {
> > >          struct hvm_modlist_entry *modaddr =
> > >              __va(pvh_start_info.modlist_paddr);
> > >          pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
> > >          pvh_bootparams.hdr.ramdisk_size = modaddr->size;
> > >      }
> > > 
> > > So, putting start_info.nr_modules = 1, means that the first
> > > hvm_modlist_entry should have the ramdisk paddr and size. Is it
> > > correct?
> 
> That's my understanding.
> 
> I think what's missing, is that we just need Qemu or qboot/seabios to
> properly populate the pvh_start_info.modlist_paddr with the address (as
> usable by the guest) of the hvm_modlist_entry which correctly defines the
> details of the initrd that has already been loaded into memory that is
> accessible by the guest.
> 
> -Maran
> 

I tried and it works, I modified QEMU to load the initrd and to expose it
through fw_cfg, then qboot loads it and set correctly the hvm_modlist_entry.

You can find the patch of QEMU at the end of this email and the qboot
patch here: https://github.com/stefano-garzarella/qboot/commit/41e1fd765c8419e270fd79d9b3af5d53576e88a8

Do you think can be a good approach? If you want, you can add this patch
to your series.

Thanks,
Stefano


>From d5c0d51768f5a8fb214be6c2bb0cb7e86e9917b7 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 10 Jan 2019 15:16:44 +0100
Subject: [PATCH] pvh: load initrd and expose it through fw_cfg

When initrd is specified, load and expose it to the guest firmware
through fw_cfg. The firmware will fill the hvm_start_info for the
kernel.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Based-on: <1545422632-24444-5-git-send-email-liam.merwick@oracle.com>
---
 hw/i386/pc.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 06bce6a101..f6721f51be 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -986,25 +986,45 @@ static void load_linux(PCMachineState *pcms,
          */
         if (load_elfboot(kernel_filename, kernel_size,
                          header, pvh_start_addr, fw_cfg)) {
-            struct hvm_modlist_entry ramdisk_mod = { 0 };
-
             fclose(f);
 
             fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
                 strlen(kernel_cmdline) + 1);
             fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
 
-            assert(machine->device_memory != NULL);
-            ramdisk_mod.paddr = machine->device_memory->base;
-            ramdisk_mod.size =
-                memory_region_size(&machine->device_memory->mr);
-
-            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
-                             sizeof(ramdisk_mod));
             fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
             fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
                              header, sizeof(header));
 
+            /* load initrd */
+            if (initrd_filename) {
+                gsize initrd_size;
+                gchar *initrd_data;
+                GError *gerr = NULL;
+
+                if (!g_file_get_contents(initrd_filename, &initrd_data,
+                            &initrd_size, &gerr)) {
+                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
+                            initrd_filename, gerr->message);
+                    exit(1);
+                }
+
+                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+                if (initrd_size >= initrd_max) {
+                    fprintf(stderr, "qemu: initrd is too large, cannot support."
+                            "(max: %"PRIu32", need %"PRId64")\n",
+                            initrd_max, (uint64_t)initrd_size);
+                    exit(1);
+                }
+
+                initrd_addr = (initrd_max - initrd_size) & ~4095;
+
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
+                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
+                                 initrd_size);
+            }
+
             return;
         }
         /* This looks like a multiboot kernel. If it is, let's stop
-- 
2.20.1

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-10 15:12             ` Stefano Garzarella
  0 siblings, 0 replies; 38+ messages in thread
From: Stefano Garzarella @ 2019-01-10 15:12 UTC (permalink / raw)
  To: Maran Wilson
  Cc: qemu-devel, George Kennedy, Stefan Hajnoczi, xen-devel,
	Liam Merwick, Boris Ostrovsky

On Wed, Jan 09, 2019 at 01:18:12PM -0800, Maran Wilson wrote:
> On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
> > On 1/9/19 6:53 AM, Stefano Garzarella wrote:
> > > Hi Liam,
> > > 
> > > On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
> > > > QEMU sets the hvm_modlist_entry in load_linux() after the call to
> > > > load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
> > > > 
> > > > But the current PVH patches don't handle initrd (they have
> > > > start_info.nr_modules == 1).
> > > Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
> > >      /* The first module is always ramdisk. */
> > >      if (pvh_start_info.nr_modules) {
> > >          struct hvm_modlist_entry *modaddr =
> > >              __va(pvh_start_info.modlist_paddr);
> > >          pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
> > >          pvh_bootparams.hdr.ramdisk_size = modaddr->size;
> > >      }
> > > 
> > > So, putting start_info.nr_modules = 1, means that the first
> > > hvm_modlist_entry should have the ramdisk paddr and size. Is it
> > > correct?
> 
> That's my understanding.
> 
> I think what's missing, is that we just need Qemu or qboot/seabios to
> properly populate the pvh_start_info.modlist_paddr with the address (as
> usable by the guest) of the hvm_modlist_entry which correctly defines the
> details of the initrd that has already been loaded into memory that is
> accessible by the guest.
> 
> -Maran
> 

I tried and it works, I modified QEMU to load the initrd and to expose it
through fw_cfg, then qboot loads it and set correctly the hvm_modlist_entry.

You can find the patch of QEMU at the end of this email and the qboot
patch here: https://github.com/stefano-garzarella/qboot/commit/41e1fd765c8419e270fd79d9b3af5d53576e88a8

Do you think can be a good approach? If you want, you can add this patch
to your series.

Thanks,
Stefano


From d5c0d51768f5a8fb214be6c2bb0cb7e86e9917b7 Mon Sep 17 00:00:00 2001
From: Stefano Garzarella <sgarzare@redhat.com>
Date: Thu, 10 Jan 2019 15:16:44 +0100
Subject: [PATCH] pvh: load initrd and expose it through fw_cfg

When initrd is specified, load and expose it to the guest firmware
through fw_cfg. The firmware will fill the hvm_start_info for the
kernel.

Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Based-on: <1545422632-24444-5-git-send-email-liam.merwick@oracle.com>
---
 hw/i386/pc.c | 38 +++++++++++++++++++++++++++++---------
 1 file changed, 29 insertions(+), 9 deletions(-)

diff --git a/hw/i386/pc.c b/hw/i386/pc.c
index 06bce6a101..f6721f51be 100644
--- a/hw/i386/pc.c
+++ b/hw/i386/pc.c
@@ -986,25 +986,45 @@ static void load_linux(PCMachineState *pcms,
          */
         if (load_elfboot(kernel_filename, kernel_size,
                          header, pvh_start_addr, fw_cfg)) {
-            struct hvm_modlist_entry ramdisk_mod = { 0 };
-
             fclose(f);
 
             fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
                 strlen(kernel_cmdline) + 1);
             fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
 
-            assert(machine->device_memory != NULL);
-            ramdisk_mod.paddr = machine->device_memory->base;
-            ramdisk_mod.size =
-                memory_region_size(&machine->device_memory->mr);
-
-            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
-                             sizeof(ramdisk_mod));
             fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
             fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
                              header, sizeof(header));
 
+            /* load initrd */
+            if (initrd_filename) {
+                gsize initrd_size;
+                gchar *initrd_data;
+                GError *gerr = NULL;
+
+                if (!g_file_get_contents(initrd_filename, &initrd_data,
+                            &initrd_size, &gerr)) {
+                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
+                            initrd_filename, gerr->message);
+                    exit(1);
+                }
+
+                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
+                if (initrd_size >= initrd_max) {
+                    fprintf(stderr, "qemu: initrd is too large, cannot support."
+                            "(max: %"PRIu32", need %"PRId64")\n",
+                            initrd_max, (uint64_t)initrd_size);
+                    exit(1);
+                }
+
+                initrd_addr = (initrd_max - initrd_size) & ~4095;
+
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
+                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
+                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
+                                 initrd_size);
+            }
+
             return;
         }
         /* This looks like a multiboot kernel. If it is, let's stop
-- 
2.20.1


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot
  2019-01-10 15:12             ` Stefano Garzarella
@ 2019-01-15  9:51               ` Liam Merwick
  -1 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-15  9:51 UTC (permalink / raw)
  To: Stefano Garzarella, Maran Wilson
  Cc: Boris Ostrovsky, qemu-devel, xen-devel, Stefan Hajnoczi, George Kennedy

Hi Stefano,

On 10/01/2019 15:12, Stefano Garzarella wrote:
> On Wed, Jan 09, 2019 at 01:18:12PM -0800, Maran Wilson wrote:
>> On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
>>> On 1/9/19 6:53 AM, Stefano Garzarella wrote:
>>>> Hi Liam,
>>>>
>>>> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>>>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>>>>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>>>>
>>>>> But the current PVH patches don't handle initrd (they have
>>>>> start_info.nr_modules == 1).
>>>> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>>>>       /* The first module is always ramdisk. */
>>>>       if (pvh_start_info.nr_modules) {
>>>>           struct hvm_modlist_entry *modaddr =
>>>>               __va(pvh_start_info.modlist_paddr);
>>>>           pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>>>>           pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>>>>       }
>>>>
>>>> So, putting start_info.nr_modules = 1, means that the first
>>>> hvm_modlist_entry should have the ramdisk paddr and size. Is it
>>>> correct?
>>
>> That's my understanding.
>>
>> I think what's missing, is that we just need Qemu or qboot/seabios to
>> properly populate the pvh_start_info.modlist_paddr with the address (as
>> usable by the guest) of the hvm_modlist_entry which correctly defines the
>> details of the initrd that has already been loaded into memory that is
>> accessible by the guest.
>>
>> -Maran
>>
> 
> I tried and it works, I modified QEMU to load the initrd and to expose it
> through fw_cfg, then qboot loads it and set correctly the hvm_modlist_entry.
> 
> You can find the patch of QEMU at the end of this email and the qboot
> patch here: https://github.com/stefano-garzarella/qboot/commit/41e1fd765c8419e270fd79d9b3af5d53576e88a8
> 
> Do you think can be a good approach? If you want, you can add this patch
> to your series.

Code looks good to me. I'll include it with v3 of my QEMU patches.

Regards,
Liam


> 
> Thanks,
> Stefano
> 
> 
>  From d5c0d51768f5a8fb214be6c2bb0cb7e86e9917b7 Mon Sep 17 00:00:00 2001
> From: Stefano Garzarella <sgarzare@redhat.com>
> Date: Thu, 10 Jan 2019 15:16:44 +0100
> Subject: [PATCH] pvh: load initrd and expose it through fw_cfg
> 
> When initrd is specified, load and expose it to the guest firmware
> through fw_cfg. The firmware will fill the hvm_start_info for the
> kernel.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> Based-on: <1545422632-24444-5-git-send-email-liam.merwick@oracle.com>
> ---
>   hw/i386/pc.c | 38 +++++++++++++++++++++++++++++---------
>   1 file changed, 29 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 06bce6a101..f6721f51be 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -986,25 +986,45 @@ static void load_linux(PCMachineState *pcms,
>            */
>           if (load_elfboot(kernel_filename, kernel_size,
>                            header, pvh_start_addr, fw_cfg)) {
> -            struct hvm_modlist_entry ramdisk_mod = { 0 };
> -
>               fclose(f);
>   
>               fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
>                   strlen(kernel_cmdline) + 1);
>               fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
>   
> -            assert(machine->device_memory != NULL);
> -            ramdisk_mod.paddr = machine->device_memory->base;
> -            ramdisk_mod.size =
> -                memory_region_size(&machine->device_memory->mr);
> -
> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
> -                             sizeof(ramdisk_mod));
>               fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
>               fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
>                                header, sizeof(header));
>   
> +            /* load initrd */
> +            if (initrd_filename) {
> +                gsize initrd_size;
> +                gchar *initrd_data;
> +                GError *gerr = NULL;
> +
> +                if (!g_file_get_contents(initrd_filename, &initrd_data,
> +                            &initrd_size, &gerr)) {
> +                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                            initrd_filename, gerr->message);
> +                    exit(1);
> +                }
> +
> +                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                if (initrd_size >= initrd_max) {
> +                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                            "(max: %"PRIu32", need %"PRId64")\n",
> +                            initrd_max, (uint64_t)initrd_size);
> +                    exit(1);
> +                }
> +
> +                initrd_addr = (initrd_max - initrd_size) & ~4095;
> +
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> +                                 initrd_size);
> +            }
> +
>               return;
>           }
>           /* This looks like a multiboot kernel. If it is, let's stop
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC v2 0/4] QEMU changes to do PVH boot
@ 2019-01-15  9:51               ` Liam Merwick
  0 siblings, 0 replies; 38+ messages in thread
From: Liam Merwick @ 2019-01-15  9:51 UTC (permalink / raw)
  To: Stefano Garzarella, Maran Wilson
  Cc: xen-devel, Boris Ostrovsky, qemu-devel, Stefan Hajnoczi, George Kennedy

Hi Stefano,

On 10/01/2019 15:12, Stefano Garzarella wrote:
> On Wed, Jan 09, 2019 at 01:18:12PM -0800, Maran Wilson wrote:
>> On 1/9/2019 11:53 AM, Boris Ostrovsky wrote:
>>> On 1/9/19 6:53 AM, Stefano Garzarella wrote:
>>>> Hi Liam,
>>>>
>>>> On Tue, Jan 8, 2019 at 3:47 PM Liam Merwick <liam.merwick@oracle.com> wrote:
>>>>> QEMU sets the hvm_modlist_entry in load_linux() after the call to
>>>>> load_elfboot() and then qboot loads it in boot_pvh_from_fw_cfg()
>>>>>
>>>>> But the current PVH patches don't handle initrd (they have
>>>>> start_info.nr_modules == 1).
>>>> Looking in the linux kernel (arch/x86/platform/pvh/enlighten.c) I saw:
>>>>       /* The first module is always ramdisk. */
>>>>       if (pvh_start_info.nr_modules) {
>>>>           struct hvm_modlist_entry *modaddr =
>>>>               __va(pvh_start_info.modlist_paddr);
>>>>           pvh_bootparams.hdr.ramdisk_image = modaddr->paddr;
>>>>           pvh_bootparams.hdr.ramdisk_size = modaddr->size;
>>>>       }
>>>>
>>>> So, putting start_info.nr_modules = 1, means that the first
>>>> hvm_modlist_entry should have the ramdisk paddr and size. Is it
>>>> correct?
>>
>> That's my understanding.
>>
>> I think what's missing, is that we just need Qemu or qboot/seabios to
>> properly populate the pvh_start_info.modlist_paddr with the address (as
>> usable by the guest) of the hvm_modlist_entry which correctly defines the
>> details of the initrd that has already been loaded into memory that is
>> accessible by the guest.
>>
>> -Maran
>>
> 
> I tried and it works, I modified QEMU to load the initrd and to expose it
> through fw_cfg, then qboot loads it and set correctly the hvm_modlist_entry.
> 
> You can find the patch of QEMU at the end of this email and the qboot
> patch here: https://github.com/stefano-garzarella/qboot/commit/41e1fd765c8419e270fd79d9b3af5d53576e88a8
> 
> Do you think can be a good approach? If you want, you can add this patch
> to your series.

Code looks good to me. I'll include it with v3 of my QEMU patches.

Regards,
Liam


> 
> Thanks,
> Stefano
> 
> 
>  From d5c0d51768f5a8fb214be6c2bb0cb7e86e9917b7 Mon Sep 17 00:00:00 2001
> From: Stefano Garzarella <sgarzare@redhat.com>
> Date: Thu, 10 Jan 2019 15:16:44 +0100
> Subject: [PATCH] pvh: load initrd and expose it through fw_cfg
> 
> When initrd is specified, load and expose it to the guest firmware
> through fw_cfg. The firmware will fill the hvm_start_info for the
> kernel.
> 
> Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
> Based-on: <1545422632-24444-5-git-send-email-liam.merwick@oracle.com>
> ---
>   hw/i386/pc.c | 38 +++++++++++++++++++++++++++++---------
>   1 file changed, 29 insertions(+), 9 deletions(-)
> 
> diff --git a/hw/i386/pc.c b/hw/i386/pc.c
> index 06bce6a101..f6721f51be 100644
> --- a/hw/i386/pc.c
> +++ b/hw/i386/pc.c
> @@ -986,25 +986,45 @@ static void load_linux(PCMachineState *pcms,
>            */
>           if (load_elfboot(kernel_filename, kernel_size,
>                            header, pvh_start_addr, fw_cfg)) {
> -            struct hvm_modlist_entry ramdisk_mod = { 0 };
> -
>               fclose(f);
>   
>               fw_cfg_add_i32(fw_cfg, FW_CFG_CMDLINE_SIZE,
>                   strlen(kernel_cmdline) + 1);
>               fw_cfg_add_string(fw_cfg, FW_CFG_CMDLINE_DATA, kernel_cmdline);
>   
> -            assert(machine->device_memory != NULL);
> -            ramdisk_mod.paddr = machine->device_memory->base;
> -            ramdisk_mod.size =
> -                memory_region_size(&machine->device_memory->mr);
> -
> -            fw_cfg_add_bytes(fw_cfg, FW_CFG_KERNEL_DATA, &ramdisk_mod,
> -                             sizeof(ramdisk_mod));
>               fw_cfg_add_i32(fw_cfg, FW_CFG_SETUP_SIZE, sizeof(header));
>               fw_cfg_add_bytes(fw_cfg, FW_CFG_SETUP_DATA,
>                                header, sizeof(header));
>   
> +            /* load initrd */
> +            if (initrd_filename) {
> +                gsize initrd_size;
> +                gchar *initrd_data;
> +                GError *gerr = NULL;
> +
> +                if (!g_file_get_contents(initrd_filename, &initrd_data,
> +                            &initrd_size, &gerr)) {
> +                    fprintf(stderr, "qemu: error reading initrd %s: %s\n",
> +                            initrd_filename, gerr->message);
> +                    exit(1);
> +                }
> +
> +                initrd_max = pcms->below_4g_mem_size - pcmc->acpi_data_size - 1;
> +                if (initrd_size >= initrd_max) {
> +                    fprintf(stderr, "qemu: initrd is too large, cannot support."
> +                            "(max: %"PRIu32", need %"PRId64")\n",
> +                            initrd_max, (uint64_t)initrd_size);
> +                    exit(1);
> +                }
> +
> +                initrd_addr = (initrd_max - initrd_size) & ~4095;
> +
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_ADDR, initrd_addr);
> +                fw_cfg_add_i32(fw_cfg, FW_CFG_INITRD_SIZE, initrd_size);
> +                fw_cfg_add_bytes(fw_cfg, FW_CFG_INITRD_DATA, initrd_data,
> +                                 initrd_size);
> +            }
> +
>               return;
>           }
>           /* This looks like a multiboot kernel. If it is, let's stop
> 

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2019-01-15  9:52 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-12-21 20:03 [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot Liam Merwick
2018-12-21 20:03 ` Liam Merwick
2018-12-21 20:03 ` [Qemu-devel] [RFC v2 1/4] elf: Add optional function ptr to load_elf() to parse ELF notes Liam Merwick
2018-12-21 20:03   ` Liam Merwick
2019-01-02 13:06   ` Stefan Hajnoczi
2019-01-02 13:06   ` [Qemu-devel] " Stefan Hajnoczi
2019-01-08 14:47     ` Liam Merwick
2019-01-08 14:47       ` Liam Merwick
2018-12-21 20:03 ` [Qemu-devel] [RFC v2 2/4] elf-ops.h: Add get_elf_note_type() Liam Merwick
2018-12-21 20:03   ` Liam Merwick
2019-01-02 13:12   ` [Qemu-devel] " Stefan Hajnoczi
2019-01-02 13:12     ` Stefan Hajnoczi
2019-01-08 14:47     ` [Qemu-devel] " Liam Merwick
2019-01-08 14:47       ` Liam Merwick
2018-12-21 20:03 ` [Qemu-devel] [RFC v2 3/4] pvh: Add x86/HVM direct boot ABI header file Liam Merwick
2018-12-21 20:03   ` Liam Merwick
2018-12-21 20:03 ` [Qemu-devel] [RFC v2 4/4] pvh: Boot uncompressed kernel using direct boot ABI Liam Merwick
2018-12-21 20:03   ` Liam Merwick
2019-01-02 13:18   ` [Qemu-devel] " Stefan Hajnoczi
2019-01-02 13:18     ` Stefan Hajnoczi
2019-01-08 14:48     ` [Qemu-devel] " Liam Merwick
2019-01-08 14:48       ` Liam Merwick
2018-12-26 17:08 ` [Qemu-devel] [RFC v2 0/4] QEMU changes to do PVH boot no-reply
2018-12-26 17:08   ` no-reply
2019-01-03 17:22 ` Stefano Garzarella
2019-01-03 17:22   ` Stefano Garzarella
2019-01-08 14:47   ` [Qemu-devel] " Liam Merwick
2019-01-08 14:47     ` Liam Merwick
2019-01-09 11:53     ` [Qemu-devel] " Stefano Garzarella
2019-01-09 11:53       ` Stefano Garzarella
2019-01-09 19:53       ` [Qemu-devel] " Boris Ostrovsky
2019-01-09 19:53         ` Boris Ostrovsky
2019-01-09 21:18         ` [Qemu-devel] " Maran Wilson
2019-01-09 21:18           ` Maran Wilson
2019-01-10 15:12           ` [Qemu-devel] " Stefano Garzarella
2019-01-10 15:12             ` Stefano Garzarella
2019-01-15  9:51             ` [Qemu-devel] " Liam Merwick
2019-01-15  9:51               ` Liam Merwick

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.